Computation error, newly added project
Author | Message |
---|---|
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
All tasks are immediately ending with a computation error. This happens on both the Windows and Linux hosts. The Windows host uses TThrottle and the Linux host is limited to 50% CPU time. My next step is to disable TThrottle and set CPU time to 100%. The delay between project updates is so long that I'm having a hard time troubleshooting. Any other ideas?
12/25/2019 12:27:20 PM | climateprediction.net | Starting task hadam4_a1tw_209810_6_856_011964087_1
12/25/2019 12:27:21 PM | climateprediction.net | Computation for task hadam4_a1tw_209810_6_856_011964087_1 finished
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_1.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_2.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_3.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_4.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_5.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_6.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_restart.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_out.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent |
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
And now I've been put in time-out, which makes troubleshooting even harder. One host gets the messages below; another gets no message whatsoever, just zero tasks.
12/25/2019 1:59:12 PM | climateprediction.net | Requesting new tasks for CPU
12/25/2019 1:59:19 PM | climateprediction.net | Scheduler request completed: got 0 new tasks
12/25/2019 1:59:19 PM | climateprediction.net | No tasks sent
12/25/2019 1:59:19 PM | climateprediction.net | This computer has finished a daily quota of 1 tasks |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
On your 2600X, you don't have the 32-bit libraries installed that climateprediction.net needs. See the sticky at the top of the Linux forum: https://www.cpdn.org/forum_thread.php?id=8008&postid=59939 Since you're using 19.2, which is based on Ubuntu 18.04, use the Ubuntu 18.04 command in the sticky to get the needed libraries. Edit: also, you only have 8 GB of RAM on the 2600X, with 12 cores that BOINC sees. This is problematic in terms of memory usage if running on all cores, and I would suggest limiting the number of CPUs used by BOINC to at most 6 for the hadam4 N144 models, or at most 4 if you get the hadam4h N216 models. |
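The core limits suggested above can be turned into the percentage that BOINC's "use at most N% of the CPUs" preference expects (stored as `max_ncpus_pct` in `global_prefs_override.xml`). The following is only a rough sketch: the function names are made up for illustration, rounding the percentage up is an assumption meant to guard against the client truncating the resulting core count, and the per-task RAM figure is simply total RAM divided by the task count, not a measured model footprint.

```python
import math


def cpu_percent_for(max_tasks: int, visible_cores: int) -> int:
    """Whole-number CPU percentage that allows at most max_tasks cores.

    Rounded up (an assumption) so that a client which truncates
    visible_cores * pct / 100 still ends up with max_tasks cores.
    """
    if max_tasks <= 0 or visible_cores <= 0:
        raise ValueError("max_tasks and visible_cores must be positive")
    return min(100, math.ceil(100 * max_tasks / visible_cores))


def ram_per_task_gb(total_ram_gb: float, max_tasks: int) -> float:
    """RAM budget left for each task if max_tasks run at the same time."""
    if max_tasks <= 0:
        raise ValueError("max_tasks must be positive")
    return total_ram_gb / max_tasks


if __name__ == "__main__":
    # Figures from the post above: 12 visible cores, 8 GB of RAM.
    for label, tasks in (("hadam4 N144", 6), ("hadam4h N216", 4)):
        pct = cpu_percent_for(tasks, 12)
        budget = ram_per_task_gb(8, tasks)
        print(f"{label}: {tasks} tasks -> {pct}% of CPUs, ~{budget:.2f} GB per task")
```

With these numbers the 6-task limit works out to 50% of the CPUs and the 4-task limit to 34%.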
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Also, the "error messages" in your first post aren't errors. They're just BOINC saying that it can't find the output files to upload, which is to be expected if the model crashed before they were created. The real error message(s) will be earlier in the list. |
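To make the point above concrete, here is a small sketch that drops the "Output file ... absent" notices from a pasted event log so that whatever else was logged stands out. It assumes the three-field `timestamp | project | message` layout seen in the pasted logs; the function names are hypothetical and the sample lines are copied from the first post.

```python
SAMPLE_LOG = """\
12/25/2019 12:27:20 PM | climateprediction.net | Starting task hadam4_a1tw_209810_6_856_011964087_1
12/25/2019 12:27:21 PM | climateprediction.net | Computation for task hadam4_a1tw_209810_6_856_011964087_1 finished
12/25/2019 12:27:21 PM | climateprediction.net | Output file hadam4_a1tw_209810_6_856_011964087_1_r489071904_1.zip for task hadam4_a1tw_209810_6_856_011964087_1 absent
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message), or return None."""
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return tuple(parts)


def is_absent_notice(message):
    """True for the follow-on 'Output file ... absent' notices."""
    return message.startswith("Output file ") and message.endswith(" absent")


def informative_lines(log_text):
    """Return the parsed lines that are not 'absent' notices."""
    kept = []
    for raw in log_text.splitlines():
        parsed = parse_line(raw)
        if parsed is None:
            continue  # skip blank or malformed lines
        if not is_absent_notice(parsed[2]):
            kept.append(parsed)
    return kept


if __name__ == "__main__":
    for timestamp, project, message in informative_lines(SAMPLE_LOG):
        print(timestamp, "|", project, "|", message)
```

If nothing but "Starting task" and "finished" lines remain after filtering, the event log itself probably does not contain the cause of the failure.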
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
Added the Linux packages. I assume I have to wait 24 hours before I can expect an update. Any ideas on the Windows host? There are no other errors in the logs.
12/25/2019 11:11:10 AM | | Using account manager BOINCstatsBAM!
12/25/2019 11:11:10 AM | | Setting up GUI RPC socket
12/25/2019 11:11:10 AM | | Checking presence of 356 project files
12/25/2019 11:11:10 AM | | Suspending GPU computation - computer is in use
12/25/2019 11:11:11 AM | climateprediction.net | Sending scheduler request: To fetch work.
12/25/2019 11:11:11 AM | climateprediction.net | Requesting new tasks for CPU
12/25/2019 11:11:13 AM | climateprediction.net | Scheduler request completed: got 0 new tasks
12/25/2019 11:11:13 AM | climateprediction.net | Not sending work - last request too recent: 703 sec
12/25/2019 11:11:21 AM | | Suspending computation - CPU is busy
12/25/2019 11:11:31 AM | | Resuming computation
12/25/2019 11:12:51 AM | | Suspending computation - CPU is busy
12/25/2019 11:13:01 AM | | Resuming computation
12/25/2019 11:13:23 AM | climateprediction.net | Computation for task wah2_anz50_a0jx_200612_31_860_011978332_0 finished
12/25/2019 11:13:23 AM | climateprediction.net | Output file wah2_anz50_a0jx_200612_31_860_011978332_0_r671891042_1.zip for task wah2_anz50_a0jx_200612_31_860_011978332_0 absent
12/25/2019 11:13:23 AM | climateprediction.net | Output file wah2_anz50_a0jx_200612_31_860_011978332_0_r671891042_2.zip for task wah2_anz50_a0jx_200612_31_860_011978332_0 absent
... |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
It's possible that the "day" will reset at 00 GMT, so it may ask for work after that. The Windows errors are odd, with the "cannot find the device/drive specified" problem. I've had these types of errors occasionally, but not in bunches like that. It may have to do with the system trying to do too much disk reading and writing simultaneously, but that's just a hunch. If those types of errors continue frequently, I'm not sure what the solution would be. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Perhaps the errors are something to do with Windows' VirtualBox. Also, the models are taking 13 hours to clock up 40 minutes of model time. |
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
geophi, "cannot find the device/drive specified" - what are you referring to? Les, where are you seeing that result? And CPDN uses VirtualBox? I don't see a VM created by it. If you're seeing something on the back end, I did try attaching a Linux VM to replicate the problem from scratch, but I'm still not receiving any tasks. I do see that at present the only work available is for one of the Linux applications, so I'm suspending the Windows host. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
> geophi, "cannot find the device/drive specified" - what are you referring to?
Your Windows computer's tasks are here: https://www.cpdn.org/results.php?hostid=1496463 If you click on an individual task number, you'll see a section labeled stderr where some additional errors are written. The problem near the top of that listing is what resulted in the task error "The system cannot find the drive specified." |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
About VirtualBox: I'm referring to the info about your computer, here, under Virtualization. CPDN DOESN'T use VirtualBox, but you have it installed and enabled. |
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
OK, thanks. I've suspended all but the 2600X Linux host. It's limited to 1 core, 100% CPU time, and run-always. It's 0030 GMT and there are 30 minutes left until my next update. Cue suspense. |
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
Bah humbug.
12/25/2019 6:58:12 PM | climateprediction.net | This computer has finished a daily quota of 1 tasks |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
:( I've always thought that the "midnight" was where that project's server was. Perhaps it's where the person's computer is. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
I run some of my Linux boxes on UTC/GMT time, so that may be why I remember that. |
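Since nobody in the thread is sure which "midnight" resets the daily quota, a quick way to compare the candidates is to compute how long is left until the next midnight in each one. This is a minimal sketch only: `time_until_midnight` is a made-up helper, the UTC-6 offset is an arbitrary example rather than the server's or anyone's actual timezone, and nothing here confirms when the scheduler really resets the quota.

```python
from datetime import datetime, timedelta, timezone


def time_until_midnight(now: datetime, reset_tz: timezone = timezone.utc) -> timedelta:
    """Time remaining until the next 00:00 in reset_tz.

    now must be timezone-aware so the conversion is unambiguous.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    local_now = now.astimezone(reset_tz)
    next_midnight = (local_now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return next_midnight - local_now


if __name__ == "__main__":
    # Example instant: 00:30 GMT on 26 Dec 2019.
    now = datetime(2019, 12, 26, 0, 30, tzinfo=timezone.utc)
    print("if the quota resets at 00:00 UTC:  ", time_until_midnight(now))
    # Hypothetical fixed offset, used only to show the comparison.
    print("if the quota resets at 00:00 UTC-6:", time_until_midnight(now, timezone(timedelta(hours=-6))))
```

If a request made shortly after one candidate midnight still reports the daily quota message, that candidate can be ruled out.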
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
Got a task on the Linux host! So it was the libraries - thanks! I'm only letting it crunch the one task for now as a test, and I expect to see that run time come down. Plus this rig is going down tomorrow for hardware changes, as it's a project-in-progress (literally building a winter space heater). There's still no work available for Windows, so I have to wait to sort out that drive error. The drive is an M.2 NVMe and scores >1k on AS SSD, so I don't think it's a speed issue. But I've limited the host to one core, so when a task does come down I can maybe rule that out. |
Joined: 27 Jul 16 Posts: 10 Credit: 55,923 RAC: 0 |
Windows host got a task! Available work for weather ran out though, so I still need to run multiple CPDN tasks. And BOINC is reporting a 10-day run time... |