Computation error when BOINC halts
Author | Message |
---|---|
Send message Joined: 17 Oct 18 Posts: 8 Credit: 1,667,803 RAC: 3,199 |
Hello, I'm running BOINC manager 7.22.2 (x64), with a handful of projects running. I have had a couple of climateprediction.net tasks complete recently but I've also noticed that if I have to reboot my computer or shut down BOINC for any reason with a climateprediction.net task in progress, when I restart BOINC the task status changes to computation error. Is there a setting I need to check or something to keep this from happening? It's happened a few times recently and it's kind of sad because I have had a few days of work done on those tasks when they failed out with the computation error. It's only the climateprediction.net tasks that are failing with the computation error whenever BOINC is halted and restarted, all tasks from other projects seem to be able to pick up where they left off when BOINC restarted or the computer was rebooted or whatever. Thanks for any info or advice... |
Send message Joined: 15 May 09 Posts: 4529 Credit: 18,661,594 RAC: 14,529 |
I currently have 8 tasks running and have been shutting down at night without losing any tasks. What I do is suspend computation for each task, wait about three minutes, then exit BOINC and wait another three minutes before shutting down. My situation may not be completely analogous, though, as I run my Windows client under WINE on a Linux box. As other moderators could tell you, this is not foolproof under either OS, but it does improve the odds. |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,673,031 RAC: 4,752 |
There is a problem with a batch of CPDN tasks which affects some, but not all, users. There is lots of discussion on their "number crunching" forum: https://www.cpdn.org/cpdnboinc/forum_thread.php?id=9149 It is particularly prevalent with tasks starting with "WAH2_EAS2", but may not be limited to them. Taking care to shut down BOINC (not just close it) before shutting down the computer does appear to improve the situation a little. (Windows: right-click on BOINC in the task bar, select "Exit", then "stop all running tasks".) |
Send message Joined: 17 Oct 18 Posts: 8 Credit: 1,667,803 RAC: 3,199 |
Thanks, I think I'll try suspending the tasks before closing, although that doesn't help with automatic reboots for Windows updates... Oh well, hopefully it'll get fixed someday. |
Send message Joined: 6 Aug 04 Posts: 195 Credit: 28,193,230 RAC: 10,406 |
"Is there a setting I need to check or something to keep this from happening?" Pause 'Windows Update' for the maximum amount of time; that stops Windows annoyingly doing an update while CPDN tasks are running. 'Resume updates' after the tasks have finished to keep the OS up to date, and then pause again. |
Send message Joined: 15 May 09 Posts: 4529 Credit: 18,661,594 RAC: 14,529 |
"Is there a setting I need to check or something to keep this from happening?" Blocking the Microsoft domains in your router is how I would stop random updates. |
Send message Joined: 29 Oct 17 Posts: 1044 Credit: 16,196,312 RAC: 12,647 |
"Thanks, I think I'll try suspending the tasks before closing, although that doesn't help with automatic reboots for Windows updates... Oh well, hopefully it'll get fixed someday." The problem is caused by the size of the checkpoint files that the task needs to do a restart. They are larger than those of other projects. If the task is writing to those checkpoints when the machine is suddenly shut down, the files are not written correctly and the task can't restart. It's just bad luck that sometimes the shutdown happens when the task is at the point of writing those files. I would recommend turning on 'Leave non-GPU tasks in memory while suspended', under 'Disk & memory' in boincmgr. This will stop the task having to restart from checkpoint files if it's suspended for any reason (though not after a shutdown). I also agree that turning off or suspending Windows auto-updates helps. I do the same for these long-running CPDN tasks. --- CPDN Visiting Scientist |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,673,031 RAC: 4,752 |
Suspending and halting (stopping) are not the same - The safer option is to halt the processing, which forces the "resume" file to be written instantly to disk; suspend on the other hand may not even produce a resume file (worst case), or will defer its creation for some time. As for Windows automatic updates - they are an absolute pain, and should be blocked - others have suggested ways of doing this. |
Send message Joined: 29 Oct 17 Posts: 1044 Credit: 16,196,312 RAC: 12,647 |
"Suspending and halting (stopping) are not the same - The safer option is to halt the processing, which forces the 'resume' file to be written instantly to disk; suspend on the other hand may not even produce a resume file (worst case), or will defer its creation for some time." That's not necessarily true. BOINC can't 'force' the model to write the file. Writing the file is under the control of the model, not BOINC, and the OS takes responsibility for flushing the file to disk. I know this from coding up the OpenIFS model to work under BOINC. The MetO models work the same way. |
Send message Joined: 6 Aug 04 Posts: 195 Credit: 28,193,230 RAC: 10,406 |
"As for Windows automatic updates - they are an absolute pain, and should be blocked - others have suggested ways of doing this." I fully agree for experienced CPDN users. However, more than 50 years of computing experience tells me that there are many, many users for whom forcing OS and application updates, especially security updates, is imperative. |
Send message Joined: 22 Feb 06 Posts: 490 Credit: 30,766,944 RAC: 10,886 |
"As for Windows automatic updates - they are an absolute pain, and should be blocked - others have suggested ways of doing this." This can be done using group policies in the registry so that Windows has to ask you before doing the updates, if you feel up to it. Search the web for details. |
Send message Joined: 9 Nov 20 Posts: 6 Credit: 6,907,448 RAC: 3,441 |
I'm getting this myself. It looks like the WUs will take around 20 days each for me, and it's hard to run Windows and not reboot in that time. Is there any solution yet? |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,673,031 RAC: 4,752 |
A couple of things. First, the initial estimates of task duration are often very pessimistic, but in time they get a bit better. Do a quick calculation yourself of the time left to run; once the progress has got beyond about 10%, your "guess" will be a lot more accurate than BOINC's. Second, as these are recent tasks they will belong to the 966 batch. There's a thread running about a couple of issues with these tasks, but no solutions have arrived yet (apart from not shutting down to avoid the "fails on restart" type error, which is a real pain for those that suffer Windows forced reboots, or power cuts, or shut down at night, or suspend to do something else...). https://www.cpdn.org/cpdnboinc/forum_thread.php?id=9222 |
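
The quick calculation suggested in the last post can be sketched as a simple linear extrapolation from the elapsed time and the fraction done. This is only a rough sketch: it assumes the task progresses at a roughly constant rate, and the function name and the example figures are made up for illustration, not taken from BOINC.

```python
def estimate_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linearly extrapolate the time left from the progress made so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be greater than 0 and at most 1")
    # If 12% took 36 hours, the whole task should take about 36 / 0.12 hours.
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


if __name__ == "__main__":
    # Hypothetical figures: 36 hours elapsed with the task showing 12% progress.
    remaining = estimate_remaining_hours(36.0, 0.12)
    print(f"Roughly {remaining / 24:.1f} days left")
```

With those hypothetical figures the estimate comes out at roughly 11 days. As the post says, the result is only worth trusting once progress is past about 10%.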
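
The manual shutdown routine described earlier in the thread (suspend, wait about three minutes, exit BOINC, wait again, then shut down) could be scripted along these lines. Treat it as a sketch under assumptions: it assumes `boinccmd` is on the PATH and is allowed to control the local client (on some installs it needs `--passwd` or has to be run from the BOINC data directory), and it suspends all computation through the run mode rather than suspending each task individually. The function names are hypothetical. As pointed out above, none of this can force the model to write its checkpoint, so it only improves the odds.

```python
import subprocess
import time
from typing import Callable, Sequence

# Assumption: boinccmd is on PATH and may talk to the local BOINC client.
BOINCCMD = "boinccmd"


def run_boinccmd(args: Sequence[str]) -> None:
    """Run one boinccmd command and raise if it reports failure."""
    subprocess.run([BOINCCMD, *args], check=True)


def stop_boinc_gently(
    wait_seconds: float = 180.0,
    run: Callable[[Sequence[str]], None] = run_boinccmd,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Suspend computation, wait, ask the client to exit, then wait again."""
    run(["--set_run_mode", "never"])  # suspend all computation
    sleep(wait_seconds)               # the roughly three-minute pause from the post
    run(["--quit"])                   # tell the BOINC client to exit
    sleep(wait_seconds)               # give it time before the machine goes down
```

Calling `stop_boinc_gently()` before shutting down mirrors the manual routine. The run mode stays at "never" until it is changed back, for example with `boinccmd --set_run_mode auto` after the next start.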
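
To illustrate the checkpoint problem explained earlier in the thread (a restart file that is only partly written when the machine goes down), here is a generic sketch of one common way programs guard against it: write to a temporary file, ask the OS to flush it, then swap it into place. This is not how the CPDN models write their restart files; the function name and file layout are invented for the example.

```python
import os
import tempfile


def write_checkpoint(path: str, data: bytes) -> None:
    """Write data so that path holds either the old or the new checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to flush its cache to disk
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Remove the partial temporary file; the old checkpoint is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The larger the checkpoint, the longer the write and flush steps take, which is why big restart files are more exposed to an ill-timed shutdown.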
©2024 cpdn.org