Update on HadCM3 'Short' WU crashes with shutdown in Windows
Author | Message |
---|---|
Send message Joined: 26 Aug 04 Posts: 67 Credit: 10,299,683 RAC: 10,424 |
There are a large number of reports across the thread about the HadCM3 'Short' WUs crashing in Windows if BOINC is stopped. I don't know about earlier WUs, but with the current batch I am not experiencing this. Last week I (absentmindedly, as I realised immediately afterwards) suspended BOINC and shut my PC down for a reboot after installing a driver for some unrelated software. I fully expected the 3 running 'Short' WUs to crash on restarting BOINC, but they didn't; they carried on running to completion as if nothing had happened. Yesterday I repeated the exercise, intentionally this time, suspending BOINC and rebooting the PC for some Windows updates, and the 2 'Short' WUs, together with an EU AM3, all restarted without a problem. I'm running Windows 7 incl. Service Pack 1 and BOINC version 7.2.42. The method I use, and always have, is to first suspend the running project via the Activities dropdown, which suspends all running WUs simultaneously. I then wait about 30 secs to give any disc writing a chance to complete, then exit from BOINC, and then shut the PC down. On restarting, I wait until everything has started, then start the BOINC Manager and restart the project via the 'Activities' window. No crash with 'Short' WUs yet. I haven't tried a drastic BOINC process stop by shutting down the PC with BOINC still running; maybe that would crash the WUs? |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
What I've noticed since you posted a week or two ago: your machines seem to fail the hadcm3n - r models with the "theta" error, while almost all other machines fail with some Linux or Windows "stack overflow". There's a zillion machines out there that fail these "short" WUs, but your machines seem to get further, as far as the "THETA" thingy. Can't see how your boxes get so much further. What do you have that's different, that makes your machines fail "as expected" rather than with the "stack overflow" that most of us have seen? |
Send message Joined: 22 Mar 06 Posts: 144 Credit: 24,695,428 RAC: 0 |
Interesting, Erik, but not quite what Pete was on about. However, to continue your train of thought, I noticed it is an AMD box, so I went through the top 300 PCs, but no, I only found one other AMD giving similar results here. (The CPU run time is a dead giveaway as to the type of error; numbers less than 100 sec normally mean Invalid Theta.) Several others were crashing, and several hadn't run the model. Then I thought I should have a look at some others and found a Windows laptop (with more suspends than I've had hot breakfasts, ever!!) that was nevertheless chugging all the 'r' models through and getting Invalid Theta. I gave up after that, and I guess the researchers will by now have figured out what is going on. |
Send message Joined: 9 Sep 04 Posts: 228 Credit: 30,756,611 RAC: 3,303 |
With BOINC 7.4.27, it works A LOT BETTER. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
About time we had some good news. :) Thanks. |
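
For anyone who would rather script the suspend, wait, exit routine described in the first post than click through the Manager, here is a rough sketch. It rests on assumptions that nobody in the thread has confirmed: that `boinccmd` is on the PATH and allowed to talk to the local client, and that `--set_run_mode never` is a fair stand-in for suspending via the Activities dropdown. The function names are made up for illustration, and the 30 second pause is simply the figure from the first post. Nothing here has been shown to prevent the 'Short' WU crashes.

```python
import subprocess
import time

# Assumption: boinccmd is on PATH and may control the local BOINC client.
# Assumption: run mode "never" approximates Activities > Suspend in the Manager.
SUSPEND = ["boinccmd", "--set_run_mode", "never"]
QUIT = ["boinccmd", "--quit"]
RESUME = ["boinccmd", "--set_run_mode", "auto"]


def stop_boinc_gently(wait_seconds=30, run=subprocess.run, sleep=time.sleep):
    """Suspend computation, pause for pending disk writes, then ask the client to exit."""
    run(SUSPEND, check=True)   # stop all running WUs at once
    sleep(wait_seconds)        # ~30 s pause, as in the first post
    run(QUIT, check=True)      # tell the client to shut down


def resume_boinc(run=subprocess.run):
    """After the reboot, once the client is running again, let it compute as normal."""
    run(RESUME, check=True)
```

The `run` and `sleep` parameters are only there so the sequence can be dry-run without touching a real client; call `stop_boinc_gently()` before shutting the PC down and `resume_boinc()` once BOINC has been started again.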
©2024 cpdn.org