climateprediction.net (CPDN) home page
Thread '13 HOUR BUG'


Message boards : Number crunching : 13 HOUR BUG


JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 42408 - Posted: 13 Jun 2011, 4:31:18 UTC
Last modified: 13 Jun 2011, 4:32:43 UTC

Looks like I just got bitten by the 13 hour bug. WU hadcm3n_r64h_1940_40_007289888-2 crashed after running about that long. I have the WU in a backup from before the crash. Is there any way to fix it, or should we just move on?

OS is Windows 7 64 bit SP1 running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42409 - Posted: 13 Jun 2011, 6:09:54 UTC - in response to Message 42408.  

Move on. If you have anything to move on to. :)

Backups: Here
mweisensee

Joined: 29 Apr 07
Posts: 5
Credit: 1,961,201
RAC: 0
Message 42410 - Posted: 13 Jun 2011, 6:30:15 UTC

Same for hadcm3n_r2oy_1940_40_007288661_2.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 42411 - Posted: 13 Jun 2011, 6:40:32 UTC - in response to Message 42410.  

And hadcm3n_r3sg_1940_40_007290761_2. Looking it up, I see the other two tasks in that work unit did the same. If other tasks in the work unit succeed, I try the backup. If not, I move on. Not the most scientific way of deciding, but I feel it is better than nothing.
tullio

Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 42412 - Posted: 13 Jun 2011, 7:09:23 UTC

Mine has been running for 19:24:27. I have suspended it because it was running in high priority mode and I have 5 other BOINC projects, including a Virtual Machine by CERN. I have only 2 cores on my Opteron 1210 running Linux.
Tullio
tullio

Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 42413 - Posted: 14 Jun 2011, 2:23:56 UTC

Errored after 24 hours.
Tullio
PinkPenguin

Joined: 26 Apr 09
Posts: 6
Credit: 514,253
RAC: 0
Message 42414 - Posted: 14 Jun 2011, 6:17:26 UTC
Last modified: 14 Jun 2011, 6:18:09 UTC

This is more like a 0 hour bug - the server says I have hadcm3n_r1wj_1940_40_007291053_0 but I assure you I don't. It failed to download over the weekend so it's not there and neither are the files in "download pending". Here's the task:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12971223

You might like to abort it so that someone else can have it.

The other task on my account did download correctly and is running, so results will come in eventually.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42415 - Posted: 14 Jun 2011, 7:38:53 UTC

That's not related to this thread. It's just a 'phantom' task, which most people get sooner or later when the servers are under load.
Just ignore it.



Backups: Here

