Message boards : Number crunching : CPDN crashing on completion
Author | Message |
---|---|
Joined: 2 May 07 Posts: 20 Credit: 657,542 RAC: 0 |
Hi, folks.

10/12/2007 22:39:14|climateprediction.net|Computation for task hadsm3fub_0361_005911833_8 finished
10/12/2007 22:39:14|climateprediction.net|Output file hadsm3fub_0361_005911833_8_3.zip for task hadsm3fub_0361_005911833_8 absent

Does anyone know what has happened here, please? How can it be avoided in future? And can the unit be resurrected?

Cheers |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It sounds a little like you may have done an Update shortly after the zip upload to get it to "Report". If this message is received too early, the zip might still be on the Upload server, waiting to be transferred to the storage server. Hence the message (paraphrased): "What are you talking about? There's no final zip file here." If you DIDN'T click on Update, then I'm not sure what happened. As for recovering the unit, the usual advice applies: only by rerunning the last bit of the model from a backup made before the model finished. Backups: Here |
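For the backup advice above, here is a minimal sketch of making a dated copy of the BOINC data directory. It assumes BOINC has been fully shut down first, since copying while a model is running can capture a half-written checkpoint. The function name `backup_boinc_data` and any paths passed to it are hypothetical; substitute wherever your BOINC data actually lives.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_boinc_data(data_dir, backup_root, now=None):
    """Copy data_dir into a new timestamped folder under backup_root.

    Assumption: the BOINC client is not running while this executes.
    Returns the path of the new backup folder.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"BOINC data directory not found: {src}")
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"boinc-backup-{stamp}"
    # copytree creates dest (and any missing parents) and raises if dest
    # already exists, so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

Restoring would be the reverse: with BOINC stopped, copy a backup folder back over the data directory and restart the client. A backup is only useful for this purpose if it was taken before the model finished or crashed.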
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
This was the result being talked about: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6981207

I think it crashed just prior to the end? The last successful trickle was phase 3, 248,446 (very near the end but still a few hours to go). And the error message was an exit code 3:

<core_client_version>5.10.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Not a JPEG file: starts with 0x01 0xda
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...

What was happening on the PC at 22:42 yesterday? Where is the model installed: on your PC's local hard disk, or on a network / removable disk / USB key / etc.? (Just guessing from the error message.)

As Les says, if you have a recent backup, you could resume this model.

I'm a volunteer and my views are my own. News and Announcements and FAQ |
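The number 3 in that message matches the Windows system error ERROR_PATH_NOT_FOUND, whose standard text is "The system cannot find the path specified." The text may simply be a generic lookup of the number, so treat the path hint as a clue rather than a diagnosis. A small sketch for pulling the exit code out of a result's message text follows; the function name `extract_exit_code` and the lookup table are illustrative and not part of BOINC.

```python
import re

# A few Windows system error codes and their standard texts.
# Illustrative subset only; this is not a BOINC data structure.
WINDOWS_ERRORS = {
    2: "The system cannot find the file specified.",
    3: "The system cannot find the path specified.",
    5: "Access is denied.",
}

# Matches the "exit code 3 (0x3)" form seen in the result page above.
EXIT_CODE_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def extract_exit_code(message_text):
    """Return the decimal exit code found in a result message, or None."""
    match = EXIT_CODE_RE.search(message_text)
    if match is None:
        return None
    return int(match.group(1))
```

Called on the message quoted above, `extract_exit_code` returns 3, which can then be looked up in `WINDOWS_ERRORS`.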
Joined: 2 May 07 Posts: 20 Credit: 657,542 RAC: 0 |
Thanks, Les & Mike.

At 22:42 I did do an Update, but the model had crashed by then. It had come up with a computation error message and a time error. The time to go had stuck at 3 seconds - so nearly there! However, it had been attempting to trickle up between 22:29 & 22:38 and then a new model started at 22:38. Could it be that BOINC started the new model because of a time error on the trickle up and then the crash occurred as a consequence of the 2 models overloading my PC?

BOINC/CPDN runs on my PC (Vista with dual-core 3800+). I haven't been doing backups and the PC had been running for 6 days continuously.

10/12/2007 22:29:29|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks
10/12/2007 22:29:35|climateprediction.net|Scheduler request succeeded: got 0 new tasks
10/12/2007 22:29:50|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks
10/12/2007 22:29:55|climateprediction.net|Scheduler request succeeded: got 0 new tasks

Lots more of these every 5/6 seconds until ........

10/12/2007 22:38:01|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks
10/12/2007 22:38:06|climateprediction.net|Scheduler request succeeded: got 0 new tasks
10/12/2007 22:38:54|climateprediction.net|Starting hadsm3fub_0305_005913862_6
10/12/2007 22:38:54|climateprediction.net|Starting task hadsm3fub_0305_005913862_6 using hadsm3 version 506
10/12/2007 22:39:07|lhcathome|Sending scheduler request: To fetch work. Requesting 158849 seconds of work, reporting 0 completed tasks
10/12/2007 22:39:12|lhcathome|Scheduler request succeeded: got 0 new tasks
10/12/2007 22:39:14|climateprediction.net|Computation for task hadsm3fub_0361_005911833_8 finished
10/12/2007 22:39:14|climateprediction.net|Output file hadsm3fub_0361_005911833_8_3.zip for task hadsm3fub_0361_005911833_8 absent
10/12/2007 22:39:14|SETI@home|Resuming task 16no06aa.7435.8661.10.6.137_1 using setiathome_enhanced version 527
10/12/2007 22:40:19|World Community Grid|Resuming task dddt0201m0751_ZINC00068317-0000_00_0 using dddt version 510
10/12/2007 22:40:19|World Community Grid|Resuming task dddt0201m0754_ZINC05090329-0000_00_0 using dddt version 510
10/12/2007 22:40:28|World Community Grid|Resuming task dddt0201m0754_ZINC04687985-0000_00_0 using dddt version 510
10/12/2007 22:41:43|rosetta@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 2 completed tasks
10/12/2007 22:41:48|rosetta@home|Scheduler request succeeded: got 0 new tasks
10/12/2007 22:42:33|climateprediction.net|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 1 completed tasks
10/12/2007 22:42:38|climateprediction.net|Scheduler request succeeded: got 0 new tasks

Regards
Mike |
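Message log lines in the `timestamp|project|message` form quoted above are easy to pick apart when checking how often the client was contacting the scheduler. The following is a rough sketch only: the helper names `parse_line` and `request_gaps` are made up for illustration, and the day/month/year date order is an assumption (reading 10/12/2007 as 10 December 2007).

```python
from datetime import datetime

# Assumption: dates are day/month/year. Use "%m/%d/%Y %H:%M:%S" instead
# if your client writes the month first.
TIMESTAMP_FORMAT = "%d/%m/%Y %H:%M:%S"


def parse_line(line):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, message = line.strip().split("|", 2)
    return datetime.strptime(stamp, TIMESTAMP_FORMAT), project, message


def request_gaps(lines, project):
    """Seconds between consecutive scheduler requests sent for one project."""
    times = [
        when
        for when, proj, message in map(parse_line, lines)
        if proj == project and message.startswith("Sending scheduler request")
    ]
    return [
        int((later - earlier).total_seconds())
        for earlier, later in zip(times, times[1:])
    ]


# The first four log lines from the post above.
LOG = [
    "10/12/2007 22:29:29|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks",
    "10/12/2007 22:29:35|climateprediction.net|Scheduler request succeeded: got 0 new tasks",
    "10/12/2007 22:29:50|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks",
    "10/12/2007 22:29:55|climateprediction.net|Scheduler request succeeded: got 0 new tasks",
]

print(request_gaps(LOG, "climateprediction.net"))  # prints [21]
```

On those four lines the two trickle-up requests are 21 seconds apart, and each reply arrives 5 or 6 seconds after its request. Running the same function over the full log would show whether that spacing held for the whole 22:29 to 22:38 period.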
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
The scheduler messages might indicate that it's something to do with networking? There is a bug in BOINC which can cause crashes if the local network fails (e.g., when dialing in, or a firewall crash). The second model would have started to download and run when the first crashed. I'm a volunteer and my views are my own. News and Announcements and FAQ |
©2024 cpdn.org