Message boards : Number crunching : Resuming computation from Backup after WorkUnit Error ?
Author | Message |
---|---|
Send message Joined: 30 Aug 04 Posts: 77 Credit: 1,785,934 RAC: 0 |
Hi, I have one System that trashed two WorkUnits after considerable runtime due to a System failure. After that, I reverted to a functional Backup, which resumed computation from an early stage of both WorkUnits; they have run without problems so far. The only Problem: the Host had already reported the two WorkUnits as errored out before I could intervene. Question : Will the System realize that a backup is rerunning and eventually change WorkUnit status again, or is it futile, as the "Computing Error" has fixed the WorkUnit Status to "Over"? (The Host is sending its Trickles without any Error messages, thus I assume the server is still accepting them.) Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
"Question : Will the System realize that a backup is rerunning and eventually change WorkUnit status again ... ?" No. The work unit pages will show the first error received. BOINC doesn't really understand the concept of a backup - presumably because most BOINC projects have short work units that aren't worth backing up. CPDN has unusually long work units that are worth backing up. "(the Host is sending its Trickles without any Error messages, thus I assume the server is still accepting them)" Yes. Trickles and Zip file uploads will continue to be accepted by the server and credits awarded, as if there had never been a crash. |
Send message Joined: 30 Aug 04 Posts: 77 Credit: 1,785,934 RAC: 0 |
Cool, then I'll keep everything running, thanks for the quick Info :) Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
Send message Joined: 28 Nov 06 Posts: 89 Credit: 12,023,653 RAC: 4,025 |
I accidentally detached my host from CPDN (an unfortunate "experiment" with BAM). I had 2 tasks in progress; both are now marked "Over, Client detached". I restored the tasks from backup and am now trying to continue them. Questions: Is my attempt correct? Is it possible to finish my tasks? Is it the same as restore after "Client error, Compute error"? Will the project server accept the trickles etc.? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"Is my attempt correct?" Yes. "Is it possible to finish my tasks?" Yes. "Is it the same as restore after 'Client error, Compute error'?" Yes. "Will the project server accept the trickles etc.?" Yes. |
Send message Joined: 28 Nov 06 Posts: 89 Credit: 12,023,653 RAC: 4,025 |
19/02/2009 16:56:43|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks 19/02/2009 16:56:48|climateprediction.net|Scheduler request succeeded: got 0 new tasks The trickle was sent without an error message, but I don't see it on the task's page. -Edit- Les, you were faster than me! :-) Thank you! |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Please be patient. It sometimes takes a while for trickles to show up if the server is busy. |
Send message Joined: 28 Nov 06 Posts: 89 Credit: 12,023,653 RAC: 4,025 |
It is a deal! I will be patient. :-) Thank You again. |