climateprediction.net (CPDN) home page
Thread 'Resuming same project after crash'

old_user107231

Joined: 9 Nov 05
Posts: 1
Credit: 1,181
RAC: 0
Message 17581 - Posted: 30 Nov 2005, 12:53:08 UTC

My computer crashed recently, so the calculation was aborted. When I restarted the computer, it did not resume the same work but requested a new work unit from CPDN. Is the data from this run (2 GB by now) lost, or can I make my computer resume the run it had started?

Thanks for your assistance
Stef
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 17584 - Posted: 30 Nov 2005, 14:03:16 UTC

I'm pretty sure the only way you would be able to resume working on that work unit is if you had a complete backup of the BOINC folder from before the crash. Otherwise, I don't think you can recover it.
old_user128789

Joined: 4 Dec 05
Posts: 1
Credit: 41,519
RAC: 0
Message 18415 - Posted: 19 Dec 2005, 18:34:42 UTC - in response to Message 17584.  

I'm pretty sure the only way you would be able to resume working on that work unit is if you had a complete backup of the BOINC folder from before the crash. Otherwise, I don't think you can recover it.


What constitutes a full backup? I have what appears to be an intact folder for the stopped run. Beyond that, CPDN appears to make four or five complete downloads per project. As long as they don't overwrite what was there before, one would think that the intact data of a discontinued run would be interesting, at the very least, for debugging purposes. More to the point, though, we have a discontinued analysis here. Shouldn't CPDN want to know, at the very least, that a particular analysis will never be received?

Davis
crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 18421 - Posted: 19 Dec 2005, 19:20:08 UTC

A complete backup isn't quite the only way to recover, but what you do need is some information about the old run from an XML file (client_state.xml or client_state_prev.xml, I think). If BOINC has started and stopped too many times since the crash, this information is lost, and the directory from the crashed run is not enough to get it started again.

I don't know the full details of doing this, but if you can save a copy of the XML file with that information, do post again and hopefully someone who knows the details will be able to help.
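
For anyone who wants to grab those copies before BOINC starts and stops again, here is a minimal Python sketch. It only assumes the two file names mentioned above; the helper name `backup_state_files`, the folder layout, and the timestamped naming are hypothetical, and you have to supply the path to your own BOINC folder. Ideally run it while BOINC is not running.

```python
import shutil
import time
from pathlib import Path

# File names are the ones mentioned in this thread; everything else
# (function name, folder naming) is a hypothetical illustration.
STATE_FILES = ("client_state.xml", "client_state_prev.xml")


def backup_state_files(boinc_dir, backup_root):
    """Copy whichever state files exist into a new timestamped folder."""
    boinc_dir = Path(boinc_dir)
    target = Path(backup_root) / time.strftime("state-%Y%m%d-%H%M%S")
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in STATE_FILES:
        src = boinc_dir / name
        if src.is_file():
            # copy2 also preserves the file's modification time
            shutil.copy2(src, target / name)
            copied.append(target / name)
    return copied
```

Calling `backup_state_files("<your BOINC folder>", "<somewhere safe>")` returns the list of copies it made, so an empty list means neither file was found in the folder you pointed it at.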

If it cannot be restarted the lack of trickles will eventually cause it to be automatically noted that no result will be forthcoming.

Sorry if this isn't an answer you wanted to hear.
Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 18425 - Posted: 19 Dec 2005, 20:31:11 UTC
Last modified: 19 Dec 2005, 20:32:13 UTC

The critical things that are lost when a crashed model uploads are the result zip file downloaded from the scheduler, the <active_task> section for the result in client_state.xml and the contents of the result's links in the slots directory.

If the result has been reported, you will also lose the <file_info>, <workunit> and <result> sections for the result from client_state.xml.

All of these can be recovered from backups, but additional editing will be required if BOINC has started another result using the slot number the CPDN result had been using.
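
To see which of those sections a saved copy of client_state.xml still holds, a rough Python sketch like the one below may help. The four section names are the ones listed above; the function name, the idea of matching on a fragment of the workunit name, and the sample names are all assumptions for illustration, and it also assumes the file parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# Section names as listed in this post.
SECTIONS = ("active_task", "file_info", "workunit", "result")


def sections_mentioning(xml_text, name_fragment):
    """Count, per section type, the sections whose text mentions name_fragment."""
    root = ET.fromstring(xml_text)
    counts = {tag: 0 for tag in SECTIONS}
    for tag in SECTIONS:
        for elem in root.iter(tag):
            # Look at the section itself and everything nested inside it.
            if any(name_fragment in (child.text or "") for child in elem.iter()):
                counts[tag] += 1
    return counts


# Made-up miniature state file; real files are much larger and the
# names here are hypothetical.
sample = """<client_state>
  <workunit><name>hadsm3_abc</name></workunit>
  <result><name>hadsm3_abc_0</name><wu_name>hadsm3_abc</wu_name></result>
  <active_task_set>
    <active_task><result_name>hadsm3_abc_0</result_name><slot>0</slot></active_task>
  </active_task_set>
</client_state>"""

print(sections_mentioning(sample, "hadsm3_abc"))
# {'active_task': 1, 'file_info': 0, 'workunit': 1, 'result': 1}
```

A zero for a section type means that piece would have to come from a backup. The sketch only reports what is present; it does not attempt the editing described above.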

Some (end of phase?) failures can also cause the result's directory structure to be flattened out, leaving a set of post-processed and compressed files (these files are stored in the dataout sub-directory while the result is in progress). If this happens the only way to recover the result is by emptying its directory and restoring it from a backup.
