Backing up CPDN wu's.

Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 7554 - Posted: 24 Jan 2005, 19:22:55 UTC

Every day or so, when BOINC is either inactive, or working on another project, I make a copy of the CPDN folder which lives in the "Projects" folder.

The reason I started to do that is because I have had a couple of wu's compute for hundreds of hours before crashing. With a copy, I would then be able to restore the directory from the last save and resume crunching.

Question, is that going to cause any problems at CPDN? I'm wondering along lines like, CPDN has flagged wu abc as dead, then the next day, gets a trickle from it?
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 7554
crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 7556 - Posted: 24 Jan 2005, 20:10:03 UTC
Last modified: 24 Jan 2005, 20:10:32 UTC

>CPDN has flagged wu abc as dead, then the next day, gets a trickle from it?

That isn't a problem. However,

Thyme Lawn wrote

You'd have to replace the whole BOINC directory because all the information controlling the jobs run by BOINC is held in the client_state.xml file and slots directory.

see http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1559
Visit BOINC WIKI for help

And join BOINC Synergy for all the news in one place.
ID: 7556
Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 7585 - Posted: 25 Jan 2005, 11:19:34 UTC

The reason I asked yesterday is because my model crashed yesterday! I have the backup of the CPDN folder from the projects directory but did not know about the slots business.

Next question then, I can see a single xml file in "slots1", can I edit this file to get the saved model to restart, (at the moment, it says 100% complete and ready to report).
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 7585
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 7590 - Posted: 25 Jan 2005, 13:09:20 UTC - in response to Message 7585.  

> Next question then, I can see a single xml file in "slots1", can I edit this
> file to get the saved model to restart, (at the moment, it says 100% complete
> and ready to report).

It can be done (I did it a couple of days ago), but you do need a client_state.xml or client_state_prev.xml file that still has the file_info sections for the result files (they will only be accepted if the signed_xml and xml_signature blocks agree with what the server sent out).

If you've got a suitable client_state file I should be able to help you out. Take a copy of the file (just in case), PM me on the Team Picard site and then we can talk by email.

Ian
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 7590
Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 7928 - Posted: 28 Jan 2005, 16:59:51 UTC

I looked around and have been unable to find anything. The fault has cost me my CPDN unit, 2 complete but uncommunicated SETI wu's and a Predictor wu. :( :(

So for the future, I need to save projects->cpdn and what? If I save all the slots, if I get into trouble again and restore slots->* will I screw up SETI, Predictor or LHC?
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 7928
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 7938 - Posted: 28 Jan 2005, 17:49:55 UTC

Running multiple projects changes things a bit. I only run CPDN, so tend to forget things like that!

Your client_state.xml and projects/cpdn folder are certainly going to need backing up, but you're going to have to merge the CPDN data from the backed up client_state.xml into the current one. I'm going to do some playing around on a non-networked system at the weekend and will give an update on Monday.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 7938
Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 8052 - Posted: 29 Jan 2005, 14:26:32 UTC
Last modified: 29 Jan 2005, 14:31:17 UTC

Thanks. I'd appreciate knowing this. I've lost 2 CPDN wu's now with hundreds of hours on each.

BTW, I tried to register at your forums, but I got a rather splendid error!

>>>
Could not insert data into im_prefs table

DEBUG MODE

SQL Error : 1196 Warning: Some non-transactional changed tables couldn't be rolled back

INSERT INTO phpbb_im_prefs (user_id, themes_id) VALUES (1000308, 1)

Line : 291
File : /home/teampi/public_html/profilcp/profilcp_register.php

*** EDIT ***

Although when I tried to log in, it did so, so perhaps the error is unimportant!

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 8052
Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 8536 - Posted: 3 Feb 2005, 13:57:46 UTC
Last modified: 3 Feb 2005, 13:58:20 UTC

Anybody got any firm answers to this? I want to protect my CPDN units, but not mangle S@H, P@H or LHC.

I have everything running again (although there is no LHC work at the moment), and am storing the CPDN directory under "Projects", the 0 directory under "Slots" (this seems to be the one that has CPDN type stuff in it), and the client_state.xml file, which on my system is not in the slots structure but is one of many standalone files in the BOINC directory (and seems to be filled with details relating to all running projects).
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 8536
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 8570 - Posted: 3 Feb 2005, 18:19:10 UTC
Last modified: 3 Feb 2005, 18:52:46 UTC

Sorry - I managed to lose this thread after the weekend :(

It's best to stop BOINC (or at the very least suspend computation) before doing a backup to make sure you don't try to access volatile data.

If you're looking to cater for hard disk failure everything in the BOINC directory needs to be backed up.

For a CPDN specific backup to allow recovery of a model you only need to back up the following things:

1) the client_state.xml file in the BOINC directory

2) the xml and zip files in your projects/climateprediction.net directory

3) everything in the projects/climateprediction.net/dataout with a timestamp <= phist.year

4) the slots directory used by the CPDN jobs (I think the cpdnout* files are required for BOINC to pick up the result files and the init_data.xml file may have some part to play in when BOINC starts up)
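The four items above could be copied with a short script along these lines. This is only a rough sketch, not a tested CPDN tool: the function name backup_cpdn and its parameters are made up for illustration, the directory layout is assumed to be exactly as described in the list, and cutoff_mtime stands in for "the timestamp of phist.year" (read here as a file modification time, which is an interpretation). BOINC should be stopped before running anything like this.

```python
import shutil
from pathlib import Path

PROJECT = "climateprediction.net"  # project folder name as given in the list above


def backup_cpdn(boinc_dir, dest, slot="0", cutoff_mtime=None):
    """Copy the four listed items into dest and return the copied paths.

    Hypothetical helper: slot is the slots sub-directory used by the CPDN job,
    cutoff_mtime is the newest modification time to accept from dataout
    (None means copy everything in dataout).
    """
    boinc_dir, dest = Path(boinc_dir), Path(dest)
    project = boinc_dir / "projects" / PROJECT
    copied = []

    def copy(src):
        # Recreate the same relative layout under dest.
        target = dest / src.relative_to(boinc_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 keeps the file timestamps
        copied.append(target)

    # 1) client_state.xml in the BOINC directory
    state = boinc_dir / "client_state.xml"
    if state.is_file():
        copy(state)

    # 2) xml and zip files directly inside the project directory
    if project.is_dir():
        for f in sorted(project.iterdir()):
            if f.is_file() and f.suffix.lower() in (".xml", ".zip"):
                copy(f)

    # 3) dataout files no newer than the cutoff
    dataout = project / "dataout"
    if dataout.is_dir():
        for f in sorted(dataout.rglob("*")):
            if f.is_file() and (cutoff_mtime is None or f.stat().st_mtime <= cutoff_mtime):
                copy(f)

    # 4) everything in the slot directory used by the CPDN job
    slot_dir = boinc_dir / "slots" / slot
    if slot_dir.is_dir():
        for f in sorted(slot_dir.rglob("*")):
            if f.is_file():
                copy(f)

    return copied
```

A call might look like backup_cpdn("C:/BOINC", "D:/cpdn_backup", slot="0"), where both paths are placeholders for wherever BOINC and the backup live on a given machine.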

I've got another job that's due to finish this weekend and I'm hoping to play around with it to find out if the slots directory is really significant.

The real problems start if you ever need to restore a job. What it really needs is an intelligent tool that merges the relevant parts from the backed up client_state.xml file into the current one, deletes any files in the dataout directory that are newer than the newest file restored (to prevent any possibility of the results being corrupted) and sorts out the slots directory (if an error result has been uploaded BOINC will have reused the slot for another job, so a new one will have to be created).
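Of the three jobs such a tool would have, the dataout clean-up is the easiest to sketch. The Python below is only an illustration of that one step: prune_newer_files is a made-up name, "newer" is taken to mean a later file modification time, and it does nothing about merging client_state.xml or recreating a slot. It defaults to a dry run so nothing is deleted until the list has been checked.

```python
from pathlib import Path


def prune_newer_files(dataout, restored, dry_run=True):
    """Find files under dataout that are newer than the newest restored file.

    restored is the list of files that were just copied back from the backup.
    With dry_run=True (the default) the files are only reported; with
    dry_run=False they are deleted. Returns the list of affected paths.
    """
    dataout = Path(dataout)
    restored = [Path(p) for p in restored]
    if not restored:
        return []  # nothing restored, so there is no cutoff to compare against

    newest = max(p.stat().st_mtime for p in restored)
    restored_set = {p.resolve() for p in restored}

    stale = [
        f
        for f in sorted(dataout.rglob("*"))
        if f.is_file()
        and f.resolve() not in restored_set
        and f.stat().st_mtime > newest
    ]

    if not dry_run:
        for f in stale:
            f.unlink()
    return stale
```

Running it once with the default dry_run=True and reading the returned list before calling it again with dry_run=False seems the safer order, given that a wrongly deleted file could cost a model.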

And yes, I have been thinking about producing just such a tool - the only problem is finding the time to do it.

Edit 1. I hate this forum! Everything after the < in point 3 was deleted because I forgot about HTML tags until I'd posted, and now it doesn't maintain the message content when you go back. So I had to type in everything from that point on again! Grrrr!!!

Edit 2. And the edit form maintains the message content when you go back. I guess the behaviour of the posting form was changed to prevent people from making duplicate posts, but it didn't half annoy me this time!

Edit 3. Fortunately I'd taken a copy before submitting that last one, because the second edit had put < back into point 3 (instead of my &lt; tag) and everything after it got stripped back out again!!!
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 8570
Profile adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,060,823
RAC: 638
Message 8633 - Posted: 4 Feb 2005, 15:34:52 UTC

Okay, I am backing up the entire Projects->CPDN directory, minus the folders pertaining to old units which I've either deleted, or for my one completed unit, moved elsewhere.

I'm copying client_state.xml and for good measure client_state_prev.xml.

I'm copying slots->0 as this has the CPDN stuff in it.

So I should be okay then.

If you can formalise a specification for what needs to be done to build this intelligent backup tool, I'll write it if you don't have time. Or perhaps we could establish an open source type project to do this?

I have subscribed to this thread.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 8633
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 9075 - Posted: 10 Feb 2005, 11:09:56 UTC - in response to Message 8633.  

> I'm copying slots->0 as this has the CPDN stuff in it.

Tested with a WU on its completion. Everything runs fine with an empty slots directory until it comes to BOINC returning the results. If the cpdnout* files are missing an error gets returned even though the results were created correctly.
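Given that result, a quick check before relying on a backed up or restored slot might be worthwhile. The Python below is a minimal sketch: cpdnout_files is a made-up helper name, and the only thing assumed about the files is the "cpdnout" prefix mentioned above, since the exact names are not given here.

```python
from pathlib import Path


def cpdnout_files(slot_dir):
    """Return the sorted names of the cpdnout* files found in slot_dir."""
    return sorted(p.name for p in Path(slot_dir).glob("cpdnout*") if p.is_file())
```

An empty list for the slot the CPDN job uses would suggest the slot is incomplete and that BOINC may return an error when the model finishes.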

> If you can formalise a specification for what needs to be done to build this
> intelligent backup tool, I'll write it if you don't have time. Or perhaps we
> could establish an open source type project to do this?

I can certainly knock together a specification of what needs to be done and should be able to do some back-end coding - user interfaces aren't really my thing!
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 9075
