climateprediction.net home page
Lost Work Unit

Questions and Answers : Windows : Lost Work Unit

old_user7012
Joined: 31 Aug 04
Posts: 2
Credit: 296,668
RAC: 0
Message 3959 - Posted: 12 Sep 2004, 12:17:37 UTC

My computer crashed because of a driver problem after I installed XP SP2. When this happened, BOINC seems to have reset my projects and lost the climateprediction work unit. The 'missing' work unit is still on my computer, but it no longer appears in the BOINC application.

How can I reattach the work unit stored on my computer so that it can be completed rather than lost?

I use a Dell Inspiron 9100 with a P4 3 GHz dual processor, XP Home Edition with SP2, and 512 MB RAM. I currently run BOINC 4.05 with the SETI@home and climateprediction.net projects.

ID: 3959
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 3964 - Posted: 12 Sep 2004, 13:33:05 UTC

hmmm, it sounds like the client_state.xml file in boinc got messed up. There may be a backup file (named "client_state_prev.xml") that you could experiment with copying over?
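
As a rough illustration of the "copying over" experiment, here is a minimal Python sketch. It assumes both files sit in the same BOINC directory and that the client has been shut down first. The function name and the `client_state.xml.saved` safety-copy name are made up for this example and are not part of BOINC.

```python
import shutil
from pathlib import Path


def restore_previous_state(boinc_dir):
    """Copy client_state_prev.xml over client_state.xml, keeping a safety copy.

    Returns the path of the safety copy, or None if there was no current
    state file. Raises FileNotFoundError if no backup file exists.
    """
    boinc_dir = Path(boinc_dir)
    current = boinc_dir / "client_state.xml"
    previous = boinc_dir / "client_state_prev.xml"
    if not previous.is_file():
        raise FileNotFoundError(f"no backup found at {previous}")

    saved = None
    if current.is_file():
        # Hypothetical name for the safety copy, so the experiment is reversible.
        saved = boinc_dir / "client_state.xml.saved"
        shutil.copy2(current, saved)

    # Overwrite the current state with the backup.
    shutil.copy2(previous, current)
    return saved
```

If the experiment does not help, copying the safety copy back over client_state.xml undoes it.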

ID: 3964
old_user7012
Joined: 31 Aug 04
Posts: 2
Credit: 296,668
RAC: 0
Message 3982 - Posted: 12 Sep 2004, 22:28:22 UTC - in response to Message 3964.  

> hmmm, it sounds like the client_state.xml file in boinc got messed up. There
> may be a backup file (named "client_state_prev.xml") that you could experiment
> with copying over?
>

No can do: the client_state file is saved as the 'backup' client_state_prev each time before the client_state file is rewritten as a result of workunit activity, so by now the backup no longer holds the lost workunit either.
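
To make that reasoning concrete, here is a toy Python model of the save sequence as I understand it. It is only a sketch of the behaviour described above, not BOINC's actual code. The `write_state` helper and the `cpdn_model` name are invented for the illustration.

```python
def write_state(files, new_state):
    """Model one save: the current state becomes the backup, then the new state is written."""
    if "client_state.xml" in files:
        files["client_state_prev.xml"] = files["client_state.xml"]
    files["client_state.xml"] = new_state


files = {}
write_state(files, ["cpdn_model"])  # normal running: the model is listed
write_state(files, [])              # first save after the reset: model gone from the current file
after_one = dict(files)             # the backup still lists the model at this point
write_state(files, [])              # next save: the backup is overwritten as well
after_two = dict(files)             # now neither file lists the model
```

In this model the backup is only useful in the short window before the second save after the reset.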

Besides, the workunit zip files that are downloaded are protected in the client_state file through checksums and file signatures. Perhaps this means the work unit is 'lost' forever and should be removed from my computer, or perhaps the climateprediction administrators could create some sort of file sequence that could be used to reload the workunit into the BOINC application?
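
For anyone who wants to check whether a file left on disk still matches a recorded checksum, here is a minimal Python sketch. It assumes the checksum is an MD5 hex digest, which is my assumption and not something confirmed here. The function names are made up for the example. A matching checksum would not recreate the file signature or the entry in the state file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_checksum(path, recorded_md5):
    """Compare a file's digest with a recorded value (assumed to be MD5 hex)."""
    return md5_of_file(path) == recorded_md5.strip().lower()
```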

I believe the client_state file was not necessarily corrupted, but rather suffered from a BOINC reset that cleared all workunits from the file. I also lost around 20 SETI@home workunits.

ID: 3982
old_user16851
Joined: 12 Sep 04
Posts: 1
Credit: 344,595
RAC: 0
Message 5150 - Posted: 8 Oct 2004, 22:39:19 UTC - in response to Message 3982.  

Right, I've got the same problem. Two models were lost on one machine due to a PC crash.
Can't the admin recreate the signatures and checksums?



> > hmmm, it sounds like the client_state.xml file in boinc got messed up. There
> > may be a backup file (named "client_state_prev.xml") that you could
> > experiment with copying over?
>
> No can do, as the client_state file is saved as the 'backup' client_state_prev
> before the client_state file is written to as a result of workunit
> activities.
>
> Besides, the workunit zip files that are downloaded are protected in the
> client_state file through checksums and file signatures. Perhaps this means
> the work unit is 'lost' forever and should be removed from my computer, or
> perhaps the climateprediction administrators could create some sort of file
> sequence that could be used to reload the workunit into the Boinc application?
>
> I believe the client_state file was not necessarily corrupted, but rather
> suffered from a Boinc Reset that cleared this file for all workunits - I also
> lost around 20 SETI@home workunits as well.
ID: 5150
old_user169
Joined: 5 Aug 04
Posts: 39
Credit: 87,633
RAC: 0
Message 5189 - Posted: 10 Oct 2004, 17:40:20 UTC - in response to Message 3964.  
Last modified: 10 Oct 2004, 17:44:00 UTC

> hmmm, it sounds like the client_state.xml file in boinc got messed up. There
> may be a backup file (named "client_state_prev.xml") that you could experiment
> with copying over?


Similar problem, different question:

Something similar happened to me while attaching to S@H: the system froze and I had to reinstall BOINC. Neither client_state.xml nor client_state_prev.xml contained the two climate models any more (it's a 2-CPU machine), so they are lost for this run.
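
For anyone wanting to run the same check, here is a small Python sketch that lists the workunit names recorded in each state file. It assumes the files are well-formed XML with `<workunit>` elements that each contain a `<name>` child. That matches my reading of the file, but treat the element names as assumptions. The function names are made up for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def workunit_names(state_file):
    """Return the names of all <workunit> entries found in a state file."""
    root = ET.parse(str(state_file)).getroot()
    names = []
    for workunit in root.iter("workunit"):
        name = workunit.findtext("name")
        if name:
            names.append(name.strip())
    return names


def report(boinc_dir):
    """Map each state file that exists to its workunit names (None if unparseable)."""
    result = {}
    for filename in ("client_state.xml", "client_state_prev.xml"):
        path = Path(boinc_dir) / filename
        if not path.is_file():
            continue
        try:
            result[filename] = workunit_names(path)
        except ET.ParseError:
            result[filename] = None  # file exists but is not valid XML
    return result
```

An empty list for both files would confirm that neither one still refers to the models.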

Now the question:

Wouldn't it make sense for a project with long-running work units to find a way to invalidate them? As things work now, the first chance for those models to be sent out again will be in about one year, so the result might arrive in maybe two years. That needs a lot of patience from the project scientists :-/
ID: 5189
Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 5191 - Posted: 10 Oct 2004, 18:56:51 UTC - in response to Message 5189.  
Last modified: 10 Oct 2004, 18:57:29 UTC

> Wouldn't it make sense for a project with long running work units to find a
> way to invalidate them? If it works like it does now, the first chance that
> those models will be delivered again will be in about one year so the result
> might arrive in maybe two years - needs a lot of patience for the project
> scientists :-/

Where BOINC returns an error to the server, this is recorded and the WU can be re-sent.

The problem is with WUs that have been lost in virtual space for whatever reason, and it is precisely this lack of information that causes the difficulty. For example, a question has just been raised on the PHP board about whether it would be possible to run a WU without any internet connection at all (apart, of course, from the download at the start and the upload at the end). It wasn't explained why, but there was presumably a good reason for the query. Although the project treats as active only those machines that have trickled in the past two weeks, there are those who run on machines that don't normally connect to the internet, and those who want to run the model on slow machines that are not running 24/7 (which is the reason for the very long deadline).

I suppose that given the very large number of permutations of the experiment, and the impracticability of covering them all, the scientists are fairly relaxed about losing a few.
ID: 5191
