Laptop losing work when client is shut down

Questions and Answers : Windows : Laptop losing work when client is shut down
old_user8176

Joined: 1 Sep 04
Posts: 3
Credit: 21,635
RAC: 0
Message 6068 - Posted: 12 Nov 2004, 19:52:36 UTC

Hi,

I've read previous posts, etc., about this, but the issue is still unclear. I have a laptop which I run BOINC on, but it is often shut down or placed into hibernation. I have just noticed that my client is not progressing: every time I boot it up, the CPU time column shows the same value and my progress is reset. It's probably been doing this for months. Is there some way to get the client to save its current state when it's shut down? This honestly should be a pretty basic function for any distributed project. Do I just have a bad work unit?
old_user11965

Joined: 4 Sep 04
Posts: 61
Credit: 80,585
RAC: 0
Message 6070 - Posted: 13 Nov 2004, 0:17:21 UTC
Last modified: 13 Nov 2004, 0:19:04 UTC

CPDN is noted for having checkpoints that are very far apart. If you close BOINC often, it's possible that the project is being shut down before it gets to its next checkpoint, which would effectively nix any progress for the WU. If you're just suspending/hibernating the system, that shouldn't cause any trouble.

For the same reason, I also have my preferences set to keep applications in memory when paused.

BTW, just to be on the safe side, I manually exit BOINC when shutting down the system.

trane

old_user8176

Joined: 1 Sep 04
Posts: 3
Credit: 21,635
RAC: 0
Message 6071 - Posted: 13 Nov 2004, 1:00:50 UTC - in response to Message 6070.  

Hi Trane,

Thanks for your answer. I do manually exit BOINC in most cases, but it makes no difference to the CPU time lost. I wonder if an admin could answer my question about saving state. It must be possible to save state when shutting down, and recover when starting back up. If I have to let my clients run for significant portions of time or risk losing all the work simply because I shut the client off, then I'm afraid that I won't be able to run this project. Couldn't this functionality be added to a future version of BOINC?
old_user1244
Joined: 26 Aug 04
Posts: 13
Credit: 5,458
RAC: 0
Message 6072 - Posted: 13 Nov 2004, 3:21:12 UTC - in response to Message 6071.  

> no difference to CPU time lost. I wonder if an admin could answer my question
> about saving state. It must be possible to save state when shutting down,

Maybe you can contact Tolu about it: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1099

crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 6076 - Posted: 13 Nov 2004, 14:37:24 UTC

The checkpoints are every 144 timesteps. Depending upon the computer, this would typically be 5 to 15 minutes.
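To make the checkpoint spacing concrete, here is a minimal sketch of how much work an exit would throw away. It only assumes the 144-timestep interval stated above; the function names are made up for illustration and are not part of the CPDN or BOINC code.

```python
# Illustrative only: hypothetical helper names, not CPDN or BOINC code.
CHECKPOINT_INTERVAL = 144  # timesteps between checkpoints, as stated above


def last_checkpoint(step: int, interval: int = CHECKPOINT_INTERVAL) -> int:
    """Return the most recent timestep at which a checkpoint was written."""
    return (step // interval) * interval


def steps_lost_on_exit(step: int, interval: int = CHECKPOINT_INTERVAL) -> int:
    """Return how many timesteps must be recomputed after exiting at `step`."""
    return step - last_checkpoint(step, interval)
```

Exiting at timestep 1000, for example, would resume from timestep 864 and redo 136 timesteps, so the loss is always less than one full interval.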

If you are worried about these 5 to 15 minutes, it is best to set the preference to leave applications in memory while suspended to yes. Then, wherever possible, just suspend BOINC rather than exit.

5 to 15 minutes is not an awful lot to lose each time you have to exit rather than suspend. I suspect your problem is not the 5 to 15 minutes but that your work unit is returning to the beginning when it shouldn't.

If a problem is encountered, the client should retry three times: once from the last 144-timestep checkpoint, once from the beginning of the model month, and once from the beginning of the model year. Tracking down why some models go back to 0% would be ideal, but it takes a lot of time and there are lots of new things the CP team want to move on to doing.
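The retry order described above can be sketched roughly as follows. This is only an illustration of the fallback sequence, not the actual client code: the restart-point labels and the `run_from` callback are hypothetical names.

```python
from typing import Callable, Optional

# Restart points in the order described above (labels are hypothetical).
RESTART_POINTS = (
    "last_checkpoint",       # the most recent 144-timestep checkpoint
    "start_of_model_month",
    "start_of_model_year",
)


def retry_model(run_from: Callable[[str], bool]) -> Optional[str]:
    """Try each restart point in turn; return the first one that succeeds.

    `run_from` is an assumed callback that restarts the model from the
    given point and returns True if the run then proceeds without error.
    Returns None if all three attempts fail.
    """
    for point in RESTART_POINTS:
        if run_from(point):
            return point
    return None
```

Each fallback discards more work than the one before it, which is why the attempts go from the nearest restart point to the farthest.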
old_user8176

Joined: 1 Sep 04
Posts: 3
Credit: 21,635
RAC: 0
Message 6093 - Posted: 14 Nov 2004, 6:55:12 UTC - in response to Message 6076.  

Hi crandles,

Unfortunately, suspending the client doesn't help when you are shutting the machine off. 5 to 15 minutes might not seem like much on an individual basis, but from a broad view of the whole project, this seems like a significant design oversight. If you take a low estimate and assume that 10% of all machines running BOINC CPDN experience a shutdown or interruption each day, with the current number of machines, you're looking at anywhere from 300 to 900 CPU hours lost <i>each day</i>. If you consider that this estimate might be higher, say 25%, this could reach upwards of 2000 CPU hours lost each day. That's a lot of work.
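For anyone who wants to check the arithmetic, here is a rough sketch. The post does not give the machine count; the figure of 36,000 is an assumption, back-solved from the 300-hour estimate (10% of machines, 5 minutes lost each), and it also reproduces the 900-hour figure at 15 minutes.

```python
# Assumption: machine count back-solved from the 300 CPU-hour estimate
# (10% of machines interrupted, 5 minutes lost each); it is not stated above.
IMPLIED_MACHINES = 36_000


def cpu_hours_lost_per_day(machines: int,
                           fraction_interrupted: float,
                           minutes_lost: float) -> float:
    """CPU hours lost per day if each interrupted machine loses `minutes_lost`."""
    return machines * fraction_interrupted * minutes_lost / 60


for fraction in (0.10, 0.25):
    low = cpu_hours_lost_per_day(IMPLIED_MACHINES, fraction, 5)
    high = cpu_hours_lost_per_day(IMPLIED_MACHINES, fraction, 15)
    print(f"{fraction:.0%} interrupted: {low:.0f} to {high:.0f} CPU hours/day")
```

Under that assumed machine count, the 25% case comes to 750 to 2250 CPU hours per day, consistent with the "upwards of 2000" figure.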

I'm not trying to be difficult. As a long-time supporter of distributed computing projects (very long time), I'm just baffled as to why it was designed without some sort of state saving.
old_user11965

Joined: 4 Sep 04
Posts: 61
Credit: 80,585
RAC: 0
Message 6095 - Posted: 14 Nov 2004, 9:13:31 UTC - in response to Message 6093.  

> computing projects (very long time), I'm just baffled as to why it was
> designed without some sort of state saving.

I think it probably has something to do with CPDN under BOINC being a port of the stand-alone application. In such a scenario where CPDN wouldn't be competing for CPU time with other projects, the potential for losing CPU time is drastically reduced. I agree it could be greatly improved. Apparently, a future version of BOINC will take checkpoints into account when swapping processes, so we won't be losing work on that front. Unfortunately, I don't see much chance of solving the 144-timestep issue.

