Laptop losing work when client is shut down
Author | Message |
---|---|
Send message Joined: 1 Sep 04 Posts: 3 Credit: 21,635 RAC: 0 |
Hi, I've read previous posts, etc., about this, but the issue is still unclear. I have a laptop which I run BOINC on, but it is often shut down or placed into hibernation. I have just noticed that my client is not progressing: every time I boot it up, the CPU time column shows the same value and my progress is reset. It's probably been doing this for months. Is there some way to get the client to save its current state when it's shut down? This honestly should be a pretty basic function for any distributed project. Do I just have a bad work unit? |
Send message Joined: 4 Sep 04 Posts: 61 Credit: 80,585 RAC: 0 |
CPDN is noted for having checkpoints that are very far apart. If you close BOINC often, it's possible that the project is being shut down before it gets to its next checkpoint, which would effectively nix any progress for the WU. If you're just suspending/hibernating the system, that shouldn't cause any trouble. For the same reason, I also have my preferences set to keep applications in memory when paused. BTW, just to be on the safe side, I manually exit BOINC when shutting down the system. trane |
Send message Joined: 1 Sep 04 Posts: 3 Credit: 21,635 RAC: 0 |
Hi Trane, Thanks for your answer. I do manually exit BOINC in most cases, but it makes no difference to CPU time lost. I wonder if an admin could answer my question about saving state. It must be possible to save state when shutting down, and recover when starting back up. If I have to let my clients run for significant portions of time or risk losing all the work simply because I shut the client off, then I'm afraid that I won't be able to run this project. Couldn't this functionality be added to a future version of BOINC? |
Send message Joined: 26 Aug 04 Posts: 13 Credit: 5,458 RAC: 0 |
> no difference to CPU time lost. I wonder if an admin could answer my question > about saving state. It must be possible to save state when shutting down, <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1099">Maybe you can contact Tolu about it</a>. |
Send message Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
The checkpoints are every 144 timesteps. Depending upon the computer, this would typically be 5 to 15 minutes. If you are worried about this 5 to 15 minutes, then it is best to set the preference to leave applications in memory while suspended to yes. Then, wherever possible, just suspend BOINC rather than exit. 5 to 15 minutes is not an awful lot to lose each time you have to exit rather than suspend. I suspect your problem is not the 5 to 15 minutes but perhaps that your work unit is returning to the beginning when it shouldn't. If a problem is encountered, the client should retry 3 times: once from the last 144-timestep checkpoint, once from the beginning of the model month, and once from the beginning of the model year. Tracking down why some models go back to 0% would be ideal, but it takes a lot of time and there are lots of new things the CP team want to move on to doing. |
Send message Joined: 1 Sep 04 Posts: 3 Credit: 21,635 RAC: 0 |
Hi crandles, Unfortunately, suspending the client doesn't help when you are shutting the machine off. 5 to 15 minutes might not seem like much on an individual basis, but from a broad view of the whole project, this seems like a significant design oversight. If you take a low estimate and assume that 10% of all machines running BOINC CPDN experience a shutdown or interruption each day, with the current number of machines, you're looking at anywhere from 300 to 900 CPU hours lost <i>each day</i>. If you consider that this estimate might be higher, say 25%, this could reach upwards of 2000 CPU hours lost each day. That's a lot of work. I'm not trying to be difficult. As a long-time supporter of distributed computing projects (very long time), I'm just baffled as to why it was designed without some sort of state saving. |
Send message Joined: 4 Sep 04 Posts: 61 Credit: 80,585 RAC: 0 |
> computing projects (very long time), I'm just baffled as to why it was > designed without some sort of state saving. I think it probably has something to do with CPDN under BOINC being a port of the stand-alone application. In such a scenario where CPDN wouldn't be competing for CPU time with other projects, the potential for losing CPU time is drastically reduced. I agree it could be greatly improved. Apparently, a future version of BOINC will take checkpoints into account when swapping processes, so we won't be losing work on that front. Unfortunately, I don't see much chance of solving the 144-timestep issue. |
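
For anyone trying to picture the checkpoint behaviour crandles describes, here is a rough Python sketch. It only assumes what the thread states (a checkpoint every 144 timesteps, which takes roughly 5 to 15 minutes, and three retry points). The function names and the `month_start` / `year_start` arguments are made up for illustration and are not part of BOINC or the CPDN model code.

```python
# Illustrative only: models the checkpoint spacing described in this thread.
# None of these names come from BOINC or the CPDN application.

CHECKPOINT_INTERVAL = 144  # timesteps between checkpoints (figure quoted in the thread)


def last_checkpoint(timestep: int, interval: int = CHECKPOINT_INTERVAL) -> int:
    """Most recent checkpointed timestep at or before `timestep`."""
    return (timestep // interval) * interval


def timesteps_lost(timestep: int, interval: int = CHECKPOINT_INTERVAL) -> int:
    """Timesteps that must be redone if the client exits at `timestep`."""
    return timestep - last_checkpoint(timestep, interval)


def minutes_lost(timestep: int, minutes_per_checkpoint: float,
                 interval: int = CHECKPOINT_INTERVAL) -> float:
    """Wall-clock minutes redone, given how long one checkpoint interval takes."""
    return timesteps_lost(timestep, interval) * minutes_per_checkpoint / interval


def restart_candidates(timestep: int, month_start: int, year_start: int) -> list[int]:
    """Restart points in the order the thread says the client retries them.

    `month_start` and `year_start` are hypothetical inputs: the thread does not
    say how many timesteps make up a model month or year.
    """
    return [last_checkpoint(timestep), month_start, year_start]


if __name__ == "__main__":
    step = 1000
    print("last checkpoint:", last_checkpoint(step))
    print("timesteps lost:", timesteps_lost(step))
    print("minutes lost on a slow machine:", round(minutes_lost(step, 15.0), 1))
```

The worst case is exiting one timestep before a checkpoint, which costs almost the full 5 to 15 minutes. Exiting right on a checkpoint costs nothing.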
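
The CPU-hours estimate earlier in the thread can be checked with a few lines of Python. The post does not give the number of machines. The 36,000 used below is an assumption, back-derived from the quoted 300 to 900 hours at 10% and 5 to 15 minutes per interruption, so treat it as a placeholder rather than a project statistic.

```python
# Rough check of the "CPU hours lost each day" estimate from the thread.
# ASSUMED_MACHINES is NOT stated in the thread; it is inferred from the
# poster's own numbers (300 h = machines * 0.10 * 5 min / 60).

ASSUMED_MACHINES = 36_000


def cpu_hours_lost_per_day(machines: int, fraction_interrupted: float,
                           minutes_lost: float) -> float:
    """Machines interrupted per day times minutes lost each, in hours."""
    return machines * fraction_interrupted * minutes_lost / 60.0


if __name__ == "__main__":
    for fraction in (0.10, 0.25):
        low = cpu_hours_lost_per_day(ASSUMED_MACHINES, fraction, 5)
        high = cpu_hours_lost_per_day(ASSUMED_MACHINES, fraction, 15)
        print(f"{fraction:.0%} interrupted: {low:.0f} to {high:.0f} CPU hours/day")
```

Under that assumed machine count, the 10% case gives the 300 to 900 hour range from the post, and the 25% case tops out a little above 2000 hours. This also assumes exactly one interruption per affected machine per day.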