Questions and Answers : Unix/Linux : Workunit uses lots of disk space+restart/recover the same task
Author | Message |
---|---|
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Dear all, I recently resumed contributing to CPDN with my computer, so I'm still a bit rusty and need a refresher. It would be great if someone could help me. One of the problems I'm encountering is that my current model, UK Met Office HadAM3P (global only) with MOSES II landsurface scheme (workunit link: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=9546200), occupies 3.1 GB of hard disk space in hadam3pm2_k243_1959_10_009463966/dataout at 6% progress. Shouldn't it be much less? Update: While writing this I accidentally deleted the model directory, and BOINC exited with an error when it tried to write at the CPU checkpoint. I recovered the folder, but it seems I cannot restart the same task. Can I? Any suggestions? Should I simply delete all the files and start a new task? Cheers |
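For anyone who wants to check where the space in a model folder is going, here is a minimal Python sketch. It only walks a directory and sums file sizes; the function names `dir_size_bytes` and `largest_files` are made up for this illustration, and the path you pass in is whatever your own model folder happens to be.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a linked file is not counted as extra space.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_files(path, n=5):
    """Return up to n (size_in_bytes, file_path) pairs, largest first."""
    sizes = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                sizes.append((os.path.getsize(full), full))
    # Sort by size, descending, and keep only the top n entries.
    sizes.sort(key=lambda pair: pair[0], reverse=True)
    return sizes[:n]
```

Calling `dir_size_bytes("hadam3pm2_k243_1959_10_009463966/dataout") / 1e9` from the BOINC project directory would give the size in GB, and `largest_files(...)` shows which output files account for most of it.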
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
I'm not sure of the typical size of these folders as I am away from my computers now. However, the model folders do get big. Unfortunately you cannot recover your model despite restoring that folder. Your client_state.xml file no longer contains info about it since BOINC thought that task errored out. Go ahead and delete that folder and start a new task. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Thanks. Let's see how it goes. My other Linux machine errored out earlier on another hadam3p model. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
If you don't like the idea of errors, I'd go run the hadcm3s models as opposed to the MOSES ones. The MOSES ones hate to be interrupted for any reason. Which kind of goes against the idea of distributed computing running science apps when your computer is otherwise idle. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
|
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Thanks, but I still use these machines and need to shut them down almost every day. So I wait for a CPU checkpoint, then suspend, exit, and shut down. I mean, I can't leave a machine running at 100% CPU for >450 h to finish hadam3p models uninterrupted?! On a laptop, CPUs running at 100% all the time get way too hot, and I try to give CPDN some CPU time while I'm working, as there is not much idle time on these machines. I read some time ago that there is no way to make some of these models smaller and less error-prone. But if I manage to complete less than 20% of tasks, then 80% of the computing time is just lost. Nevertheless, I will set some of the preferences as suggested. Thanks, mates |
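The suspend-then-exit routine described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the `boinccmd` command-line tool is installed, on the PATH, and able to talk to the local client, and it does not check whether a checkpoint has just been written, so that wait is still up to you. The function names are invented for this example.

```python
import subprocess


def shutdown_commands(boinccmd="boinccmd"):
    """Return the boinccmd invocations for suspending and then stopping the client."""
    return [
        # Suspend all computation first, so the models are not cut off mid-step.
        [boinccmd, "--set_run_mode", "never"],
        # Then ask the client itself to exit.
        [boinccmd, "--quit"],
    ]


def run_shutdown(dry_run=True, boinccmd="boinccmd"):
    """Print the commands (dry_run=True) or execute them in order."""
    commands = shutdown_commands(boinccmd)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if boinccmd reports a failure.
            subprocess.run(cmd, check=True)
    return commands
```

Run `run_shutdown()` first to see what would be executed, and `run_shutdown(dry_run=False)` to actually do it. Because the run mode is set without a duration, it stays on "never" after the next start, so `boinccmd --set_run_mode auto` is needed to resume computing.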