Thread 'progress back to 0%'

old_user127136

Joined: 3 Dec 05
Posts: 3
Credit: 681,745
RAC: 0
Message 18370 - Posted: 18 Dec 2005, 20:12:15 UTC
Last modified: 18 Dec 2005, 20:13:45 UTC

My machine is HT enabled, so I have two CP models running. This morning one was at 10% after 200+ hrs and the other at 8% after 150+ hrs. I just discovered that both processes now show 0%, with 189 hrs on the first and 138 hrs on the second. Is this a known problem? Will it affect my end result? I don't want to spend CPU time on number crunching and get results that are not going to help with the scientific research.

Any idea what went wrong? It all happened in just a few hours.

PS. Right after I posted this message, the progress percentage came back, though not to what it used to be: one is at 9%, the other at 7%. It looks like a data rollback or something close to that.
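
For a rough sense of scale, the figures in this post can be extrapolated linearly. The sketch below is only an illustration: the function name `estimate_total_hours` is hypothetical, and it assumes progress grows linearly with CPU time, which nothing in the thread confirms. The inputs are the lower bounds quoted above ("200+" and "150+" hours).

```python
def estimate_total_hours(elapsed_hours: float, progress_percent: float) -> float:
    """Linearly extrapolate total CPU hours from elapsed time and progress.

    Assumption: progress is proportional to CPU time spent.
    """
    if progress_percent <= 0:
        raise ValueError("progress must be above 0% to extrapolate")
    return elapsed_hours * 100.0 / progress_percent


# Lower-bound figures quoted in the post: (elapsed hours, progress in %).
for elapsed, progress in [(200, 10), (150, 8)]:
    total = estimate_total_hours(elapsed, progress)
    print(f"{progress}% after {elapsed} h: about {total:.0f} h total "
          f"({total / 24:.0f} days of CPU time)")
```

Under that assumption each model would need roughly 1,900 to 2,000 CPU hours, so a rewind of one or two percent costs tens of hours rather than the whole run.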
Arnaud

Joined: 3 Sep 04
Posts: 268
Credit: 256,045
RAC: 0
Message 18373 - Posted: 18 Dec 2005, 20:25:09 UTC

Yep,
When there is a problem, the models can rewind by one day, month, or year.
That is probably what happened, because your account doesn't show any errors.
You should see a change in the s/ts if the models have rewound.
Arnaud
old_user127136

Joined: 3 Dec 05
Posts: 3
Credit: 681,745
RAC: 0
Message 18375 - Posted: 18 Dec 2005, 20:41:55 UTC - in response to Message 18373.  
Last modified: 18 Dec 2005, 20:45:24 UTC

This is annoying. I just had a computation error with the 9% process. Now the result is completely ruined. I have to start with another WU. I wonder if I will ever be able to finish a single work unit.

SETI analyzes the data after the result from each work unit is uploaded. How does CP use the results here? Each WU takes thousands of CPU hours to complete. And from my experience with BOINC, computation errors are not that rare. What if I am never able to finish a single WU calculation? Will my CPU time be worth anything to the scientific research?

And how much disk space do I need to allocate for CP? I just tried to get a new work unit, but got this message instead:
12/18/2005 12:54:51 PM|climateprediction.net|Message from server: Not enough disk space (only 348.4 MB free for BOINC). Review preferences for maximum disk space used.

ONLY 348 MB free for BOINC?
racinjimy

Joined: 19 Apr 05
Posts: 53
Credit: 6,325,436
RAC: 0
Message 18376 - Posted: 18 Dec 2005, 20:56:27 UTC - in response to Message 18375.  

from the BOINC WIKI:
Sulphur Cycle

The Sulphur Cycle experiment is one of the Climateprediction.net Models. It was launched on 26th August 2005.

This will model sulphur in several different compound forms including sulphates, sulphuric acid, DMS and others.

For more about the science involved, see the Sulphur Cycle page.
More about running the model

CPDN places heavy demands on personal computers. This was demonstrated in Classic CPDN and in HadSM3 on BOINC. Sulphur Cycle continues the demand and adds significant new considerations.
Sulphur Cycle Time Steps require around 70% more CPU time as compared with HadSM3 Time Steps.
Sulphur Cycle adds two Phases to the Run, so the longer Time Steps and five Phases together total as much as 2.8 times longer than a HadSM3 Run on the same machine.
For CPDN-dedicated machines:
On an Athlon 64 3200+, in WinXP, running stand-alone, one Sulphur Cycle Model requires approximately 53 days (@24/7).
On an Athlon 64 3400+, in WinXP, running stand-alone, one Sulphur Cycle Model requires approximately 47 days (@24/7).
On an Athlon 64 3800+, in WinXP, running stand-alone, one Sulphur Cycle Model requires approximately 42 days (@24/7).
On a P4 3.0 in WinXP, running stand-alone, one Sulphur Cycle Model requires approximately 45 days (@24/7).
On a P4 3.0 in Linux, running stand-alone, one Sulphur Cycle Model requires approximately 39 days (@24/7).
On a P4 3.0 in Linux, running two Sulphur Cycle Models in parallel requires approximately 65 days (@24/7).
On a P4 3.0 in Win2K, running 1 Sulphur Cycle + 1 HadSM3, Sulphur Cycle requires approximately 79 days (28 for HadSM3, which makes Sulphur Cycle 2.8 times as long). [approx. 25 and 65 days in Linux]
On a P4 3.4 in Linux, running stand-alone, one Sulphur Cycle Model requires approximately 34 days (@24/7).
Sulphur Cycle requires much more disk space than the HadSM3 version. SC requires about 2.7 Gigabytes of Hard Disk Storage (about 5.4 Gigabytes for two concurrent Runs) in Phase 5 prior to the end.
At the end of a successful Run, there are 2,327 zipped files consuming 1GB of Disk space. Unless off-loaded, this 1GB per finished Model is added to the Hard Disk requirement for the next Run; a failed Run could leave considerably more than 1GB.
Post-Phase processing is more extensive than in HadSM3 and the results upload at the end of each Phase, rather than all at Run's end.
Given the length of the WU, backups are more important in Sulphur Cycle than in HadSM3. The BOINC directory should be backed-up periodically, near the end of each Phase as a minimum.
Note: backups take longer because of increased Folder size.
The application files consume +/- 35MB of download bandwidth. Modem users be aware...
The deadline has been set to be rather short - 5 months. This is because getting some results is a requirement for the coupled model hindcast which is planned for February 2006.

This is likely to cause a lot of computers to show as overcommitted and start crunching only the sulphur cycle model until it is finished. Results returned after the deadline should not be rejected.

If you have one computer and a few projects, you may want to consider whether you want to put this many resources into CPDN or not.

If you have several computers and a few projects you may want to consider putting the Sulphur Cycle on one (or a few) of the faster computers with little other work while setting other computers not to download CPDN work.

Many thanks to AstroWX for starting this information on the Beta site.
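
The run-time and disk figures quoted above can be cross-checked with a little arithmetic. This is a minimal sketch, not part of the wiki text: the function names are hypothetical, the three-Phase HadSM3 baseline is inferred from "adds two Phases ... five Phases", and the Phases are assumed to be of equal length.

```python
TIMESTEP_COST_RATIO = 1.7  # "around 70% more CPU time" per Time Step
SULPHUR_PHASES = 5         # stated in the wiki text
HADSM3_PHASES = 3          # assumption: five Phases minus the two added ones

PEAK_GB_PER_RUN = 2.7      # peak usage in Phase 5, per concurrent Run
GB_PER_FINISHED_MODEL = 1.0  # zipped results left by each finished Model


def runtime_ratio() -> float:
    """Sulphur Cycle run time relative to HadSM3, assuming equal-length Phases."""
    return TIMESTEP_COST_RATIO * SULPHUR_PHASES / HADSM3_PHASES


def peak_disk_gb(concurrent_runs: int, finished_models_kept: int = 0) -> float:
    """Peak disk need: running models plus finished results not yet off-loaded."""
    return (concurrent_runs * PEAK_GB_PER_RUN
            + finished_models_kept * GB_PER_FINISHED_MODEL)


print(f"run-time ratio: {runtime_ratio():.2f}")
print(f"two concurrent Runs: {peak_disk_gb(2):.1f} GB")
print(f"one Run plus one finished Model kept: {peak_disk_gb(1, 1):.1f} GB")
```

The ratio comes out close to the quoted 2.8, and two concurrent Runs reproduce the quoted 5.4 Gigabytes, so the numbers in the list are at least consistent with each other.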

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 18377 - Posted: 18 Dec 2005, 20:57:48 UTC

CPDN will need a couple of gigabytes, so allow 4 to be safe.
The only way BOINC knows how much space it's allowed to use is by the settings on your Account page.
The apps on this project need a VERY stable computer. The fact that it runs the other projects OK means nothing.
Overclocking, overheating, inadequate power supply, etc, can all be a problem.
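
To make the disk-space point concrete, here is a small sketch that reads the free-space figure out of the server message quoted earlier in the thread and compares it with the "4 to be safe" rule of thumb. The helper names are hypothetical, and the regular expression is an assumption based only on that one quoted line; other BOINC versions may word the message differently.

```python
import re

# Pattern derived from the single message quoted in this thread (MB only).
FREE_SPACE_RE = re.compile(r"only ([0-9.]+) MB free for BOINC")


def free_space_mb(log_line: str):
    """Return the free space in MB reported by the server message, or None."""
    match = FREE_SPACE_RE.search(log_line)
    return float(match.group(1)) if match else None


def has_enough_space(free_mb: float, required_gb: float = 4.0) -> bool:
    """Compare free space with a requirement; 4 GB is the 'to be safe' figure."""
    return free_mb >= required_gb * 1024


SERVER_MESSAGE = (
    "12/18/2005 12:54:51 PM|climateprediction.net|Message from server: "
    "Not enough disk space (only 348.4 MB free for BOINC). "
    "Review preferences for maximum disk space used."
)

free = free_space_mb(SERVER_MESSAGE)
if free is not None and not has_enough_space(free):
    print(f"{free} MB free is below the suggested 4 GB; raise the disk limit "
          "in the account preferences.")
```

The fix itself is not in code: the limit is changed in the disk-usage preferences on the Account page, as described above.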

Honza
Volunteer moderator

Joined: 5 Aug 04
Posts: 390
Credit: 2,475,242
RAC: 0
Message 18380 - Posted: 18 Dec 2005, 21:19:02 UTC

Agree with Les - CPDN is quite demanding and sensitive to overall HW stability.
On the other hand, CPDN can be VERY stable: on one of my machines, I have 33 completed models in a row (including Sulphur Cycle), with only 2 failures during the early alpha/beta stage last summer, which is not surprising.
phpBB forum for CPDN, all are invited: http://www.climateprediction.net/board
old_user94880

Joined: 27 Aug 05
Posts: 156
Credit: 112,423
RAC: 0
Message 18390 - Posted: 19 Dec 2005, 2:50:21 UTC

Just for the record, BOINC does not cause processing errors...
BOINC Wiki
old_user127136

Joined: 3 Dec 05
Posts: 3
Credit: 681,745
RAC: 0
Message 18392 - Posted: 19 Dec 2005, 2:59:21 UTC - in response to Message 18390.  

Got it, thanks. I have now changed my setting to 2GB. My machine is dedicated to CPDN, so disk space shouldn't be a problem any more. I hope this time there won't be any problems until I finish the work unit.
