climateprediction.net (CPDN) home page
Thread 'Phase number jumped back to 1'

Brandon

Joined: 17 Dec 06
Posts: 2
Credit: 44,571
RAC: 0
Message 38740 - Posted: 18 Jan 2010, 6:32:02 UTC
Last modified: 18 Jan 2010, 6:32:47 UTC

About 1800 timestamps into phase 2, I took a screen dump while viewing the graphics. Around the same time, the phase number went from 2 to 1. The work unit's year and time are still normal (Jan. 1926), but the progress is < 1%. What should I do?

http://clip2net.com/page/m0/3472350 - Latest Screendump

Edit: I also replayed some video from phase 1 around the same time.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 38741 - Posted: 18 Jan 2010, 7:01:51 UTC
Last modified: 18 Jan 2010, 7:02:48 UTC

This is a known problem when a hadam3 (multi-phase) model gets interrupted at the end of a phase, before it has reached the next (checkpoint? trickle_up?) in the following phase.
A LOT of data files are created during each phase, and it takes quite a while (10-30 minutes? It depends on processor speed and processor load) to add all of this data to a zip file for uploading.

ANY interruption to this process, and the model will rewind, as you've found out.
The "speed" figure (seconds/TimeStep) may jump to thousands of seconds, then rapidly drop back to about its original value.
The model will now begin the long process of being re-created, until it finally reaches the point where it was before. Only then will you start getting credit for the "new" trickles.
Aborting may be the best bet.

This has been mentioned time after time, both in the News and Announcement thread, and in individual posts. I posted a reply to someone recently about this. :(
Backups: Here
JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 38742 - Posted: 18 Jan 2010, 16:45:52 UTC

To help prevent losing large amounts of work again, you might want to get into the habit of making backups a few (real-world) hours before a phase change. It’s simple to do and only takes a couple of minutes. That way, if the model rewinds, you can just substitute the backup copy and you only lose a few hours of crunching.
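For anyone who would rather script the backup than copy folders by hand, here is a minimal sketch in Python. It is not an official CPDN or BOINC tool. The function name `backup_boinc_dir` and the example paths are made up for illustration, and it assumes that copying the whole BOINC data directory while BOINC is fully shut down is enough to preserve the model's state.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_dir(data_dir, backup_root):
    """Copy the whole BOINC data directory into a new timestamped folder.

    Hypothetical helper, not part of BOINC. Exit BOINC completely before
    calling it, so that no model files are being written during the copy.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"BOINC data directory not found: {src}")

    # A timestamped name keeps earlier backups instead of overwriting them.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"boinc-backup-{stamp}"

    # copytree creates dest (and missing parents) and fails if dest exists.
    shutil.copytree(src, dest)
    return dest


# Example call (both paths are placeholders; use your own locations):
# backup_boinc_dir("/path/to/BOINC/data", "/path/to/backups")
```

To restore after a rewind, the idea would be the reverse: exit BOINC, move the current data directory aside, and copy the backup folder back in its place before restarting.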

