climateprediction.net (CPDN) home page
Thread 'Iceworlds & Slowdowns hadsm3/mh - Closed - Discussion'

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31039 - Posted: 20 Oct 2007, 20:48:27 UTC


The good news:
It's running!
It's apparently gone from a countdown of 270 through zero and back to 320, including updating client_state.xml.

The last thing that I tried was changing what was being displayed. Some of the time it took a while to show the new parameter, and it was helpful to have the left-hand overlay visible, so that I could see which one was current.

The bad news:
It seems to be looping.
It's back at the same day / timestep as before.

More later.

ID: 31039
old_user91851
Joined: 8 Aug 05
Posts: 9
Credit: 46,744
RAC: 0
Message 31040 - Posted: 20 Oct 2007, 20:54:10 UTC - in response to Message 31035.  

DKR, it looks as if your model has slowed right down, so much that it hasn't trickled for days, since 16 Oct. The 1.78 timestep is a cumulative figure - it's probably much slower than that now. If this really is the case, I would abort it as it still has quite a few years left to crunch.


Ok, thanks for replying.

I've had nothing but bad results with CPDN on BOINC, which is a shame because it is a cause I really want to help. I don't feel I can take any more work from it now, though, because something always goes wrong. All that crunching for little or no scientific worth.

Of course, all the time BOINC is running it adds a 20 W overhead to my PC's power consumption. Not good. Running work for CPDN has been counterproductive in my case.
ID: 31040
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 31041 - Posted: 20 Oct 2007, 22:08:15 UTC
Last modified: 20 Oct 2007, 22:09:41 UTC

Hang on a sec - I've looked at your results for that computer and I think you're probably feeling unnecessarily disappointed because this model has turned out to be a bad apple.

Your 8 results in time order:

#1 Old-type slab, completed
#2 HadCM 160-year. It crashed with a -107 error, probably a graphics problem on your computer. In the README about avoiding crashes, item #5 by Mike advises how to avoid this sort of error. With a backup it could in any case have been rescued and completed.
#3 Your computer spent no time on it.
#4 HadCM 160-year. Did 81 years so it sent in the year 2000 restart dump and is quite likely being completed on another computer. All the data for 80 years sent to Oxford is used by the researchers. It crashed with a -161 error and could probably also have been saved by restoring a backup.
#5 Downloading error so no computer time lost.
#6 You aborted it, so no computer time lost.

Then there are your two current models, one of which you've taken as far as you could. I expect the data for the first 2 phases will be used for the research. Your other model is doing well.

With models as long as these, the probability of something going wrong momentarily on the computer and crashing the model is quite high, e.g. just forgetting to suspend it before playing a game, or forgetting to exit from BOINC before an AV scan or before turning the computer off. The way to reach the end of most models is to take regular backups. In the READMEs there's a selection of methods.

But a backup wouldn't have saved your model that's slowed down - something's wrong with the model itself.

In any case, you've done much more useful work for the project than you're giving yourself credit for.
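
As a rough illustration of the regular-backup advice, here is a minimal Python sketch that copies a BOINC data directory into a timestamped folder. The function name and the example paths are assumptions made for illustration and are not taken from the READMEs. BOINC should be fully exited before the copy so that the model files are in a consistent state.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_dir(boinc_dir, backup_root):
    """Copy boinc_dir into a new timestamped folder under backup_root.

    Hypothetical helper: exit BOINC first so no model files change mid-copy.
    """
    source = Path(boinc_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"BOINC directory not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"boinc-backup-{stamp}"
    # copytree creates the target (and any missing parents) and
    # raises an error if the target already exists
    shutil.copytree(source, target)
    return target


# Example call (both paths are placeholders, adjust to your own setup):
# backup_boinc_dir("/var/lib/boinc-client", "/mnt/backups")
```

Restoring is the reverse: with BOINC exited, replace the data directory with the most recent backup folder.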

Cpdn news
ID: 31041
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31043 - Posted: 20 Oct 2007, 23:04:00 UTC
Last modified: 20 Oct 2007, 23:06:09 UTC

DKR
The problem with your first 3 crashes is probably because you were still using BOINC version 4.45, and the Coupled Ocean models need a version 5.* to work properly.


Backups: Here
ID: 31043
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31046 - Posted: 21 Oct 2007, 4:54:54 UTC
Last modified: 21 Oct 2007, 5:00:07 UTC

Further checking showed that the model was indeed looping, from 9/6/2078 back to 7/6/2078, and from a countdown of 261 straight back to 380.

This was with an 80 year Coupled Ocean model, not a slab model, but it may well be similar for slabs.
The display was a 'blue' or 'white' world, both temperature maps.
The CPU time in the Tasks tab was increasing normally, but the interval between timesteps was around 15 minutes. VERY slow.
Also, as mentioned before, the model had slowed dramatically a couple of weeks ago, from a trickle about every 15 hours to about one every 29 hours.

The model was backed up and then suspended, pending further possible need.
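
The looping described here, where the model date jumps backwards, could in principle be spotted from a list of model dates noted down over time. The following is only a sketch under stated assumptions: the function name is made up, the dates are assumed to have been recorded by hand, and 9/6/2078 is read as day/month/year. Plain (year, month, day) tuples are used so that no real-world calendar rules are imposed on the model dates.

```python
def find_loop(observations):
    """Return (previous, current) for the first backwards jump in model date.

    observations: model dates in the order they were seen, as
    (year, month, day) tuples. Returns None if the date never goes backwards.
    """
    for previous, current in zip(observations, observations[1:]):
        if current < previous:  # tuples compare year first, then month, then day
            return previous, current
    return None


# Dates from the post: the model reached 9/6/2078 and then went back to 7/6/2078.
seen = [(2078, 6, 7), (2078, 6, 8), (2078, 6, 9), (2078, 6, 7)]
print(find_loop(seen))  # ((2078, 6, 9), (2078, 6, 7))
```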

ID: 31046
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31466 - Posted: 23 Nov 2007, 22:33:46 UTC
Last modified: 23 Nov 2007, 22:41:24 UTC

Found: One Blue Planet

WU:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6100707
Click on result 6938595 for computer 531684.

It's a slab model in phase 2. Note that phase 1 completed normally. Since I'm at 62% done, I was going to let it complete phase 2 (66%) before aborting it. I completed a previous slab model recently with no problems. Is that the right time to abort it?

Edit: One thing I forgot to mention: I started running the "Conservative" CPUSpeed governor. As a result, sometimes BOINC runs benchmarks when the CPU is running on slow speed (but before the governor increases the speed). This causes my benchmarks to be very low. Any relation here?
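
For anyone wanting to check which governor is active before BOINC runs its benchmarks, a minimal Python sketch follows. It assumes a Linux system that exposes the cpufreq sysfs interface for cpu0. The function name is hypothetical, and the sketch only reads the setting; it says nothing about whether the governor is related to the blue planet.

```python
from pathlib import Path

# Standard cpufreq sysfs location for the first CPU (assumed to be present)
GOVERNOR_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"


def read_governor(path=GOVERNOR_PATH):
    """Return the governor name for one CPU, or None if the file is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


print(read_governor())
```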
ID: 31466
old_user95198
Joined: 28 Aug 05
Posts: 1
Credit: 422,206
RAC: 0
Message 31467 - Posted: 23 Nov 2007, 22:54:13 UTC

Appears that I have one of these faulty models making no progress.
About 2600hrs in, and near 60% complete, it has not issued a trickle since 9 Nov. It used to do so about once per day.
The globe is entirely blue.
I have suspended the project.
I ran prime95 on both processors for over a day with no errors reported.
machine is 3GHz dual proc Prescott P4. no overclocking.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6584851

If I do not hear otherwise within a few days from the forum, I will abort the model run.
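
A stalled model like this one shows up as a gap since the last trickle that is far longer than usual. A minimal sketch of that comparison follows. The function name and the factor of 3 are arbitrary assumptions, and the exact time of the last trickle on 9 Nov is not known, so midnight is used.

```python
from datetime import datetime


def trickle_overdue(last_trickle, now, usual_interval_hours, factor=3.0):
    """Return True if the time since the last trickle exceeds factor * usual interval."""
    hours_since = (now - last_trickle).total_seconds() / 3600.0
    return hours_since > factor * usual_interval_hours


# Figures from the post: last trickle on 9 Nov, written on 23 Nov 2007,
# with about one trickle per day before the slowdown.
last = datetime(2007, 11, 9)           # time of day assumed
now = datetime(2007, 11, 23, 22, 54)
print(trickle_overdue(last, now, usual_interval_hours=24))  # True
```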


ID: 31467
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31468 - Posted: 23 Nov 2007, 23:14:41 UTC - in response to Message 31467.  

Your model sounds like it needs to die. I wish the models would realize that -50 C is too cold at the equator and abort itself. It will do that for negative barometric pressure, so why not temperature? There is a *slight* possibility that these models are predicting an ice age, but probably not.
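
The kind of check being wished for could look something like the sketch below. It is purely illustrative and is not how the CPDN models work: the -50 C threshold comes from the sentence above, while the function name, the data layout, and the 10-degree band around the equator are assumptions.

```python
def equator_too_cold(cells, min_temp_c=-50.0, band_deg=10.0):
    """Return True if any cell within band_deg of the equator is at or below min_temp_c.

    cells: iterable of (latitude_degrees, temperature_celsius) pairs.
    """
    return any(
        abs(lat) <= band_deg and temp_c <= min_temp_c
        for lat, temp_c in cells
    )


normal = [(0.0, 26.5), (5.0, 27.1), (-60.0, -55.0)]    # cold only far from the equator
iceworld = [(0.0, -52.0), (5.0, -48.0), (-60.0, -70.0)]
print(equator_too_cold(normal), equator_too_cold(iceworld))  # False True
```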
ID: 31468
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 31469 - Posted: 24 Nov 2007, 0:24:49 UTC - in response to Message 31468.  

[DJStarfox wrote:]... I wish the models would realize that -50 C is too cold at the equator and abort itself. ...

I'll second that! I had another one a few days ago. I have to say I've lost patience with them now: unless they're within a trickle or two of the end of a phase, they're gone. What's particularly irritating is when another model in the same work unit has passed the freeze point: e.g. here, where a Linux machine seemed to have no problem - the data looks fine. I know all the explanations about run-time libraries etc., but it just doesn't feel right. Grrrr.
ID: 31469
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31470 - Posted: 24 Nov 2007, 1:33:48 UTC


2 of the slabs on my quad uploaded phase 3 OK 20 minutes ago.
1 more in 15 minutes, and the 4th 30 minutes after that.

ID: 31470
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31471 - Posted: 24 Nov 2007, 3:30:33 UTC


All 4 slab models now uploaded OK.

ID: 31471
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31472 - Posted: 24 Nov 2007, 14:20:00 UTC - in response to Message 31466.  
Last modified: 24 Nov 2007, 14:20:54 UTC

Strange... The model got past 66% just fine, uploaded the zip file, and is working on phase 3 now. What's really weird is that the globe looks normal again! Also, the temperature graph for P2 looks normal too; the precipitation graph falls off halfway through phase 2.

I'm talking about this result.
ID: 31472
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 31475 - Posted: 24 Nov 2007, 20:52:31 UTC


If I recall correctly, the second and third phases start with the same climate, but one has a higher CO2 level than the other.

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 31475
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31476 - Posted: 24 Nov 2007, 22:35:00 UTC - in response to Message 31475.  


[MikeMarsUK wrote:] If I recall correctly, the second and third phases start with the same climate, but one has a higher CO2 level than the other.

Well, at this point, I am going to let it finish, unless someone recommends aborting it. I don't want to waste CPU cycles if it's a broken model.
ID: 31476
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 31490 - Posted: 26 Nov 2007, 8:39:37 UTC - in response to Message 31476.  
Last modified: 26 Nov 2007, 8:40:34 UTC

[MikeMarsUK wrote:] If I recall correctly, the second and third phases start with the same climate, but one has a higher CO2 level than the other.

You recall correctly, Mike.

[DJStarfox wrote:] Well, at this point, I am going to let it finish, unless someone recommends aborting it. I don't want to waste CPU cycles if it's a broken model.

Your result's phase 2 graphs show normal temperature but precipitation goes off the scale in summer 1833. If the problem is related to the model parameters I'd expect things to start going awry in summer 2058.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 31490
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31494 - Posted: 26 Nov 2007, 19:48:32 UTC


One of my 2nd batch of slabs turned blue, so I aborted it.

ID: 31494
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31536 - Posted: 29 Nov 2007, 22:19:42 UTC - in response to Message 31475.  
Last modified: 29 Nov 2007, 22:20:10 UTC

[Thyme Lawn wrote:] Your result's phase 2 graphs show normal temperature but precipitation goes off the scale in summer 1833. If the problem is related to the model parameters I'd expect things to start going awry in summer 2058.

OK, you're not going to believe this. Remember how I switched my AMD PowerNow settings from Conservative to OnDemand (cpuspeed in Linux)?

My slab model is at April 2060, but the world just looks a little warm. It's not the ice ball that it was at this time in phase 2. Weird?

The model will be done in 3 days (real time), so we'll see if it makes it to the end soon enough.
ID: 31536
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 31561 - Posted: 2 Dec 2007, 1:00:13 UTC
Last modified: 4 Jan 2009, 1:49:23 UTC

Your temperature graph for phase 2 looks normal so I wonder whether it was just a graphics problem?

This is what happened to the model's phase 2 precipitation.



If the model does complete, it will be interesting to see its phase 3 graphs.
Cpdn news
ID: 31561
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 31569 - Posted: 2 Dec 2007, 23:23:17 UTC - in response to Message 31561.  
Last modified: 2 Dec 2007, 23:24:21 UTC

[mo.v wrote:] Your temperature graph for phase 2 looks normal so I wonder whether it was just a graphics problem? This is what happened to the model's phase 2 precipitation. (snip) If the model does complete, it will be interesting to see its phase 3 graphs.

That model just reported because it finished 100%. Precipitation graph goes up but otherwise looks normal.
ID: 31569
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 31572 - Posted: 3 Dec 2007, 1:11:56 UTC

Yes, it corrected itself in Phase 3:



Well done!
Cpdn news
ID: 31572