Message boards : Number crunching : Iceworlds & Slowdowns hadsm3/mh - Closed - Discussion
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The good news: It's running! It's apparently gone from a countdown of 270 through zero and back to 320, including updating client_state.xml. The last thing that I tried was changing what was being displayed. Some of the time it took a while to show the new parameter, and it was helpful to have the left-hand overlay visible, so that I could see which one was current. The bad news: It seems to be looping. It's back at the same day / timestep as before. More later. |
Send message Joined: 8 Aug 05 Posts: 9 Credit: 46,744 RAC: 0 |
"DKR, it looks as if your model has slowed right down, so much that it hasn't trickled for days, since 16 Oct. The 1.78 timestep is a cumulative figure - it's probably much slower than that now. If this really is the case, I would abort it as it still has quite a few years left to crunch." OK, thanks for replying. I've had nothing but bad results with CPDN on BOINC, which is a shame because it is a cause I really want to help. I don't feel I can take any more work from it now, though, because something always goes wrong. All that crunching for little or no scientific worth. Of course, all the time BOINC is running there is a 20 W overhead on my PC's power consumption. Not good. Running work for CPDN has been counterproductive in my case. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hang on a sec - I've looked at your results for that computer and I think you're probably feeling unnecessarily disappointed because this model has turned out to be a bad apple. Your 8 results in time order: #1 Old-type slab, completed. #2 HadCM 160-year. It crashed with a -107 error, probably a graphics problem on your computer. In the README about avoiding crashes, item #5 by Mike advises how to avoid this sort of error. With a backup it could in any case have been rescued and completed. #3 Your computer spent no time on it. #4 HadCM 160-year. It did 81 years, so it sent in the year 2000 restart dump and is quite likely being completed on another computer. All the data for 80 years sent to Oxford is used by the researchers. It crashed with a -161 error and could probably also have been saved by restoring a backup. #5 Downloading error, so no computer time lost. #6 You aborted it, so no computer time lost. Then there are your two current models, one of which you've taken as far as you could. I expect the data for the first 2 phases will be used for the research. Your other model is doing well. With models as long as these, the probability of something going wrong momentarily on the computer and crashing the model is quite high, e.g. just forgetting to suspend it before playing a game, or forgetting to exit from BOINC before an AV scan or turning the computer off. The way to reach the end of most models is to take regular backups. In the READMEs there's a selection of methods. But a backup wouldn't have saved your model that's slowed down - something's wrong with the model itself. In any case, you've done much more useful work for the project than you're giving yourself credit for. Cpdn news |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
DKR, the problem with your first 3 crashes is probably that you were still using BOINC version 4.45, and the Coupled Ocean models need a version 5.* to work properly. Backups: Here |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Further checking showed that the model was indeed looping, from 9/6/2078 back to 7/6/2078, and from a countdown of 261 straight back to 380. This was with an 80 year Coupled Ocean model, not a slab model, but it may well be similar for slabs. The display was a 'blue' or 'white' world, both temperature maps. The CPU time in the Tasks tab was increasing normally, but the interval between timesteps was around 15 minutes. VERY slow. Also, as mentioned before, the model had slowed dramatically a couple of weeks ago, from a trickle about every 15 hours to about one every 29 hours. The model was backed up and then suspended, pending possible further need. |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Found: One Blue Planet WU: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6100707 Click on result 6938595 for computer 531684. It's a slab model in phase 2. Note that phase 1 completed normally. Since I'm 62% done, I was going to let it complete phase 2 (66%) before aborting it. I completed a previous slab model recently with no problems. Is that the right time to abort it? Edit: One thing I forgot to mention: I started running the "Conservative" CPUSpeed governor. As a result, sometimes BOINC runs benchmarks while the CPU is still running at slow speed (before the governor increases the speed). This causes my benchmarks to be very low. Any relation here? |
Send message Joined: 28 Aug 05 Posts: 1 Credit: 422,206 RAC: 0 |
It appears that I have one of these faulty models making no progress. About 2600 hrs in, and near 60% complete, it has not issued a trickle since 9 Nov. It used to do so about once per day. The globe is entirely blue. I have suspended the project. I ran prime95 on both processors for over a day with no errors reported. The machine is a 3 GHz dual-processor Prescott P4, no overclocking. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6584851 If I do not hear otherwise from the forum within a few days, I will abort the model run. |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Your model sounds like it needs to die. I wish the models would realize that -50 C is too cold at the equator and abort themselves. They will do that for negative barometric pressure, so why not temperature? There is a *slight* possibility that these models are predicting an ice age, but probably not. |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
[DJStarfox wrote:]... I wish the models would realize that -50 C is too cold at the equator and abort themselves. ... I'll second that! I had another one a few days ago. I have to say I've lost patience with them now: unless they're within a trickle or two of the end of phase, they're gone. What's particularly irritating is when another model in the same work unit has passed the freeze point: e.g. here, where a Linux machine seemed to have no problem - the data looks fine. I know all the explanations about run-time libraries etc., but it just doesn't feel right. Grrrr. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
2 of the slabs on my quad uploaded phase 3 OK 20 minutes ago. 1 more in 15 minutes, and the 4th 30 minutes after that. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
All 4 slab models now uploaded OK. |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Strange... The model got past 66% just fine, uploaded the zip file, and is working on phase 3 now. What's really weird is that the globe looks normal again! Also, the temperature graph for P2 looks normal too; the precipitation graph falls off halfway through phase 2. I'm talking about this result. |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
If I recall correctly, the second and third phases start with the same climate, but one has a higher CO2 level than the other. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Well, at this point, I am going to let it finish, unless someone recommends aborting it. I don't want to waste CPU cycles if it's a broken model. |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
"If I recall correctly, the second and third phases start with the same climate, but one has a higher CO2 level than the other." You recall correctly, Mike. "Well, at this point, I am going to let it finish, unless someone recommends aborting it. I don't want to waste CPU cycles if it's a broken model." Your result's phase 2 graphs show normal temperature but precipitation goes off the scale in summer 1833. If the problem is related to the model parameters I'd expect things to start going awry in summer 2058. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
One of my 2nd batch of slabs turned blue, so I aborted it. |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"Your result's phase 2 graphs show normal temperature but precipitation goes off the scale in summer 1833. If the problem is related to the model parameters I'd expect things to start going awry in summer 2058." OK, you're not going to believe this. Remember how I switched my AMD PowerNow settings from Conservative to OnDemand (cpuspeed in Linux)? My slab model is at April 2060, but the world just looks a little warm. It's not the ice ball that it was at this time in phase 2. Weird? The model will be done in 3 days (real time), so we'll see soon enough if it makes it to the end. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Your temperature graph for phase 2 looks normal so I wonder whether it was just a graphics problem? This is what happened to the model's phase 2 precipitation. If the model does complete, it will be interesting to see its phase 3 graphs. Cpdn news |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"Your temperature graph for phase 2 looks normal so I wonder whether it was just a graphics problem?" That model just reported because it finished (100%). The precipitation graph goes up but otherwise looks normal. |
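
Two symptoms come up repeatedly in this thread: a model whose date jumps backwards (the 9/6/2078 to 7/6/2078 loop) and an "iceworld" where the equator is far too cold (the -50 C figure). The following is only a rough sketch of how a volunteer might check for both from values noted down by hand. It is not part of BOINC or the CPDN model; the function names, the `(year, month, day)` tuples, the day/month reading of the dates, and the -50 C default threshold are all assumptions made for illustration.

```python
# Hypothetical helper functions; nothing here is a BOINC or CPDN API.

def find_loop(model_dates):
    """Return the index of the first sample whose model date goes backwards.

    model_dates is a list of (year, month, day) tuples in the order they were
    observed. Tuples are used instead of datetime.date because climate models
    may not follow the real-world calendar. Returns None if the date never
    goes backwards.
    """
    for i in range(1, len(model_dates)):
        if model_dates[i] < model_dates[i - 1]:
            return i
    return None


def looks_like_iceworld(equatorial_temps_c, threshold_c=-50.0):
    """Return True if the mean of the given equatorial temperatures (in C)
    is at or below threshold_c. An empty list is treated as 'no evidence'."""
    if not equatorial_temps_c:
        return False
    mean_temp = sum(equatorial_temps_c) / len(equatorial_temps_c)
    return mean_temp <= threshold_c


if __name__ == "__main__":
    # Dates read as day/month/year: 7 June, 8 June, 9 June, then back to 7 June.
    observed = [(2078, 6, 7), (2078, 6, 8), (2078, 6, 9), (2078, 6, 7)]
    print("loop detected at sample:", find_loop(observed))
    print("iceworld?", looks_like_iceworld([-52.0, -49.0, -51.0]))
```

Neither check would tell you why a model went wrong. Each only flags that the run is worth a closer look before more CPU time goes into it.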
©2024 cpdn.org