climateprediction.net (CPDN) home page
Thread 'Iceworlds & Slowdowns hadsm3/mh - Closed - Discussion'

old_user511233

Joined: 7 Apr 08
Posts: 4
Credit: 28,086
RAC: 0
Message 33281 - Posted: 10 Apr 2008, 20:41:13 UTC - in response to Message 33280.  

> I'm no expert, Richard, but it looks as if you've downloaded two models instead of one - that would probably slow things up quite a lot? http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=854815

Using Process Explorer I see only one task getting 100% CPU time.
ID: 33281

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 33282 - Posted: 10 Apr 2008, 20:43:08 UTC
Last modified: 10 Apr 2008, 20:44:42 UTC

Richard,

There are currently three types of model on offer: HADSM3 ('slab', 45 years), HADCM3 ('coupled', 160 years), HADAM3 (regional, 1 year). A slab model will take about three weeks or so to complete, the HADAM3 model slightly less, and the coupled model much longer (three months or more, depending on machine and hours running).

The model type is selectable from your account, which you can get to by clicking on the 'Your account' menu item to the left of this page.

Your PC has downloaded two models so far:

1. hadcm3istd_0jgm_1920_160_05940232_1, which is a HADCM3 coupled model.

2. hadam3h_n_175s1_005d_005d_0_1, which is a HADAM3 regional model.
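Both task names above start with the model family name. A minimal sketch of reading the model type off a task name, assuming that prefix convention holds more generally (the thread only shows it for these two names); the function name and lookup table are illustrative and not part of BOINC or CPDN:

```python
# Hypothetical lookup table: prefixes and type labels are taken from the
# three model types and two task names mentioned in this thread.
MODEL_TYPES = {
    "hadsm3": "slab",
    "hadcm3": "coupled",
    "hadam3": "regional",
}


def model_type(task_name: str) -> str:
    """Return the model type suggested by a task name's prefix."""
    name = task_name.lower()
    for prefix, kind in MODEL_TYPES.items():
        if name.startswith(prefix):
            return kind
    return "unknown"


for task in ("hadcm3istd_0jgm_1920_160_05940232_1",
             "hadam3h_n_175s1_005d_005d_0_1"):
    print(task, "->", model_type(task))
```

Anything that does not match a known prefix is reported as "unknown" rather than guessed.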

If the PC isn't going to be around very long, then you might as well abort the HADCM3 model and leave the HADAM3 running (if it's still there).

Then, if you change your preferences to exclude further HADCM3 models, you could easily run a mix of slabs and regional models until the computer is no longer available.

If you have any further questions then just ask - someone will answer eventually.

Iain
ID: 33282

old_user511233
Joined: 7 Apr 08
Posts: 4
Credit: 28,086
RAC: 0
Message 33283 - Posted: 10 Apr 2008, 20:46:11 UTC - in response to Message 33282.  

> Richard,
>
> There are currently three types of model on offer: HADSM3 ('slab', 45 years), HADCM3 ('coupled', 160 years), HADAM3 (regional, 1 year). A slab model will take about three weeks or so to complete, the HADAM3 model slightly less, and the coupled model much longer (three months or more, depending on machine and hours running).
>
> The model type is selectable from your account, which you can get to by clicking on the 'Your account' menu item to the left of this page.
>
> Your PC has downloaded two models so far:
>
> 1. hadcm3istd_0jgm_1920_160_05940232_1, which is a HADCM3 coupled model.
>
> 2. hadam3h_n_175s1_005d_005d_0_1, which is a HADAM3 regional model.
>
> If the PC isn't going to be around very long, then you might as well abort the HADCM3 model and leave the HADAM3 running (if it's still there).
>
> Then, if you change your preferences to exclude further HADCM3 models, you could easily run a mix of slabs and regional models until the computer is no longer available.
>
> If you have any further questions then just ask - someone will answer eventually.
>
> Iain

Thanks for all the quick responses, guys.

I checked the BOINC Manager task list. It shows only the coupled model and nothing else. Where is the regional model hiding? Sorry that I don't know much about this process and the times for the different models.
ID: 33283

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 33284 - Posted: 10 Apr 2008, 20:49:17 UTC - in response to Message 33283.  

> I checked the BOINC Manager task list. It shows only the coupled model and nothing else. Where is the regional model hiding? Sorry that I don't know much about this process and the times for the different models.

If it's not showing in BOINC Manager then it may have crashed, but hasn't yet reported on the Web site.

I would:

a) select the preferred model type in the preferences (start off with HADSM3, slab)

b) abort the HADCM3

c) press the 'update' button in BOINC Manager.
ID: 33284

old_user511233
Joined: 7 Apr 08
Posts: 4
Credit: 28,086
RAC: 0
Message 33285 - Posted: 10 Apr 2008, 20:51:19 UTC - in response to Message 33284.  

>> I checked the BOINC Manager task list. It shows only the coupled model and nothing else. Where is the regional model hiding? Sorry that I don't know much about this process and the times for the different models.
>
> If it's not showing in BOINC Manager then it may have crashed, but hasn't yet reported on the Web site.
>
> I would:
>
> a) select the preferred model type in the preferences (start off with HADSM3, slab)
>
> b) abort the HADCM3
>
> c) press the 'update' button in BOINC Manager.

Will do.
Thanks all!
ID: 33285

Strathpeffer
Joined: 9 Jan 07
Posts: 497
Credit: 342,899
RAC: 0
Message 33286 - Posted: 10 Apr 2008, 20:53:27 UTC

From the link I posted earlier, it looks like he's downloaded yet another model - Richard, you also need to hit the "No new tasks" button.
Visit the Scotland team
ID: 33286

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 33287 - Posted: 10 Apr 2008, 20:54:35 UTC - in response to Message 33285.  
Last modified: 10 Apr 2008, 20:59:20 UTC

> Will do.
> Thanks all!

Looks like you got another coupled model. You have to change the preferences before aborting any running models or pressing update, otherwise you'll get a lucky-dip model!

Best of luck.

[And Strathpeffer's right: pressing the 'no new tasks' button gives you better control over what comes down the line; if you do that, then the button changes to 'allow new tasks' so that a press then gets you a new model if you need one. It all makes sense in the end ...]
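For anyone who prefers the command line to the BOINC Manager buttons, the same order of operations can be sketched with the `boinccmd` tool that ships with the BOINC client. This is only a sketch under assumptions: the project URL below is a guess at how the project is registered in your client, the task name is the HADCM3 model named earlier in the thread, and the helper function is hypothetical. It only builds and prints the commands; it does not run them. The model-type preference still has to be changed on the web site first.

```python
# Assumption: the project URL as registered in your BOINC client; check the
# Projects tab in BOINC Manager if yours differs.
PROJECT_URL = "http://climateprediction.net/"


def abort_sequence(project_url: str, task_name: str) -> list[list[str]]:
    """Build boinccmd invocations in the order suggested in the thread.

    Change the model-type preference on the web site BEFORE running these.
    """
    return [
        # 1. Equivalent of the 'no new tasks' button.
        ["boinccmd", "--project", project_url, "nomorework"],
        # 2. Abort the unwanted model.
        ["boinccmd", "--task", project_url, task_name, "abort"],
        # 3. Equivalent of the 'update' button: contact the project server.
        ["boinccmd", "--project", project_url, "update"],
        # 4. Equivalent of 'allow new tasks', once a new model is wanted.
        ["boinccmd", "--project", project_url, "allowmorework"],
    ]


for cmd in abort_sequence(PROJECT_URL, "hadcm3istd_0jgm_1920_160_05940232_1"):
    print(" ".join(cmd))
```

Putting 'no new tasks' first is what stops the update from pulling down a lucky-dip model before the new preferences have taken effect.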
ID: 33287

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 33289 - Posted: 11 Apr 2008, 15:56:09 UTC


Richard

You've said that you can only see one model in Task Manager.
This may just be because you only have a single processor selected in your preferences.

So, if you have 2 models running, they will alternate. The only place where you can see how many models you have is in the Tasks tab of the BOINC Manager.


Backups: Here
ID: 33289

old_user280873
Joined: 18 Feb 06
Posts: 17
Credit: 1,769,142
RAC: 0
Message 34102 - Posted: 19 Jun 2008, 8:26:07 UTC

I have problems with iceworlds in HADAM3 5.03 models.
I aborted 3 models after a few hours of running.
The fourth stopped with a calculation error.
The task ID of the last model is: 7928832
The models were not constantly blue. s/TS varied between 17 and 1100.
The processor is Intel duo core 2.4 GHz; no overclocking.
On the other core a HadSM3 model runs smoothly at 1.44 s/TS; completion is 29% there.

Is the problem caused by my computer or are the models the cause?

I am not allowing new climateprediction models for the time being.

Advice appreciated,

Leendert from The Netherlands.
ID: 34102

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34104 - Posted: 19 Jun 2008, 11:18:54 UTC

Leendert,

Only one of the HADAM3 models gives a useful error message (i.e. 7928832). That message suggests a memory allocation problem. The computer has lots of memory, so perhaps something is preventing the HADAM3 from getting the memory it needs: for example, the virtual memory may be limited. Is the disk full?

Iain
ID: 34104

old_user280873
Joined: 18 Feb 06
Posts: 17
Credit: 1,769,142
RAC: 0
Message 34106 - Posted: 19 Jun 2008, 12:26:51 UTC - in response to Message 34104.  

Thanks Iain,
There is 113 GB free on the disk, which seems enough to me.
Vista recommends 3000 MB of virtual memory; it was 2000 MB in auto mode. I changed to manual, enlarged it to 3000 MB, and will wait for new results tomorrow, as the server message said: reached daily quota.

Maybe the problem has to do with the other 'activities' on my PC which use quite some memory: real-time stock market analysis programs, Dreamweaver and Photoshop. However, these programs and BOINC have run together on my PC for years.

Leendert.
ID: 34106

old_user22557
Joined: 3 Oct 04
Posts: 2
Credit: 267,656
RAC: 0
Message 34386 - Posted: 24 Jul 2008, 18:18:09 UTC

Hi I got my first iceworld it seems.

1. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7533516
2. Timestep 81433 - 259248
3. s/TS value 0.98
4. Blue ice world when viewing the globe / temperature (CTRL+T) mode.
5. Intel Core 2 Duo E8500 (3.16 GHz, running at 3.5 GHz)

At around 76%, give or take, it began to run extremely slowly. Up to a certain point, this model and another model downloaded at the same time ran at almost the same speed, but then this one slowed down a lot, while the other one finished this morning.

More info needed? Should I kill the slow model or not? I have not tried lowering the overclock, but there have been absolutely no stability problems, and it is only a very light overclock.
ID: 34386

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34389 - Posted: 24 Jul 2008, 20:34:57 UTC

I've moved Brave Daun's post to this thread because I don't think his problem is about iceworlds.

I'll leave TheWiz's problem to someone who also knows about stability and overclocking, just in case.
Cpdn news
ID: 34389

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 34391 - Posted: 25 Jul 2008, 3:26:58 UTC - in response to Message 34386.  

> Hi I got my first iceworld it seems.
>
> 1. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7533516
> 2. Timestep 81433 - 259248
> 3. s/TS value 0.98
> 4. Blue ice world when viewing the globe / temperature (CTRL+T) mode.
> 5. Intel Core 2 Duo E8500 (3.16 GHz, running at 3.5 GHz)
>
> At around 76%, give or take, it began to run extremely slowly. Up to a certain point, this model and another model downloaded at the same time ran at almost the same speed, but then this one slowed down a lot, while the other one finished this morning.
>
> More info needed? Should I kill the slow model or not? I have not tried lowering the overclock, but there have been absolutely no stability problems, and it is only a very light overclock.

Sure sounds like an iceworld. Sometimes when problems occur, the model will rewind a day/month/year before giving up (and this will increase the s/TS), but with the speed of your computer, it would have already gone through the year rewind, so that can't be the reason for the slowing of sec/timestep. Blue globe in this case = iceworld.

Sometimes these are just due to parameters of the model. Other times it's the computer. If you get quite a few iceworlds as you run along, it may be worth trying to decrease the overclock.
ID: 34391

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34393 - Posted: 25 Jul 2008, 10:54:25 UTC

This is the workunit the problem model belongs to:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6168545

Three other computers are crunching the same model, all much less advanced than TheWiz. It would be a good idea to look at this workunit again in two or three weeks to see whether any of the other computers pass the trickle point where TheWiz's model developed the problem. If the other models develop the same problem it will be a defective model. If the other models trickle normally past that point, TheWiz will need to investigate his computer's stability.
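The rule of thumb above can be written down as a small decision function. This is a sketch only: the function name, its inputs and the "inconclusive" outcome for mixed or missing evidence are assumptions added for illustration, and the counts have to be read by hand off the workunit page.

```python
def diagnose(others_stuck: int, others_passed: int) -> str:
    """Classify a slowdown from how other hosts on the same workunit fared.

    others_stuck:  hosts that developed the same problem at the same point
    others_passed: hosts that trickled normally past that point
    """
    if others_stuck and not others_passed:
        return "defective model"
    if others_passed and not others_stuck:
        return "check this computer's stability"
    # Mixed evidence, or no other host has reached the point yet: the rule
    # of thumb in the post does not cover this, so report it as undecided.
    return "inconclusive, check the workunit again later"


# Today's situation: the three other hosts are all less advanced.
print(diagnose(others_stuck=0, others_passed=0))
```

With no other host at the problem point yet, the function returns the undecided outcome, which matches the advice to look again in two or three weeks.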
Cpdn news
ID: 34393

old_user22557
Joined: 3 Oct 04
Posts: 2
Credit: 267,656
RAC: 0
Message 34394 - Posted: 25 Jul 2008, 15:34:09 UTC - in response to Message 34393.  

Hi and thanks for the replies.

Does this mean I should wait and see how things go with the other computers in 2 - 3 weeks, or should I abort it now?

Also, if the problem is due to overclocking, does decreasing or removing the overclock fix things for the problematic model?
ID: 34394

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34395 - Posted: 25 Jul 2008, 16:11:59 UTC

Hi again

You've already seen the abnormal graphics and slowdown for this model. The abnormal monochrome graphics indicate abnormal processing. This model will have tried to recover but if your graphics are still abnormal, it can't. If you let it continue we know that it will not produce good data for the scientists. So you'll have to abort it.

If you have a backup of the complete contents of your BOINC folder from before this model became abnormal you could restore it, reduce the overclock or return the computer to stock speed, then see whether the model processes normally. If it becomes abnormal again at the same model date this will indicate a defective model (initial parameter values that don't work successfully in combination). But if it continues and processes normally past the problem date, this would indicate that your computer's stability is the problem.

If you have no backup, just abort the model now but check your future models regularly for possible abnormalities. And in a week or two we should check the other models in this workunit again because what happens to them may help you diagnose whether you have a stability problem or the model parameter values were unviable.

You could, of course, run the stability tests now if you prefer. In the README collection about running the model (link to the READMEs in my signature) there's a post by UKNick about hardware testing. But you'd still have to abort this bad model, sorry. I hope your next model is a good one. Most are good.
Cpdn news
ID: 34395

adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 34442 - Posted: 31 Jul 2008, 15:48:34 UTC

Back to back iceballs 7537005 and 7553089. Blue globe, slow down, all the usual.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 34442

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34443 - Posted: 31 Jul 2008, 17:35:33 UTC

Thanks for the links, Adrian. I can see you've diagnosed them and then aborted them without wasting more time. I think another cruncher's computer is currently stuck at the same point as one of them, and there are several other members I will also send a private message to.

I'm particularly interested in this model which has got past the point where your second model became an iceball. It's also on a C2Q but the French cruncher has Linux. Just look at the speed of that model. I'm wondering whether it's so much faster because of the Linux or because the computer may be O/C'd, or a combination of both. I would like a few more opinions about this model please!

Adrian, could I just ask you a couple of questions please. I'll wait for your answers before I send any PMs.

* Both models are in fact on the same quad?

* Is this computer running at stock speed or overclocked?

Thanks for reporting these models.
Cpdn news
ID: 34443

old_user1132
Joined: 25 Aug 04
Posts: 28
Credit: 6,522,252
RAC: 0
Message 34456 - Posted: 1 Aug 2008, 11:30:45 UTC - in response to Message 34443.  

> Just look at the speed of that model. I'm wondering whether it's so much faster because of the Linux or because the computer may be O/C'd, or a combination of both. I would like a few more opinions about this model please!

That speed looks feasible for a model on a Q6600 overclocked to ca. 3.4 GHz with a faster-than-average parameter set.

Andrew
ID: 34456