climateprediction.net (CPDN) home page
Thread 'Iceworld (HadSM and HadSM MH) discussion'

Message boards : Number crunching : Iceworld (HadSM and HadSM MH) discussion

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39383 - Posted: 27 Mar 2010, 0:02:15 UTC

As the previous discussion was very long, I have started this new thread. Please report 'iceworlds' and ask for advice here.

The iceworld phenomenon is described by Geophi here.
ID: 39383
Overtonesinger
Joined: 30 Dec 05
Posts: 5
Credit: 986,440
RAC: 0
Message 39435 - Posted: 1 Apr 2010, 7:59:14 UTC

Yes, I have the symptom of slooooooowness on my one current workunit:
It slowed down at about 90.650% completed. Now I think it will take about 110 days of CPU time on a 2.2 GHz dual-core Pentium to complete. :)

What should I do with it? Shall I let it run this slowly? Is it still worth something to the project, or is it a bug? :O
There is already ONE successful computation for this W.U. ... :o)
Is it good if there are two, to confirm the result?


Application: UK Met Office HadSM3 Slab Model 6.07


http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10122920


with greetings
Overtonesinger
ID: 39435
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,900,756
RAC: 2,130
Message 39439 - Posted: 1 Apr 2010, 11:18:00 UTC
Last modified: 1 Apr 2010, 11:19:10 UTC

Overtonesinger,

Welcome to the CPDN BOINC message board!

The model that completed from this work unit was running on a Mac - an 'iceworld' on one type of computer is not generally an iceworld on a different type.

My advice for iceworlds is that unless the iceworld is very near the end of phase 2 or phase 3 then it is not worth continuing. The computer's time would be better spent processing another model. Having said that, some people do finish their iceworlds and the model will complete as normal, uploading phase-end Zip files - it will just take an enormous amount of time.

Iain
ID: 39439
Matthias Lehmkuhl
Joined: 24 Sep 05
Posts: 7
Credit: 3,546,691
RAC: 2,831
Message 39442 - Posted: 1 Apr 2010, 18:54:01 UTC

Looks like I got my second iceworld.
hadsm3mh_kwei_006490926_5
resultid=10546498

The last trickle was on 11 Mar 2010 23:06:52.
It shows a blue temperature world and seems to be running very slowly.

Phase 4
Timestep 162,030
CPU Time (sec) 1,161,233

The timesteps are still increasing, but very slowly.
I don't have a backup from the state before it turned into an iceworld.

If it's useful for you, I will let it run. It's the last phase of the run for this result (4 of 4).
Matthias
ID: 39442
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39444 - Posted: 1 Apr 2010, 19:29:11 UTC

Hi Matthias

Here's its workunit. Two computers have completed a task from this WU, but one has a different processor (AMD) and the other has a different OS (Darwin). Another computer is stuck at exactly the same point as yours; it has the same combination of processor + OS.

The data from the first three phases is good, but if you complete Phase 4, data will be incomplete on one graph or both.

Nobody should restore a backup to save an iceworld unless they think their computer was unstable and they've fixed the problem, or they want to send a graphics recording of the iceworld moment to Iain Inglis. The iceworld will develop again at the same processing moment.

Please abort the task. Thank you for the report.
ID: 39444
MartinNZ
Joined: 22 Mar 06
Posts: 144
Credit: 24,695,428
RAC: 0
Message 39468 - Posted: 4 Apr 2010, 4:54:33 UTC
Last modified: 4 Apr 2010, 4:55:50 UTC

Hi,

Looks as though I've another iceworld.


1. Task ID 11367543
2. Current timestep: 118695 of 259248
3. The s/TS value: 1.59
4. Whether the temperature display of the globe graphic is blue: Yes
5. What your processor/CPU and Operating System: Intel i7 920, Win7 64bit
6. Whether you are overclocking: No


Last trickle was 31 March and time to completion is increasing.

I can try to restore the task from the backup as per Les's instructions, and get a graphics recording to Iain if you wish. Although I had problems with this before, intrinsically there is no reason it should not work, especially as I've completely restored the OS hard drive on this PC with no issues.

Currently 6 tasks are running, and I could let all of those complete, but that would be another 246 hrs.

Let me know what you wish me to do with it.

(BTW, my Famous is going famously!)

Regards, Martin
ID: 39468
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,900,756
RAC: 2,130
Message 39470 - Posted: 4 Apr 2010, 10:28:25 UTC - in response to Message 39468.  

Let me know what you wish me to do with it.

I'm still collecting them, so please do if it's not too complicated. Be aware that if you restore a backup and connect it to the Internet then the CPDN BOINC Web site will mark the running models as 'client detached'. This is purely cosmetic - the server will happily accept any trickles and Zip files (including repeats).

If your backup is quite close to the freeze point, then you could do the following:

1. Make a backup of where you are now (call this CURRENT).

2. Restore the pre-freeze backup and turn network activity off immediately. (I suspend all tasks and turn network activity off before backing up to make things simple.)

3. Run only the slab model that will freeze (with recording on).

4. Collect a few '.cpdn' files around the freeze.

5. Restore your CURRENT backup and carry on.

That way you'll lose the minimum time and the Web site won't notice that a backup has been restored.
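
Steps 1, 2 and 5 amount to copying the BOINC data directory aside and putting it back later. A minimal Python sketch of that idea follows. The function names are hypothetical, and it assumes BOINC has been shut down completely before any copying so that no files change mid-copy. It does not touch BOINC's network setting, which still has to be switched off by hand.

```python
import shutil
from pathlib import Path


def snapshot(data_dir: Path, backup_root: Path, label: str) -> Path:
    """Copy data_dir to backup_root/label and return the path of the copy."""
    target = backup_root / label
    if target.exists():
        # Refuse to overwrite an existing backup with the same label.
        raise FileExistsError(f"backup already exists: {target}")
    shutil.copytree(data_dir, target)
    return target


def restore(backup: Path, data_dir: Path) -> None:
    """Replace data_dir with the contents of backup."""
    if data_dir.exists():
        shutil.rmtree(data_dir)
    shutil.copytree(backup, data_dir)
```

Usage would look like `current = snapshot(data_dir, backup_root, "CURRENT")` for step 1, `restore(pre_freeze_backup, data_dir)` for step 2 and `restore(current, data_dir)` for step 5, where `data_dir` is wherever the BOINC data folder lives on the machine in question.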

Iain
ID: 39470
JoeyJoJo
Joined: 31 May 09
Posts: 2
Credit: 896,678
RAC: 0
Message 39538 - Posted: 12 Apr 2010, 22:05:07 UTC - in response to Message 39470.  

I've been running a model like this for some time. I thought it was normal. I haven't got any backups, unfortunately.

Here are the results:
Timestep 176893 of 259248
6.48 s/TS
All blue.
Intel E4400, Windows Vista Home Premium x64
Not overclocked

Now aborted and running something else. Good luck
ID: 39538
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39539 - Posted: 12 Apr 2010, 22:32:12 UTC

Thank you for reporting this. I see that another member in the same workunit has also aborted the model.

But a third member is still running his model, which has also slowed down. This is bad news because it's the unfortunate member's first model. He probably doesn't know whether the interminable slowness and blue graphics are normal or not.

I shall see whether Milo can now send an 'iceworld email' to this member.
ID: 39539
Overtonesinger
Joined: 30 Dec 05
Posts: 5
Credit: 986,440
RAC: 0
Message 39549 - Posted: 14 Apr 2010, 8:48:13 UTC

Hello,

My glacius (Ice world) is as follows:

* link: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10122920

* A current timestep of that model: 194912 / 259248
* The s/TS value: 3.07 (but this is an average! It was 2.1 before the slowdown at 91%;
it actually now takes about 48 seconds per TS!)
* Temperature display of the globe graphic is blue: YES
* CPU Intel E2200 (Pentium Dual CPU) at 2.20 GHz, Windows XP SP3 (32-bit).
* Overclocking: NO (I cannot... it is my work PC at my workplace)
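
For a rough idea of what those figures mean in CPU time, here is a small Python sketch. It assumes the per-timestep cost stays at the quoted 48 seconds and that the 259248 shown is the end point of the current phase; both are assumptions, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def remaining_days(current_ts: int, total_ts: int, seconds_per_ts: float) -> float:
    """Estimate CPU days left if every remaining timestep costs seconds_per_ts."""
    remaining_steps = total_ts - current_ts
    return remaining_steps * seconds_per_ts / SECONDS_PER_DAY


# Figures taken from the report above: timestep 194912 of 259248 at ~48 s/TS.
print(f"{remaining_days(194912, 259248, 48.0):.1f} days")  # prints "35.7 days"
```

If the time per timestep keeps growing, the real figure would be higher than this.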

ID: 39549
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39587 - Posted: 19 Apr 2010, 3:01:32 UTC

Hi Overtonesinger

You should abort your iceworld please. When this happens the model tries to produce more good data but cannot.

(Your website is interesting!)
ID: 39587
old_user5681
Joined: 31 Aug 04
Posts: 42
Credit: 547,031
RAC: 0
Message 39593 - Posted: 19 Apr 2010, 18:08:43 UTC
Last modified: 19 Apr 2010, 18:22:39 UTC

Hi
One of my models has turned into an iceworld. The speed of T/S has actually increased by around one second:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11268403

I'm quite happy to let it run if it's going to be of use.

FWIW, I snoozed then exited BOINC to uninstall Avast antivirus and install Microsoft Security Essentials. It was after the restart that I noticed the change.

Is it worth keeping?
ID: 39593
JIM
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 39594 - Posted: 19 Apr 2010, 19:15:15 UTC - in response to Message 39593.  
Last modified: 19 Apr 2010, 19:20:19 UTC

Hi
One of my models has turned into an iceworld. The speed of T/S has actually increased by around one second:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11268403

I'm quite happy to let it run if it's going to be of use.

FWIW, I snoozed then exited BOINC to uninstall Avast antivirus and install Microsoft Security Essentials. It was after the restart that I noticed the change.

Is it worth keeping?


You don’t say where you are in the model. Unless it is very near the end it will take months to finish.

The 1 second increase in the s/TS is only the beginning. Because it is a long term average it will continue to increase, taking a long time to reflect the real time each timestep is taking.
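
To illustrate why a long-term average lags, here is a small Python sketch. It assumes the displayed s/TS is simply total CPU seconds divided by timesteps completed (an assumption, not something confirmed in this thread), and the numbers are made up.

```python
def cumulative_s_per_ts(fast_steps: int, fast_s: float,
                        slow_steps: int, slow_s: float) -> float:
    """Average seconds per timestep over the whole run so far."""
    total_seconds = fast_steps * fast_s + slow_steps * slow_s
    return total_seconds / (fast_steps + slow_steps)


# Hypothetical run: 100,000 timesteps at 1 s each, then a slowdown to 48 s each.
for slow_steps in (0, 1_000, 10_000):
    avg = cumulative_s_per_ts(100_000, 1.0, slow_steps, 48.0)
    print(f"{slow_steps:>6} slow steps: average {avg:.2f} s/TS")
```

With these made-up numbers the average reads about 1.47 s/TS after 1,000 slow timesteps and about 5.27 after 10,000, while each new timestep actually costs 48 s.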
ID: 39594
old_user5681
Joined: 31 Aug 04
Posts: 42
Credit: 547,031
RAC: 0
Message 39595 - Posted: 19 Apr 2010, 19:30:24 UTC
Last modified: 19 Apr 2010, 19:31:10 UTC

Hi Jim,

Sorry, I meant an increase in the speed as shown in my link, from 0.97 to around 0.87 T/s. The computation is speeding up. But yes, I suppose you could call it a decrease as well... you can tell I'm no scientist :)
As shown by the trickles, this started at the beginning of phase two, or to be precise, a minute before the second trickle was due to be uploaded.
ID: 39595
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,900,756
RAC: 2,130
Message 39596 - Posted: 19 Apr 2010, 20:07:32 UTC

Martin,

An iceworld on a Windows/AMD machine such as yours will speed up (as do Linux/AMD and Mac) - only Windows/Intel slows down.

Your model will recover at the end of phase 2 and carry on as normal. You might as well finish it ...
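
The platform rule above can be written down as a small lookup. This Python sketch only encodes the combinations named in this post; the function name is hypothetical, and anything not mentioned (Linux/Intel, for example) is reported as not stated rather than guessed.

```python
def iceworld_speed_change(os_name: str, cpu_vendor: str) -> str:
    """Return how an iceworld's speed changes on a platform, per the post above."""
    os_key = os_name.strip().lower()
    cpu_key = cpu_vendor.strip().lower()
    if os_key == "mac":
        # The post lists Mac without naming a processor.
        return "speeds up"
    known = {
        ("windows", "intel"): "slows down",
        ("windows", "amd"): "speeds up",
        ("linux", "amd"): "speeds up",
    }
    return known.get((os_key, cpu_key), "not stated")


print(iceworld_speed_change("Windows", "Intel"))
print(iceworld_speed_change("Linux", "AMD"))
```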

Iain
ID: 39596
old_user5681
Joined: 31 Aug 04
Posts: 42
Credit: 547,031
RAC: 0
Message 39597 - Posted: 19 Apr 2010, 20:15:42 UTC - in response to Message 39596.  




Thanks, I'll let it run its course.
ID: 39597
old_user5480
Joined: 31 Aug 04
Posts: 3
Credit: 318,314
RAC: 0
Message 40882 - Posted: 19 Oct 2010, 13:20:39 UTC

Hi there, it seems I've got one also. Ice world model seems to have frozen (no progress):

1. Task ID 11225288 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11225288
2. Current timestep: 172208 of 259248 - Phase 3 of 3 Progress 88.81%
3. The s/TS value: 2.58
4. Whether the temperature display of the globe graphic is blue: Yes
5. What your processor/CPU and Operating System: Intel i5 , Win XP 32bit
6. Whether you are overclocking: No


Last trickle was 15 Oct 2010 11:38:58 and time to completion is increasing.

Should I abort it? There seems to be no progress anyway...
ID: 40882
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 40883 - Posted: 19 Oct 2010, 16:30:02 UTC - in response to Message 40882.  

Last trickle was 15 Oct 2010 11:38:58 and time to completion is increasing.

Should I abort it? There seems to be no progress anyway...

Yes. You have diagnosed this correctly and it's best to abort this model and download a new one.
ID: 40883
Hans-Henrik Husen
Joined: 7 Sep 09
Posts: 2
Credit: 13,113,974
RAC: 0
Message 40918 - Posted: 26 Oct 2010, 15:14:46 UTC

I'm running hadsm3fub_jwxd_006453195_1 and have been for nearly a year. It started out as a 250-hour project, but since it reached about 93% completion it has continued at a much slower rate. The project should have stopped in year 2050, but has continued to its present year, 2062! The elapsed time is now 788 hours and increasing, but the remaining time is 55 hours - and that is also increasing! If this continues, it is a never-ending project! Does anybody have an explanation?
ID: 40918
Urglab
Joined: 27 Feb 08
Posts: 4
Credit: 960,510
RAC: 0
Message 40919 - Posted: 26 Oct 2010, 15:48:34 UTC
Last modified: 26 Oct 2010, 15:49:04 UTC

Hi, I just noticed one of my tasks turned snowball too. Progress is at 29.43%

Here are the trickles. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/trickle.php?resultid=10989370

The last trickle was from the 22nd, but I was away during the weekend so my PC wasn't running. Besides this project I'm running quite a few others, so BOINC gets to run the models off and on.

To summarize:

1. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10989370
2. Current timestep: 228904 of 259248 (phase 1 of 3)
3. The s/TS value: 1.43 (but really slow now)
4. Whether the temperature display of the globe graphic is blue: Yes
5. What your processor/CPU and Operating System: Intel i7 870, Win7 64bit
6. Whether you are overclocking: No
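
The progress figures in this report and in the 19 Oct report above are consistent with the 259248 timesteps being counted per phase, with overall progress averaged over the phases. That is an inference from the numbers posted in this thread, not documented behaviour, and the function below is only a sketch of it in Python.

```python
def overall_progress(phase: int, phases: int,
                     timestep: int, steps_per_phase: int) -> float:
    """Fraction of the whole task done, assuming equal-length phases."""
    completed_phases = phase - 1
    return (completed_phases + timestep / steps_per_phase) / phases


# Phase 1 of 3 at timestep 228904 (this report).
print(f"{overall_progress(1, 3, 228904, 259248):.2%}")  # prints "29.43%"
# Phase 3 of 3 at timestep 172208 (the 19 Oct report).
print(f"{overall_progress(3, 3, 172208, 259248):.2%}")  # prints "88.81%"
```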
ID: 40919