Message boards : Number crunching : Iceworld Appeal
Author | Message |
---|---|
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
I could buy another screen and put the four AMD models on it so when I am working I would see a bluey when it happens. Belt and Braces, I know. The graphics slow down the processing a lot (~50%) whereas, oddly, the recording does not. So any method that avoids actually displaying the graphics will keep your processing rate up. Once started, the recording continues until stopped, whether the graphics are showing or not. Great, so if they overwrite I don't need to delete, after each phase, as I should not exceed 8 * ~30GB (240GB) of data. Is that a correct understanding? Yes, that's how it works. This also brings up another question. Do the .tmp files stay after the model ends? BOINC seems to tidy up when the model finishes normally or is aborted. (I don't know what happens after a random crash, since I don't have them - some model types certainly used to leave debris when they crashed, but the up-to-date versions may be tidier.) A set of Web pages has been set up to track your AMD models (the Intel models will slow down so much you're bound to notice them). It's here - there are 'previous' and 'next' links at the bottom of the page. The pages are on a scheduled task list to be updated at 18:15 UTC each day. On an AMD, you're looking for a dip in the relative seconds/timestep as the model speeds up. If any iceworlds appear we can sort out communication by private message on this board. |
Send message Joined: 12 Aug 09 Posts: 20 Credit: 3,063,648 RAC: 0 |
The four AMD models are as follows: hadsm3dhet2_ul4a_006479812_6 0.96 s/TS @ 65.16% complete. hadsm3fub_kh70_006479462_0 0.99 s/TS @ 59.37% complete. hadsm3fub_kgzi_006479192_2 0.96 s/TS @ 55.53% complete. hadsm3fub_kgxv_006479133_9 0.97 s/TS @ 52.66% complete. I also have a 4850 x 2 using .05% CPU / core to crunch GPU Milkyway WUs. I have changed my setting to 1% of GPU for graphics, so it should not slow things too much. Any idea how much the dip would be? |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
The four AMD models are as follows: OK, they're the ones now at http://www.bridge-9.org.uk/temp/dg/6697305.html etc. The list will be updated as models finish. I'm rather busy for a couple of weeks, but eventually the script will be changed to create the pages from a host id, then no human intervention will be required. :-) Any idea how much the dip would be? There's an AMD example way back in this thread, here - second graph. |
Send message Joined: 12 Aug 09 Posts: 20 Credit: 3,063,648 RAC: 0 |
OK, they're the ones now at http://www.bridge-9.org.uk/temp/dg/6697305.html etc. The list will be updated as models finish. I'm rather busy for a couple of weeks, but eventually the script will be changed to create the pages from a host id, then no human intervention will be required. :-) OK then. That's easy, I'll just check the WUs once a day and look for dips. Looking at your graph, the bluey seems to only affect two trickles, so if I find any, I will copy the relevant TSs to an Iceworld directory, post here, and await further instructions. PS: I could not resist checking the 'Disk Tab' bug with this version of BOINC and things kept working fine (12.66GB on that i7 so far). I also checked a .tmp directory after a phase change and that model is working fine as well. |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
David, The Intel models have been added to the set of graphs now. Forewarned is normally forearmed - however, with two i7 machines the chance of anyone being ahead of you in those work units is slim. Still, you'll know to check the machine if the seconds/timestep heads skywards. Iain |
Send message Joined: 12 Aug 09 Posts: 20 Credit: 3,063,648 RAC: 0 |
David, Thanks for that Iain, I was considering asking for this, but decided you would think I was being lazy. hadsm3fub_kcb5_006473131 shows a spike, but that was due to a reset to the beginning of phase two after a power cut. My bad, too many computers off one surge protector. lol. We had another power cut this morning and another model, hadsm3fub_kgvz_006479065_9, reset to start, but that spike has not shown up yet. I also noticed all the recordings stopped and had to be restarted for all WUs. I am, actually, following for three of the units. David |
Send message Joined: 15 Mar 06 Posts: 41 Credit: 3,581,078 RAC: 0 |
After failing to deliver one (2 weeks ago) which carried on to completion after restore, now have another iceworld. This one is the genuine article (I think!) at http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10317957 The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then immediately 7k for a number, 3k for another series, then 2k before I killed it. Is that dwindling file size normal in the aftermath of an iceball? Enjoy. ;) |
Send message Joined: 12 Aug 09 Posts: 20 Credit: 3,063,648 RAC: 0 |
Slab: hadsm3fub_kf5n_006476821 has developed into an Iceworld at 86.170%. WU 6694993 Recording is on. CPDN files show 4:11PM 116KB 4:11PM 85KB 4:12PM 10KB 4:13PM 10KB 4:14PM 11KB 4:15PM 10KB 4:17PM 4KB Iceworld detector shows: Phase 3, Step 118,822, Trickle 59, First TS 1.31, Last TS 1.30, Ratio 1.0. Model is still running, please advise. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
iansm wrote: The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then Yes. The information in the file relates to colors. It goes from a lot of colors (variation of temperatures) for a normal model, to one color for an iceworld. |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
iansm wrote: ... and the in-built file compression exploits the redundant repetition in areas of the same value (temperature, pressure, precipitation and cloud cover) - so the files get smaller. The reason for the progressive reduction in file size is that the model initially fails at a single grid point and that failure spreads to the whole grid in two timesteps (in your case 0:95-100k, 1:69k, 2:7k). The drop to the final value, quite a number of steps later, results from sea ice becoming uniform over the whole ocean: more repetition, more redundancy, more compression. After that, nothing happens. |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
After failing to deliver one (2 weeks ago) which carried on to completion after restore, now have another iceworld. Thanks for that, Ian. Once the e-mail had been coaxed past various spam filters, it was then processed into point #19 on the West coast iceworld collection - it seems a popular spot. The model froze at 184,334 in the third phase, which follows the pattern of all other crashes, whatever phase, whatever platform - i.e. the freeze occurs in the second timestep of a block of six. The significance of that? I haven't a clue. |
Send message Joined: 4 Oct 09 Posts: 73 Credit: 7,242,427 RAC: 0 |
Thanks for explanation.
Interesting. But you'll crack it sometime! |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
David Glogau's model has come across nicely and establishes a new freeze point, north-east of the Canary Islands. This shows that there is still considerable value in submitting Windows/Intel iceworlds, even though most of them do seem to pile up in the same place. (A Mac or Linux/AMD iceworld would nonetheless be of great interest because it would be the first to be looked at in this way, and would show whether fast-processing anomalies on those platforms have the same cause as iceworlds on Windows/Intel/AMD.) |
Send message Joined: 24 Sep 05 Posts: 7 Credit: 3,467,957 RAC: 2,713 |
Looks like I also got one. hadsm3fub_keom_006431824_1 using hadsm3 version 607. The temperature graphic shows a blue world and it looks like it needs much more than the 1.8 seconds for one timestep. Temperature is -36 or -42. Precip is 0. Pressure is 950. resultid=9940773 By now I'm at Timestep 102366 of 259248 - Phase 3 of 3 Matthias |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
By now I'm at Timestep 102366 of 259248 - Phase 3 of 3 Matthias, Welcome to the CPDN message board. From what you say, it does seem to be an iceworld. The rate of progress has slowed dramatically and, since the model is in the final phase, it will not recover. If this is your first iceworld and you have a backup, then you could restore the backup to see whether the model freezes again at the same place: they usually do. Otherwise, my advice is to abort the model and download another that will then progress at normal speed. Iain |
Send message Joined: 24 Sep 05 Posts: 7 Credit: 3,467,957 RAC: 2,713 |
By now I'm at Timestep 102366 of 259248 - Phase 3 of 3 Iain, There is no backup of an older state, so I'll abort this one and try a new one. Thanks for the fast answer and the welcome. I made a copy of the model files, so if you need some feel free to contact me. Matthias |
Send message Joined: 31 Aug 04 Posts: 18 Credit: 13,882,347 RAC: 0 |
My graphics are showing totally blue on Hadsm3mh-kw93 006490731-3. I'm using an Intel Q6600. Timestep is 254245 of 259248 on Phase 1, S/Ts of 2.41 |
Send message Joined: 5 Aug 04 Posts: 11 Credit: 2,356,953 RAC: 0 |
Iain, I've likely got another iceworld for you - hadsm3fub_kbz7_006472701 went blue somewhen before 35.5% complete so I've wound it back a ways (currently at just beyond 34%) and I'm re-running with recording switched on. It will probably be a day or so until it hits the blue wall again (I didn't catch the exact point first time around) but a note of the email address to which to send the '.cpdn' file would facilitate a speedy upload of the appropriate file. Cheers Dave |
Send message Joined: 5 Aug 04 Posts: 11 Credit: 2,356,953 RAC: 0 |
Iain, Update: Hah, caught it at timestep 11577 - I even had the graphics turned on just at the point it tripped over so was able to watch it go completely blue over a couple of timesteps. Just to confirm it, I re-ran the last few timesteps (was able to switch off the model before it did a checkpoint) and it froze at the same t/s three times straight. Seemed to spread from the US west coast as per others mentioned above. Sorry, this is all a bit sad but it's the first time in years I've caught one blue-handed (so to speak)! Ready if/when you are Dave |
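
Editorial note: the shrinking recording files discussed in this thread can be illustrated with a rough sketch. It assumes zlib as a stand-in for the recorder's built-in compression (the real .cpdn format is not described here), uses hypothetical 1,000-cell frames, and the helper names `compressed_size` and `first_sharp_drop` and the 50% threshold are invented for illustration. The first part shows why a uniform field compresses far better than a varied one; the second flags the first file that falls below half the size of the first file in the series, using the sizes posted above.

```python
import random
import zlib


def compressed_size(cells):
    """Return the zlib-compressed size, in bytes, of a sequence of 0-255 values."""
    return len(zlib.compress(bytes(cells)))


def first_sharp_drop(sizes_kb, fraction=0.5):
    """Return the index of the first file smaller than `fraction` of the
    first file's size, or None if no file shrinks that much."""
    if not sizes_kb:
        return None
    threshold = sizes_kb[0] * fraction
    for index, size in enumerate(sizes_kb):
        if size < threshold:
            return index
    return None


# Hypothetical 1,000-cell frames: one with varied values, one all the same
# (assumption: zlib stands in for the recorder's own compression).
rng = random.Random(0)  # fixed seed so the run is repeatable
varied_frame = [rng.randrange(256) for _ in range(1000)]
uniform_frame = [7] * 1000

print("varied frame :", compressed_size(varied_frame), "bytes")
print("uniform frame:", compressed_size(uniform_frame), "bytes")

# File sizes in KB as reported in the thread.
print(first_sharp_drop([116, 85, 10, 10, 11, 10, 4]))  # -> 2
print(first_sharp_drop([96, 69, 7]))                   # -> 2
```

In both reported series the flagged file is the third one, which matches the description above of the failure spreading over the whole grid in two timesteps. The threshold is a guess; a real check would need tuning against normal file-size variation (95 - 100k in the example above).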
©2024 cpdn.org