HadSM3 progressing at snail's pace
Author | Message |
---|---|
Joined: 23 Dec 05 Posts: 2 Credit: 254,498 RAC: 0 |
Hi there, I am running "UK Met Office HadSM3 Slab Model 6.07" on one of the cores of my Lenovo C2Q/8200 machine. OS: WinXP/SP3. BOINC is running as a service, so I don't see any graphics. As per your advice in the forum, I ran a complete intensive diagnostic of the PC using the Lenovo ThinkVantage Toolbox and it came out OK. It is currently running the WU hadsm3dhet2_jutz_006602953_5. 50 hours ago it was at: Elapsed time: 348:48:18, Progress: 89.149%, To complete: 42:23:58. Now it is at: Elapsed time: 397:59:18, Progress: 89.544%, To complete: 46:25:03. In 49 hours of computation it advanced 0.395%; if my calculations are right, it will take around 1300 hours more to complete. Is this normal behaviour for this application? As it has already computed for ~400 hours, I would hate to abort it. I would appreciate any advice. Regards, Yair |
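For anyone wanting to check an estimate like Yair's, here is a minimal sketch of the arithmetic, assuming the progress rate between the two readings stays constant. The function names are hypothetical and not part of BOINC; the figures are the ones quoted in the post.

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'HHH:MM:SS' elapsed-time string to fractional hours."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h + m / 60 + s / 3600


def remaining_hours(t0: str, p0: float, t1: str, p1: float) -> float:
    """Extrapolate hours left from two (elapsed time, progress %) readings.

    Assumes the progress rate observed between the two readings
    continues unchanged until 100%.
    """
    rate = (p1 - p0) / (hms_to_hours(t1) - hms_to_hours(t0))  # percent per hour
    return (100.0 - p1) / rate


# Readings quoted in the post above
eta = remaining_hours("348:48:18", 89.149, "397:59:18", 89.544)
print(f"about {eta:.0f} hours remaining at the current rate")
```

With these two readings the extrapolation comes out at roughly 1,300 hours, which agrees with the estimate in the post. The "To complete" figure shown by BOINC is much lower, presumably because it is not based only on the most recent rate.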
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The sec/TS (seconds per timestep) has jumped suddenly, so it's probably gone "ice world". I don't think that you have much choice: 1) Continue for, possibly, the rest of the year, with the chance of faulty results. 2) Abort it. As per the News thread, that model type has been retired, so only FAMOUS (the Millennium model) is available at the moment, with three varieties of a new type in beta testing. Backups: Here |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,860,147 RAC: 4,891 |
Yair, Welcome to the CPDN message board. From the record of that computer it's apparent that you've successfully run all the slab (HADSM3) and mid-Holocene (HADSM3MH) models you've downloaded to completion - so the problem isn't likely to be the computer. The most likely explanation is that the model has become a slow-processing 'iceworld'; depending on model batch, up to 15% become iceworlds. The model will eventually complete, but the rate of trickle submission might decline from, say, one every few hours to one every week. Unless the model is very close to the end, which yours is not, aborting it is the only sensible option. The computer can then get on with some more useful work - there's a FAMOUS model already downloaded on that machine ready to go. Iain PS If you use the message boards' 'advanced search' facility to look for the word 'iceworld' over the last year, you'll find some other relevant threads. [Edit: Oops - Les got there first.] |
Joined: 23 Dec 05 Posts: 2 Credit: 254,498 RAC: 0 |
Thanks Iain, I will probably abort it in a while. The WU famous_r141_1599_200_006666156_1, listed in my account as being in progress on my computer, is not here, so you might want to make it available for download again. In my project preferences I allowed all experiments; I just checked and they are all still there and available, including the HadSM3 that Les said is retired. Would you like me to select only a few of the experiments? Regards, Yair |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Dear Ubdaddy: Check the "server status" page. You will see that next to all model types except the "famous" model the number ready to send is 0. Just think, now that the SMs are retired we will no longer have these interesting discussions about "ice worlds". |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
As Les said, the model's speed, or sec/timestep, has jumped dramatically. Here's the model. The situation is far worse than the impression given by the last value which is a cumulative average. Another computer in the same workunit has completed the model without problems, but it has Linux whereas you have Windows. If an iceworld develops it's almost always on every model in a workunit with the same operating system. There's another computer in the WU that, like yours Ubdaddy, also has Intel + Windows. This computer's model is less advanced than yours but will probably become an iceworld at exactly the same processing moment. We'll ask our programmer Milo to send the owner an email to warn him. Don't let your computer waste more time on this model. If you let it battle on for weeks and weeks, data will probably be missing from its graphs from the moment when the iceworld developed. Abort it. Now. Thank you for reporting the problem. Cpdn news |
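To illustrate why a cumulative average understates a sudden slowdown, here is a small sketch. The sec/timestep samples are hypothetical, made up purely for illustration and not taken from this model's record, and the helper names are likewise assumptions.

```python
def cumulative_average(values):
    """Return the running mean after each sample."""
    averages, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


def recent_average(values, window):
    """Return the mean of only the last `window` samples."""
    tail = values[-window:]
    return sum(tail) / len(tail)


# Hypothetical sec/timestep samples: 90 normal readings, then 10 slow ones
samples = [2.0] * 90 + [40.0] * 10

overall = cumulative_average(samples)[-1]  # diluted by the long normal run
recent = recent_average(samples, 10)       # reflects the current speed
print(f"cumulative: {overall:.1f} s/TS, recent: {recent:.1f} s/TS")
```

In this made-up series the cumulative figure is still dominated by the long run of normal readings, while the recent window shows how slow the model has actually become. That is the sense in which the last value on the model page can look milder than the real situation.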
Joined: 23 Jun 10 Posts: 2 Credit: 13,893 RAC: 0 |
Hello. I found this post by searching for the first several characters of the work unit I'm currently running, and wondered whether there's a correlation with my problem. Here's the info for the unit I have a question about: 7/8/2010 7:56:33 AM climateprediction.net Restarting task hadsm3dhet2_js14_006599322_2 using hadsm3 version 607. After it has been running for a while, I return to find the system showing a black screen with the taskbar at the bottom and several instances of this unit showing, with the Windows comment "not responding". I've also noticed lately that when I first see the screensaver graphic, the globe is peculiarly without any atmosphere, yet has what seems like low-lying fog. Is that what you mean by the calculations turning into an iceworld? Is that what's causing the hang on the screensaver graphics? I didn't want to just close them as is, and each time have resorted to a full system restart; I'm hoping that that's the safest way to handle the number crunching without data loss? ...a graceful shutdown/restart? Thanks. And please bear with me, I've just begun participating, and this is my first post. -DP |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Zerepelad, welcome to the forum. Don't worry about being a newbie; if more people posted when they're not sure what's going on, more problems would get sorted out. Here's the web page for hadsm3dhet2_js14_006599322_2. I see no sign that it's become an iceworld. There's a description by Geophi of what 'iceworlds' are like here. Your computer has AMD and Windows; with this combination, if the model did turn into an iceworld you would expect to see the processing suddenly speed up, not slow down. The temperature view of the model's globe would show a complete blue circle. The other views, e.g. pressure, would also show just one colour, not the usual moving picture of the weather as it's produced. So if you look at the globe graphics from time to time you'll see whether your model is still progressing normally. It's getting near the end of Phase 1 of the 3 phases. At the end of each phase it will produce a file to upload. While it's post-processing this file, and for 10 or 15 minutes afterwards, try not to disturb the model by suspending it or exiting from Boinc. These HadSM models don't like their file-processing to be interrupted. You said: "When I'm running for a while, I would return to the system showing a black screen with the taskbar showing on the bottom, and several instances of this unit showing, with the windows comment 'not responding'." I'm not sure what you mean by several instances showing. It would help if you could describe what happens in more detail please. Is your Boinc manager open when this happens? You also said: "I'd also noticed that lately when I would first see the screensaver graphic, the globe was peculiarly without any atmosphere, yet having what seemed like low-lying fog." I think you may be seeing the Clouds view, which can look rather foggy. If you're often getting this Windows 'not responding' message it probably means that web pages are freezing. This happens to almost everybody from time to time, but if it's a frequent problem it may be that your computer isn't very happy running the screensaver, which you probably have set up to kick in when the computer's been left for a while. The screensaver graphics are rather resource-intensive because they're dynamic, constantly changing; for this reason they slow down the processing of the model. A computer of mine often froze until I disabled the screensaver. To disable the screensaver: 1) Right-click on a blue area of the desktop. 2) Select Properties in the menu. 3) In the Display Properties pane choose the Screensaver tab. 4) In the Screensaver drop-down menu select None or a static picture. 5) Click the OK button. You can still view your globe whenever you want by clicking on the View Graphics button in your Boinc manager. Because the globe window viewed this way isn't full-screen like the screensaver, computers don't seem to mind it. It's still not a good idea to leave the globe window open all the time though. See whether that suggestion helps. Cpdn news |
Joined: 23 Jun 10 Posts: 2 Credit: 13,893 RAC: 0 |
Thank you! |