climateprediction.net (CPDN) home page
Thread 'Abnormally slow crunching of HADSM'

Brave Daun

Joined: 4 Nov 07
Posts: 3
Credit: 3,203,725
RAC: 9,189
Message 34375 - Posted: 23 Jul 2008, 23:45:14 UTC

I am currently running two Iceworlds (completely blue) on my stock Intel Duo 2 GHz HP laptop (no overclocking).

I've been running hadsm3fub_jm06_005946649 (task ID 7436324) for 1097 hours (60 s/TS). It is at timestep 65000 and reports 8% complete, but the time to completion is 1319 hours and steadily increasing.
Also: hadsm3fub_ink9_005948668_2 (task ID 7456509) for 529 hours (59 s/TS), at timestep 31820, reporting 4% complete with 851 hours remaining (and climbing).
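
As a rough sanity check on the figures above, the timestep count multiplied by the reported sec/TS should come out close to the elapsed hours. The following is a minimal sketch, not anything BOINC itself provides; the helper name `hours_from_timesteps` is hypothetical, and the numbers are simply those quoted in the post.

```python
def hours_from_timesteps(timesteps: int, sec_per_timestep: float) -> float:
    """Convert a timestep count at a given sec/TS rate into hours."""
    return timesteps * sec_per_timestep / 3600.0


# Figures as quoted in the post (timestep, sec/TS, elapsed hours).
tasks = {
    "hadsm3fub_jm06_005946649": (65000, 60, 1097),
    "hadsm3fub_ink9_005948668_2": (31820, 59, 529),
}

for name, (timesteps, sec_per_ts, reported_hours) in tasks.items():
    implied = hours_from_timesteps(timesteps, sec_per_ts)
    print(f"{name}: implied {implied:.1f} h, reported {reported_hours} h")
```

Both implied values land within about 2% of the reported elapsed hours, so the quoted sec/TS figures look like averages over the whole run so far.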

I was running these along with other projects, but now have been running them solidly for a couple of weeks and the time to completion still climbs. So, I came here to discover that I should probably just kill them?

Please advise.

- Brave Daun
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34387 - Posted: 24 Jul 2008, 20:13:44 UTC

Hi Brave Daun, welcome to the forum. Thanks for posting the links.

I've looked at both those slab models and the workunits they belong to. Although your model graphics are monochrome, which is a bad sign, a couple of things give me the impression that these aren't cases of typical iceworld slowdowns.

* In the case of 'iceworlds' the sec/timestep would be much slower even than what you're seeing.

* In the case of one model, no computer in the same workunit has finished the model, but at least one person has crunched perfectly normally past the point you're at and their graphs are normal. This suggests that the model isn't at fault.

* The other model has already been completed by another computer at normal speed, producing normal graphs. So I don't think that model's defective either.

I suspect there may be a hardware problem causing this super-slow and unstable crunching. I'm going to move your post to a thread of its own so you receive undivided attention. I'll leave your problem to be picked up by one of the crunchers who know about hardware and improving stability, which I don't.

In the meantime, in case the laptop's overheating, make sure you have its little feet extended and put something under it at both sides, e.g. the edge of a book at each side, to raise the entire laptop off the tabletop and increase the airflow.

In the BOINC Manager Tasks tab, suspend one of the models so only one core's running. This will help cool the machine.
Brave Daun

Joined: 4 Nov 07
Posts: 3
Credit: 3,203,725
RAC: 9,189
Message 34390 - Posted: 25 Jul 2008, 1:24:05 UTC - in response to Message 34387.  

Thanks. Who knew I was so interesting?
The computer doesn't seem to have any temperature problems: it is in a very clean and cool environment (so the vents are clear).
I have not had problems with other projects. I guess the amount of disk space used is a little unusual: it is only 333.43 MB for both tasks in total. Usually CPDN uses over a gig on my other machines (I currently have 2.4 GB free, but I have had much less).

- Any ideas of what I should do with these?
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 34392 - Posted: 25 Jul 2008, 3:34:40 UTC
Last modified: 25 Jul 2008, 3:35:45 UTC

mo said:
* In the case of 'iceworlds' the sec/timestep would be much slower even than what you're seeing.

It will vary based on how far into the model the run was when it went iceworld. Early on, sec/TS will climb quickly. Late in the third phase, sec/TS will climb more slowly.

I think they are definitely both iceworlds and you should abort them. You are doing the right thing with the laptop in keeping the inside clean and the vents clear. However, since others running the same models have gotten farther, as Mo said, it suggests some kind of hardware issue. If you get any more errors, or have stability problems with other CPDN models, you may have to run just one at a time.
Brave Daun

Joined: 4 Nov 07
Posts: 3
Credit: 3,203,725
RAC: 9,189
Message 34410 - Posted: 28 Jul 2008, 2:47:51 UTC

Done. They are toast.

I don't run this project often on this machine, but I like to run it when I am going to be off-line for a long time. I will just download one next time.

Thanks,
- Brave Daun