Questions and Answers : Windows : WU's stick for ages on the % done
Author | Message |
---|---|
Joined: 29 Aug 04 Posts: 11 Credit: 1,281,270 RAC: 0 |
I'm running a couple of WUs on each of three different machines, and all the WUs seem to be stuck at the same % completion. Looking at the trickles sent, I see that CPDN has received no new trickles since November. How come? Simmel |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Simmel. So far I've found one: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6818233 Look at the speed, i.e. the sec/timestep of the most recent trickle. That's a cumulative average, so the model's real speed slowed to almost no progress after the second-last timestep. The model has become a slow-processing 'iceball'. There are threads about this problem with HADSM slab models in the forum's Number Crunching section. If you leave the model it will eventually complete, but it isn't worth it because its results will almost certainly be abnormal. Look at the model's graphics; I expect they are also abnormal, with all-blue 'temperatures'. Don't waste any more computer time on it. Abort it. The other two models are probably also iceballs, but I'll look for them to check. You can easily check yourself by looking at their graphics. If you have 3 iceballs, that's very bad luck. It isn't your fault or your computers' fault. Edit: You need to check the graphics of this HADSM slab. Maybe it's progressing so slowly that it can't make its next trickle yet: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6935131 New edit: I think you have 2 HADSM iceballs running together on the same computer: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7106657 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7085115 (this one became an iceball before its second trickle, possibly before its first trickle) If I haven't found every model you're worried about, could you please post a link? I hope there are no more of these. That's really bad luck. You've crunched a lot for CPDN; thanks for your contribution. Cpdn news |
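The point about the cumulative average can be illustrated with a small calculation. This is only a sketch, not CPDN code: it assumes each trickle reports the total number of timesteps completed and the cumulative average sec/timestep since the model started. The function name and the numbers are hypothetical, chosen only to show how a modest-looking cumulative figure can hide a much slower recent speed.

```python
def interval_sec_per_timestep(steps_prev, avg_prev, steps_now, avg_now):
    """Average seconds per timestep between two trickles.

    Assumes each trickle reports the total timesteps completed so far and
    the cumulative average sec/timestep since the model started.
    """
    if steps_now <= steps_prev:
        raise ValueError("the later trickle must report more timesteps")
    total_prev = steps_prev * avg_prev  # total seconds up to the earlier trickle
    total_now = steps_now * avg_now     # total seconds up to the later trickle
    return (total_now - total_prev) / (steps_now - steps_prev)


# Hypothetical values: the cumulative average rises from 2.0 to 6.0 sec/timestep,
# but the speed over the most recent interval alone is considerably worse.
recent = interval_sec_per_timestep(10_000, 2.0, 20_000, 6.0)
print(recent)
```

With these made-up figures the most recent interval ran noticeably slower than the cumulative 6.0 sec/timestep suggests, which is why a jump in the reported average points to a much larger real slowdown.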
Joined: 29 Aug 04 Posts: 11 Credit: 1,281,270 RAC: 0 |
Hi mo.v, Well, I guess I'm just unlucky ... I checked my three crunchers: Host 1 (mtf-ams-srv-001) is running results 6818233 and 7040879. Both iceballs ... Host 2 (mtf-ams-lt-104) is running 7085115 and 7106657. Both iceballs ... Host 3 (wks02) is running 6935131 and 6962734. Only the last one is not an iceball ... So 5 out of 6 are iceballs ... Man! Simmel |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Five out of six is the worst I've ever come across! That said, the more HadSM3s you run, the more likely you are to encounter one which iceballs and then ties up the PC. The other types of model aren't affected by iceballs. I'm a volunteer and my views are my own. News and Announcements and FAQ |
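The remark that running more HadSM3s makes an iceball more likely can be put into a rough formula. This is a sketch under two loud assumptions that the thread does not confirm: each model iceballs independently of the others, and with the same probability `p`. The value of `p` used here is purely hypothetical; the real iceball rate is not given anywhere in this thread.

```python
def prob_at_least_one_iceball(p, n):
    """Chance that at least one of n models iceballs.

    Assumes each model iceballs independently with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-model rate, for illustration only.
p = 0.05
for n in (1, 6, 20):
    print(n, prob_at_least_one_iceball(p, n))
```

Under these assumptions the chance of meeting at least one iceball grows steadily with the number of models run, whatever the true per-model rate is.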
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Overclocked? |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Now that we have a definite list of these 5 iceballs, I'm going to check those 5 complete shared workunits to see if I find any more. If I do, I'll send private messages to the crunchers. Edit later: In those 5 workunits I've found one more iceball and will send a message to its owner. It belongs to the same workunit as Simmel's 6935131 on host 3. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6935132 Cpdn news |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
On host 844370, both results have been taken further on the same platform (Intel+Windows). There may be something wrong with that machine. Result 7040879 in work unit 6112139 has also been taken further by an Intel+Vista host. Again that suggests a PC problem as Windows variety is not relevant (?). The models themselves are identical and differences should appear only between platforms. |
Joined: 29 Aug 04 Posts: 11 Credit: 1,281,270 RAC: 0 |
Hi guys, None of the hosts I use are overclocked. I do not use any of those hardware acceleration utilities. Just the iron out of the box and then the installation of BOINC. I disconnected all the stalled units and got myself 5 new ones. Those are now running and proceeding as expected. Thanks for all the info. I will be more alert next time. Greetz, Simmel. btw: I was hovering around rank 4200 in November. Need to fight the whole way back now :-( Simmel |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
There is a 'short cut' to getting back to your old RAC quickly: just run offline for about 15 or 16 days, and then let all your PCs talk to the network. Once the credit job runs (overnight), you'll jump straight back up to your original RAC. I'm a volunteer and my views are my own. News and Announcements and FAQ |
©2025 cpdn.org