climateprediction.net (CPDN) home page
Thread '4.12 stuck at 0%'

Questions and Answers : Windows : 4.12 stuck at 0%

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14135 - Posted: 5 Jul 2005, 21:16:19 UTC
Last modified: 5 Jul 2005, 21:50:03 UTC

I added one PC to this project yesterday; it downloaded 4.12 and a model.

After running unattended for more than a day, with a fresh benchmark and a restart of CPDN along the way, CPDN was still at 0.00% and the Einstein WUs had not progressed either.

I attached a monitor and saw that the CPDN program had not accumulated any CPU time. It sat there doing nothing: BOINC showed "running", but the percentage did not move at all.


The result is <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=976576">976576</a>. This "AMD unknown CPU" is a mobile Athlon XP (the machine is not a laptop, though). BOINC was running iconized.


It could easily be fixed with a BOINC restart - but I thought I'd report it for your collection ;-)
_______________

edit: BOINC itself was running; I could see that over the LAN because client_state.xml always had the current date/time.
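
For anyone who wants to repeat that liveness check without opening the file by hand, here is a minimal Python sketch. It only assumes what was observed above, namely that a running BOINC client keeps rewriting client_state.xml. The function names and the 300-second threshold are hypothetical choices for illustration, not part of BOINC.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago the file at `path` was last written."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_alive(path, max_age_seconds=300.0, now=None):
    """Treat the client as alive if its state file was written recently.

    The 300-second default is an assumed threshold, not a BOINC setting.
    """
    return seconds_since_modified(path, now) <= max_age_seconds
```

Usage would be something like `looks_alive(r"\\otherpc\boinc\client_state.xml")`, where the share path is a made-up example. Note that this only shows that the BOINC client is alive; as this thread shows, the science application can still be stuck.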

Here's what the last lines of stdout say :

2005-07-04 20:35:20 [Einstein@Home] Pausing result l1_0267.5__0267.7_0.1_T04_S4lA_0 (left in memory)
2005-07-04 20:35:20 [climateprediction.net] Starting result 172y_100076415_0 using hadsm3 version 4.12
2005-07-05 09:55:15 [---] Running CPU benchmarks
2005-07-05 09:55:15 [---] Suspending computation and network activity - running CPU benchmarks
2005-07-05 09:56:17 [---] Benchmark results:
2005-07-05 09:56:17 [---] Number of CPUs: 1
2005-07-05 09:56:17 [---] 2002 double precision MIPS (Whetstone) per CPU
2005-07-05 09:56:17 [---] 4742 integer MIPS (Dhrystone) per CPU
2005-07-05 09:56:17 [---] Finished CPU benchmarks
2005-07-05 09:56:18 [---] Resuming computation and network activity

I noticed the problem at about 23:00, so there had been no stdout entry for about 13 hours and no CPDN CPU activity for more than 24 hours.

stderr was empty
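
The two durations can be checked from the log timestamps with a short Python sketch. It assumes that "about 23:00" means 23:00:00 on 5 Jul 2005 on the same clock the log uses; the helper names are made up for this example.

```python
from datetime import datetime

# Two lines copied from the stdout excerpt in this post.
CPDN_START_LINE = (
    "2005-07-04 20:35:20 [climateprediction.net] "
    "Starting result 172y_100076415_0 using hadsm3 version 4.12"
)
LAST_ENTRY_LINE = (
    "2005-07-05 09:56:18 [---] Resuming computation and network activity"
)


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' of a BOINC stdout line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def hours_between(start, end):
    """Return the elapsed time from `start` to `end` in hours."""
    return (end - start).total_seconds() / 3600.0


# Assumption: "about 23:00" is taken as exactly 23:00:00 on 5 Jul 2005.
noticed = datetime(2005, 7, 5, 23, 0, 0)

silent_hours = hours_between(parse_timestamp(LAST_ENTRY_LINE), noticed)
idle_hours = hours_between(parse_timestamp(CPDN_START_LINE), noticed)

print(f"no stdout entry for {silent_hours:.1f} h")
print(f"CPDN started {idle_hours:.1f} h before the problem was noticed")
```

Under that assumption the gaps come out at roughly 13.1 and 26.4 hours, which matches "about 13 hours" and "more than 24 hours".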
ID: 14135
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14158 - Posted: 6 Jul 2005, 18:09:40 UTC
Last modified: 6 Jul 2005, 18:48:05 UTC

Strange, it keeps doing that. Every time it switches from Einstein to CPDN, it stops working. After a restart it runs CPDN without a problem.

Only this one PC is affected; the other PCs with the same settings and the same mix of Einstein and CPDN work.


edit: I think I found something: a system message in the message log of Win2k that looks like a memory violation, protection fault or something similar. The next time it happens, I will try to find out more and report.


edit2: I found the log; the program named in it was hadsm3um_4.12_w.exe

Here's the <a href="http://oct31.de/tmp/.drwatson.txt">logfile</a>


edit3 : I reduced speed a little (this machine is OC'ed), changed some system settings and removed a few services that aren't vital ... let's see what happens :-)
ID: 14158
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14253 - Posted: 9 Jul 2005, 19:30:31 UTC

Did reducing the speed fix your problem, Ananas? I notice from your Dr Watson log that the crash happened on 06.07.2005 @ 12:21:12. How does that relate to the messages in stdout?
ID: 14253
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14256 - Posted: 9 Jul 2005, 21:03:28 UTC - in response to Message 14253.  
Last modified: 9 Jul 2005, 21:05:19 UTC

> Did reducing the speed fix your problem Ananas? I notice from your Dr Watson
> log that the crash happened on 06.07.2005 @ 12:21:12. How does that relate to
> the messages in stdout?
> ...


It was not necessarily the same crash; it did the same thing twice after switching from Einstein to CPDN. The DrWatson log contained both crashes, and they were basically identical.

When CPDN started, it ran for a while, then it switched to Einstein. The crash happened when it went back from Einstein to CPDN. Maybe I could have switched off "leave in memory", but CPDN + Einstein + BOINC fit into 256MB easily without swapping.

Yes, slightly reducing the FSB speed (and core voltage) did fix the problem; there have been no crashes since. So CPDN really is a bit more sensitive on overclocked computers.

I wonder if <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2855">this thread</a> is related, it sounds very similar.
ID: 14256
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14268 - Posted: 11 Jul 2005, 13:50:38 UTC - in response to Message 14256.  

> I wonder if <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2855">this thread</a> is related, it sounds very similar.

That was my initial thought, but then I remembered you're running BOINC 4.19. The problem of CPDN not running after an aborted benchmark seems to be specific to 4.44 and 4.45.
ID: 14268
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14269 - Posted: 11 Jul 2005, 15:33:38 UTC - in response to Message 14268.  
Last modified: 11 Jul 2005, 15:35:21 UTC

> > I wonder if <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2855">this thread</a> is related, it sounds very similar.
>
> That was my initial thought, but then I remembered you're running BOINC 4.19.
> The problem of CPDN not running after an aborted benchmark seems to be
> specific to 4.44 and 4.45.


The benchmark was not involved in the problem I reported; it was just coincidence that the benchmark ran at that time. The crash had already happened several hours before the benchmark started, so for me it was really the overclocking, probably together with the "local warming" in my computer room.

I didn't have Einstein and ALife validation problems with that computer, so I assumed it would be stable. But as the computer is now crunching without problems, it always seems to be a good idea to stay a bit below the limits with CPDN hosts.
ID: 14269
