climateprediction.net (CPDN) home page
Thread 'Long "Time to completion" after crash'

Message boards : Number crunching : Long "Time to completion" after crash

old_user2694

Joined: 29 Aug 04
Posts: 2
Credit: 455,787
RAC: 0
Message 9822 - Posted: 23 Feb 2005, 7:54:49 UTC
Last modified: 23 Feb 2005, 8:05:24 UTC

This has happened quite a few times on the same machine.

If something crashes the machine while BOINC is running (I recently had problems with a firewall and XP SP2), then when the machine reboots and loads BOINC, one of the models says it is going to take over 9000 days to finish. This figure goes down over a couple of days (after a day it now shows 900+ days). That model does not trickle, whereas the other one on the HT processor does.

Is it best to end this model rather than wait for it to finish, as it currently sits outside the deadline date? :(

It has had the same problem no matter which version of BOINC I use, currently 4.19.

System: Northwood 3.4C @ 3.7 GHz, temp 41-47 °C, HT enabled, 1 GB RAM, watercooled.
ID: 9822
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 9824 - Posted: 23 Feb 2005, 8:22:58 UTC
Last modified: 23 Feb 2005, 8:24:52 UTC

Hi DarlyB, welcome to the forum.

Whether to continue may well be decided by the stability of the computer.
Have you run the usual tests?
<a href="http://www.mersenne.org/freesoft.htm">Prime95</a>
<a href="http://www.memtest86.com/">Memtest</a>
<a href="http://www.geocities.com/hjsmithh/Pi/Super_Pi.html">SuperPi</a>

Overclocking leads to problems with CPDN, which is pushing the CPU to the limit as it is.
The long finish time might be caused by BOINC getting confused. It's a bit buggy in several areas.

> does not trickle
Do you mean that it doesn't create trickle files in the project directory, or that they don't show up on your Accounts page? The latter may just be the server waiting until you catch up to where you were before the crash.

Impressive machine, though.
I've toyed with the idea of pushing my 3.2 Northwood a bit, but it's rock solid as it is, power failures and all.
So I've decided to remain a 'tortoise', and not try to be a 'hare'.

Les
ID: 9824
crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 9825 - Posted: 23 Feb 2005, 8:27:50 UTC
Last modified: 23 Feb 2005, 8:29:57 UTC

Sounds like the model rewound to the beginning. The 9000+ days is a cumulative calculation, i.e. it divides all the time spent so far by the small amount of progress now recorded: you have taken a fair while to get to somewhere near the beginning. You are probably crunching at the same rate as before, and the 900 days will continue to fall.
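
To illustrate the cumulative calculation, here is a minimal sketch in Python. It assumes the estimate is simply "time spent so far divided by fraction done", which is a guess at the behaviour described in this thread and not BOINC's actual code. The function names and all the numbers (20 days already crunched, a rewind to 0.2% done, a 45-day full run) are hypothetical.

```python
def remaining_days(elapsed_days: float, fraction_done: float) -> float:
    """Naive cumulative estimate: assume the average rate so far continues."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    projected_total = elapsed_days / fraction_done
    return projected_total - elapsed_days


# Hypothetical figures, chosen only for illustration.
ELAPSED_BEFORE_CRASH = 20.0    # days of crunching still counted after the rewind
FRACTION_AFTER_REWIND = 0.002  # model is back near the start (0.2% done)
FULL_RUN_DAYS = 45.0           # days a clean run would take at the normal rate


def estimate_after(extra_days: float) -> float:
    """Estimate shown after crunching extra_days more at the normal rate."""
    elapsed = ELAPSED_BEFORE_CRASH + extra_days
    fraction = min(1.0, FRACTION_AFTER_REWIND + extra_days / FULL_RUN_DAYS)
    return remaining_days(elapsed, fraction)


if __name__ == "__main__":
    for day in (0, 1, 2, 5):
        print(f"day {day}: about {estimate_after(day):.0f} days remaining")
```

With these made-up inputs the estimate starts in the thousands of days and drops below 1000 after a single further day of normal crunching, even though the crunching rate never changed. Only the ratio of old elapsed time to new progress is improving.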

When you say the work does not trickle, I would expect the trickles to occur but not earn credit until you pass the point you were at before. Perhaps you could clarify and/or give the result ID.

There are other possible reasons for not trickling at the moment, due to problems with uploader1.atm.ox, but I do not know whether that is involved.

If you have a backup from before the model rewound itself, that could be useful.

Is it your host 108516? 2 out of 17 completed runs is not ideal, and I see you are overclocking, although you have got a nice low temperature with watercooling. Presumably you have tried stability tests such as those mentioned in <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1998">this thread</a>.

Edit: Les beat me to it.
ID: 9825
old_user2694

Joined: 29 Aug 04
Posts: 2
Credit: 455,787
RAC: 0
Message 9827 - Posted: 23 Feb 2005, 8:32:48 UTC - in response to Message 9824.  

> Whether to continue may well be decided by the stability of the computer.
> Have you run the usual tests?
> <a href="http://www.mersenne.org/freesoft.htm">Prime95</a>
> <a href="http://www.memtest86.com/">Memtest</a>
> <a href="http://www.geocities.com/hjsmithh/Pi/Super_Pi.html">SuperPi</a>

All tests completed several times over; the machine is stable, especially now it is powered by a Tagan 480 W PSU :) It was originally on a 300 W unit, which is not recommended with that P4: it melted a Molex connector! The crashing was just down to software incompatibilities.

> The long finish time might be caused by BOINC getting confused. It's a bit
> buggy in several areas.
>
> > does not trickle
> Do you mean that it doesn't create trickle files in the project directory, or
> that they don't show up on
> your Accounts page? The latter may just be the server waiting until you catch
> up to where you were before the crash.

Nothing is appearing on the stats page, but I will check the machine when I get home, as my wife does not like me remote-controlling the PC while she is on it :)

> Impressive machine, though.
> I've toyed with the idea of pushing my 3.2 Northwood a bit, but it's rock
> solid as it is, power failures and all.
> So I've decided to remain a 'tortoise', and not try to be a 'hare'.

I did have a 3 GHz chip until I found Intel were about to stop making Northwoods, so I managed to find one of the last 3.4 GHz chips in stock in the UK. Mind you, I paid a price for it :(

DarlyB
ID: 9827
