climateprediction.net home page
sec/tstep increase

old_user16598
Joined: 11 Sep 04
Posts: 9
Credit: 321,368
RAC: 0
Message 6791 - Posted: 9 Dec 2004, 4:57:12 UTC
Last modified: 9 Dec 2004, 4:57:30 UTC

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_user.php?userid=16598

My Athlon 64 system has been cranking out results for many months at around 2.00 to 2.10 sec/ts. Today I am seeing it jump to 30 sec/ts, then 16, then 11. What would cause this? Should I be worried?

Keith

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2185
Credit: 64,822,615
RAC: 5,275
Message 6792 - Posted: 9 Dec 2004, 5:29:00 UTC - in response to Message 6791.  
Last modified: 9 Dec 2004, 5:29:40 UTC

> http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_user.php?userid=16598
>
> My Athlon 64 system has been cranking out results for many months at around
> 2.00 to 2.10 sec/ts. Today I am seeing it jump to 30 sec/ts, then 16, then
> 11. What would cause this? Should I be worried?
>
> Keith
>
Looking at the trickles at:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=29407

It looks like one model (407692) bombed out with an error after 7 trickles. The client then downloaded another model, yet somehow internally kept the start time from when you started 407692, so its sec/ts calculation is screwed up. It shouldn't be a problem.
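To illustrate why a stale start time would inflate the figure and then fade, here is a minimal sketch. It assumes sec/ts is simply elapsed wall-clock time divided by timesteps completed; the function name, the 10,000 timesteps per trickle, and the size of the stale offset are all hypothetical values chosen for illustration, not taken from the CPDN client.

```python
def reported_sec_per_ts(true_rate, steps_done, stale_seconds=0.0):
    """Average sec/timestep when the recorded start time is stale_seconds too early.

    Assumption: the figure is elapsed time divided by timesteps completed.
    """
    elapsed = stale_seconds + true_rate * steps_done
    return elapsed / steps_done


TRUE_RATE = 2.05            # sec/ts the machine really achieves
STEPS_PER_TRICKLE = 10_000  # hypothetical trickle interval
STALE_SECONDS = 279_500.0   # hypothetical time inherited from the failed model

for trickle in (1, 2, 3, 10, 50):
    steps = trickle * STEPS_PER_TRICKLE
    rate = reported_sec_per_ts(TRUE_RATE, steps, STALE_SECONDS)
    print(f"trickle {trickle:2d}: {rate:.2f} sec/ts")
```

With these made-up numbers the first three trickles report roughly 30, 16, and 11.4 sec/ts, the same shape as the jump described above, and the figure keeps drifting back toward the true rate as more timesteps accumulate.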

old_user16598
Joined: 11 Sep 04
Posts: 9
Credit: 321,368
RAC: 0
Message 6797 - Posted: 9 Dec 2004, 8:28:44 UTC

What would cause it to "Bomb out"? Hardware problem or just a software error?

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2185
Credit: 64,822,615
RAC: 5,275
Message 6803 - Posted: 9 Dec 2004, 13:39:37 UTC - in response to Message 6797.  

> What would cause it to "Bomb out"? Hardware problem or just a software error?
>
Difficult to say, but it is possibly a hardware problem (e.g. the processor running too hot or without enough voltage, or the RAM running on too-tight timings or without enough voltage). Are you overclocking? It looks like that PC has completed two runs and errored on three others.

old_user16598
Joined: 11 Sep 04
Posts: 9
Credit: 321,368
RAC: 0
Message 6816 - Posted: 9 Dec 2004, 20:04:09 UTC - in response to Message 6803.  

That one is slightly overclocked, but I ran Prime 95 without errors and haven't had any strange behavior from it. The funny thing is that it looks like every machine I have, most of which aren't overclocked at all, has lots of client errors.

Keith

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2185
Credit: 64,822,615
RAC: 5,275
Message 6824 - Posted: 9 Dec 2004, 21:31:10 UTC - in response to Message 6816.  

> That one is slightly overclocked, but I ran Prime 95 without errors and haven't
> had any strange behavior from it. The funny thing is that it looks like every
> machine I have, most of which aren't overclocked at all, has lots of client
> errors.
>
> Keith
>
If you've done the basic hardware maintenance from this thread

http://www.climateprediction.net/board/viewtopic.php?t=2124

and hardware specific tests from this thread

http://www.climateprediction.net/board/viewtopic.php?t=2126

successfully, then I don't know. Other problems can come about if you don't suspend and exit BOINC before rebooting, although they don't seem to be the -5 errors.

old_user16598
Joined: 11 Sep 04
Posts: 9
Credit: 321,368
RAC: 0
Message 6829 - Posted: 9 Dec 2004, 23:25:39 UTC - in response to Message 6824.  

It looks like the WU that had errors on mine also had errors on other computers when it was sent to them. Could it be that the WU was bad or flawed? Or that BOINC was having a problem or glitch?

Keith

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2185
Credit: 64,822,615
RAC: 5,275
Message 6832 - Posted: 10 Dec 2004, 3:43:34 UTC - in response to Message 6829.  

> It looks like the WU that had errors on mine also had errors on other computers
> when it was sent to them. Could it be that the WU was bad or flawed? Or
> that BOINC was having a problem or glitch?
>
> Keith
>
It's possible. On the other hand, two work units that I got on my PCs were sent out in early October, each to 3 other PCs. Only one other result was returned besides mine, i.e. 3 out of 8 were successful and the other 5 returned errors. I went through and checked about 300 models sent out in early October. Each of these was sent to 4 different PCs. About half weren't completed by anyone, about 3/8 were completed by only one person (although some still had other PCs crunching on the work unit), and about 1/8 were completed by more than one person. Of those, only 2 were completed by 3 people, and none by all 4. So while it's possible that the models would go unstable due to different parameter sets, the greater likelihood is that many people are having hardware or software configuration problems.
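As a rough cross-check of those fractions, here is a minimal sketch that treats each of the 4 PCs as completing a model independently with the same probability. That independence assumption is introduced here purely for illustration (it is not something measured in this thread), and `binomial_pmf` is a helper name made up for the sketch. The per-PC probability is fitted to the "about half completed by no one" figure, and the other two fractions are then predicted.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_HOSTS = 4          # each model was sent to 4 PCs
FRACTION_NONE = 0.5  # about half the models had no completions

# Fit the per-PC completion probability so that P(0 completions) = 0.5.
p_complete = 1 - FRACTION_NONE ** (1 / N_HOSTS)

dist = [binomial_pmf(k, N_HOSTS, p_complete) for k in range(N_HOSTS + 1)]
p_exactly_one = dist[1]
p_more_than_one = sum(dist[2:])

print(f"per-PC completion probability: {p_complete:.3f}")
print(f"exactly one completion:        {p_exactly_one:.3f}")
print(f"more than one completion:      {p_more_than_one:.3f}")
```

Under that assumption the per-PC completion probability comes out near 0.16, with about 0.38 of models completed by exactly one PC and about 0.12 by more than one, close to the 3/8 and 1/8 counted above. That is consistent with failures depending mostly on the individual PC rather than on the model, though it does not prove it, and the counts were incomplete because some PCs were still crunching.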

old_user16598
Joined: 11 Sep 04
Posts: 9
Credit: 321,368
RAC: 0
Message 6833 - Posted: 10 Dec 2004, 4:30:12 UTC - in response to Message 6832.  

Well, I generally run the Prime 95 torture test for about 4 hours and then consider the system stable. I also play a lot of games and figure that if they don't crash or have problems then I am most likely OK. I am by nature, though, the type of person who is driven crazy by thinking my system might be causing the errors. Guess I shouldn't be overclocking. :) Anyway, I am going to try running Prime 95 for 24 hours on my other system that has had lots of client errors and see what that shows.

Keith

Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 6836 - Posted: 10 Dec 2004, 7:59:03 UTC

Worriers will love(?) UK_Nick's testing suggestions at the top of this page on the community forum: http://www.climateprediction.net/board/viewforum.php?f=23. They were written for CPDN classic, though, so there are no references to BOINC.

old_user156
Joined: 5 Aug 04
Posts: 186
Credit: 1,612,182
RAC: 0
Message 6878 - Posted: 12 Dec 2004, 7:07:15 UTC
Last modified: 12 Dec 2004, 7:07:24 UTC

I'm currently following my own advice with 'Dilly'. She's upchucked a whole bunch of models these last few days, so she's running her latest model along with Prime95 at priority 4, so that they each use 50% of the CPU...