Thread '155,553 days to completion.....advice needed please?'

old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14402 - Posted: 15 Jul 2005, 15:20:44 UTC
Last modified: 15 Jul 2005, 15:23:23 UTC

Hi
Hmmmm..... I started phase 3, I think, of my first model, and BoincView projected that huge number of 155,553 days to completion. P3 Linux 3.0 HT. After an hour or so this has reduced to 1967 days but seems to be stuck about there at 228 s/TS. Of course, a missed deadline is being notified to me. What do I do if it does not reduce to something within the timescale? Shame to dump it, but equally, if it will not be accepted, why waste more CPU time on it? Advice please?

On this basis my little ole laptop running another model will finish just after the earth ends I think.

Thanks.
ID: 14402
Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 14403 - Posted: 15 Jul 2005, 15:35:56 UTC - in response to Message 14402.  

Are you able to view the graphics? Does it all look normal, or is the Earth turning to ice or hot desert?
ID: 14403
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 14404 - Posted: 15 Jul 2005, 15:36:44 UTC - in response to Message 14402.  

Hmmm. Typically when this happens, some hiccup (a rather serious hiccup) occurred where the model restarted at TS 0 of phase 1, but did not reset the runtime, i.e. it kept the amount of time run, but went back to the start of the model.

I guess the thing to do is actually see via the visualization what phase you are in. If you are in phase 3, continue running and it will finish in a reasonable amount of time, like the other phases. If you are indeed at the beginning of phase 1, you have a more difficult decision since if you continue running it, you will get no additional credits until you get into phase 3. If this is not acceptable, you can do a project reset and get a new model.

No matter how you look at it, the estimated time to completion is just way off and inaccurate because of this hiccup.
ID: 14404
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14405 - Posted: 15 Jul 2005, 15:42:14 UTC

Ah, thanks. I do not have a visual means of seeing what's happening. The number of days has reduced a little further, but not much. I guess I hang on a while and see..... but will dump it if necessary. Shame though if I do have to do that after nearly 15 days of CPU time. Thanks for coming back to me though.
ID: 14405
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14410 - Posted: 15 Jul 2005, 17:25:57 UTC

I do have restart.end1 and .end2 files in datain though. Presumably that gives a clue to the phase?
ID: 14410
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14411 - Posted: 15 Jul 2005, 17:47:27 UTC - in response to Message 14410.  

> I do have restart.end1 and .end2 files in datain though. Presumably that
> gives a clue to the phase?

Not really. As you can't look at the graphics display you'll need to check the file 076i_000014237.xml in your projects/climateprediction.net directory. The <PH> and <TS> tags contain the phase and timestep at the last checkpoint.
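
A minimal Python sketch of that check, assuming the file stores the values as plain `<PH>` and `<TS>` elements holding integers. The exact layout of the file is not shown in this thread, so the sample text and the helper name `read_checkpoint` are hypothetical:

```python
import re
from typing import Optional, Tuple


def read_checkpoint(xml_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (phase, timestep) found in the text, using None for a missing tag."""

    def tag_value(tag: str) -> Optional[int]:
        # Assumes each tag holds a plain integer, e.g. <PH>3</PH>.
        match = re.search(rf"<{tag}>\s*(\d+)\s*</{tag}>", xml_text)
        return int(match.group(1)) if match else None

    return tag_value("PH"), tag_value("TS")


# Hypothetical file content; the wrapper element name is made up.
sample = "<state><PH>3</PH><TS>5329</TS></state>"
phase, timestep = read_checkpoint(sample)
print(f"phase {phase}, timestep {timestep}")
```

To use it on a real file, read the file's text first and pass it in; a regular expression is used here only because the surrounding structure of the file is unknown.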
ID: 14411
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14412 - Posted: 15 Jul 2005, 18:01:44 UTC

Ah. Well they say
3
5329
So that means I am in phase 3 then yes?
ID: 14412
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 14414 - Posted: 15 Jul 2005, 18:16:41 UTC - in response to Message 14410.  
Last modified: 15 Jul 2005, 18:17:50 UTC

> I do have restart.end1 and .end2 files in datain though. Presumably that
> gives a clue to the phase?

Not necessarily. They could be left over from before the glitch. You can check which files were written to most recently in the

BOINCfoldername/projects/climateprediction.net/experimentname/dataout directory

If some of the most recent files (by timestamp associated with the last write to the file) have a p1 or p2 preceding c10 in the filenames, then you are in the 3rd phase. If they have 11 or 12 preceding c10, you are in the 1st phase.

Edit...looks like Thyme gave you an easier way to check.
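
The filename rule above can be sketched in Python as follows. This is only an illustration of the rule as stated in this post: the function names are hypothetical, and nothing is assumed about how phase 2 files are named, so anything that does not match returns `None`.

```python
from pathlib import Path
from typing import Optional


def phase_from_filename(name: str) -> Optional[int]:
    """Guess the phase from the two characters that precede 'c10' in a filename."""
    idx = name.rfind("c10")
    if idx < 2:
        return None  # no 'c10', or nothing in front of it
    marker = name[idx - 2:idx]
    if marker in ("p1", "p2"):
        return 3  # rule from the post: p1 or p2 before c10 means phase 3
    if marker in ("11", "12"):
        return 1  # rule from the post: 11 or 12 before c10 means phase 1
    return None  # the post does not say how other files are named


def newest_file(directory: str) -> Optional[Path]:
    """Return the most recently modified regular file in a directory, if any."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

Point `newest_file` at your own dataout directory and pass the resulting file's name to `phase_from_filename`.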
ID: 14414
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14416 - Posted: 15 Jul 2005, 18:34:05 UTC - in response to Message 14414.  
Last modified: 15 Jul 2005, 18:37:46 UTC

> > I do have restart.end1 and .end2 files in datain though. Presumably that
> > gives a clue to the phase?
>
> Not necessarily. They could be left over from before the glitch. You can
> check what the last files written to are in the
>
> BOINCfoldername/projects/climateprediction.net/experimentname/dataout
> directory
>
> If some of the most recent files (by timestamp associated with the last write
> to the file) have a p1 or p2 preceding c10 in the filenames, then you are in
> the 3rd phase. If they have 11 or 12 preceding c10, you are in the 1st
> phase.
>
> Edit...looks like Thyme gave you an easier way to check.

Ah, so I have the latest file as 074dca.php1c10. Hmmmmmm... I guess I am in phase 3.

Thanks all for your help much appreciated!
ID: 14416
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14418 - Posted: 15 Jul 2005, 19:47:38 UTC

Hmmm. Has your computer got HT enabled? I am trying to work out why 9 of 10 last trickles (over 6 days) for that host were result 918683 then a switch to 918761.
_______________________________
Visit <a href="http://boinc-doc.net/boinc-wiki/index.php?title=Main_Page">BOINC WIKI</a> for help

And join <a href="http://www.boincsynergy.com/">BOINC Synergy</a> for all the news in one place.
ID: 14418
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14427 - Posted: 15 Jul 2005, 23:20:02 UTC - in response to Message 14418.  

> Hmmm. Has your computer got HT enabled? I am trying to work out why 9 of 10
> last trickles (over 6 days) for that host were result 918683 then a switch to
> 918761.

Registered as a P4 with 2 CPUs. 918761 has trickled 8 times since being downloaded on 6th June, so I guess 918683 is pretty much locked into 1 of the virtual CPUs and 918761 is swapping with other projects on the other one.
ID: 14427
EclipseHA
Joined: 28 Aug 04
Posts: 42
Credit: 1,443,857
RAC: 0
Message 14429 - Posted: 15 Jul 2005, 23:40:46 UTC
Last modified: 15 Jul 2005, 23:43:42 UTC

The Linux cruncher resets % done to 0 at the end of each phase. So when you start phase three you are at something like .01% done but have already consumed a bunch of CPU time (everything spent on phases 1 and 2), and you'll get a really outrageous time estimate. It will slowly correct itself back to something within reason (though it is never quite "right").

With a CC in the 4.4x range, I'd bet that you'll do nothing but CP for the next few days.

This is a regular occurrence with the 4.13 CP cruncher and there's no need to worry. I must have finished 5 WUs like this already.
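
To see why the number blows up, here is a rough Python sketch. It assumes the estimate is a simple linear extrapolation from CPU time used and percent done; that is an assumption for illustration, not something confirmed in this thread, and the function name is hypothetical. The inputs are the "nearly 15 days" of CPU time and the ".01% done" mentioned in this thread.

```python
def naive_days_remaining(cpu_days_so_far: float, percent_done: float) -> float:
    """Linear extrapolation: remaining = used * (fraction left / fraction done)."""
    if percent_done <= 0:
        return float("inf")  # nothing done yet, so the estimate is unbounded
    return cpu_days_so_far * (100.0 - percent_done) / percent_done


# About 15 days of CPU time already used, but percent done reset to 0.01%.
print(round(naive_days_remaining(15, 0.01)))  # on the order of 150,000 days
```

Under that assumption the estimate lands in the same range as the 155,553 days reported at the top of the thread, and it shrinks quickly as percent done climbs.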
ID: 14429
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14432 - Posted: 16 Jul 2005, 8:01:07 UTC - in response to Message 14418.  
Last modified: 16 Jul 2005, 8:06:50 UTC

> Hmmm. Has your computer got HT enabled? I am trying to work out why 9 of 10
> last trickles (over 6 days) for that host were result 918683 then a switch to
> 918761.

Yes, 3.0 HT enabled and runs fine. Of the two models there, one always gets lots of CPU and the other very little. I did wonder if there was CPU affinity. Anyway, I've got one model in phase 1 and the other in phase 3. This is a similar story to another box I have with Win 2003 Server. It too has one model well in advance of the other.
ID: 14432
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14435 - Posted: 16 Jul 2005, 9:44:07 UTC
Last modified: 16 Jul 2005, 9:47:44 UTC

OK, it has trickled now so it is getting CPU time. As others have said, ignore the time to completion.

Or work it out yourself as (100 - percent done) * 2592.48 * sec/TS (2.45 in this case) / 60 / 60 = CPU hours to completion of phase.

(This formula is only for Linux users, where the percent done is of the phase, not the whole model.)

Edit: That may well be too optimistic; it appears the models may have switched processors, so you may need to use 3.2 rather than 2.45.
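
The formula above, written out as a small Python sketch. The function and parameter names are hypothetical; the constant 2592.48 is taken straight from the formula (it looks like timesteps per 1% of a phase, but that reading is an assumption), and 40% done is just an example input.

```python
def cpu_hours_to_phase_end(percent_done: float, sec_per_ts: float,
                           ts_per_percent: float = 2592.48) -> float:
    """CPU hours left in the current phase, per the formula in this post."""
    remaining_timesteps = (100.0 - percent_done) * ts_per_percent
    return remaining_timesteps * sec_per_ts / 60 / 60


# Example: 40% of the phase done (made-up value), at the two rates mentioned.
for rate in (2.45, 3.2):
    hours = cpu_hours_to_phase_end(40.0, rate)
    print(f"{rate} s/TS -> {hours:.1f} CPU hours left in this phase")
```

Remember that percent done must be the percentage of the current phase, as shown by the Linux cruncher, for this to make sense.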
ID: 14435
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14436 - Posted: 16 Jul 2005, 10:06:23 UTC - in response to Message 14435.  

> OK it has trickled now so it is getting CPU time. As others have said ignore
> the time to completion.
>
> Or work it out yourself as (100 - percent done) * 2592.48 * sec/TS (2.45 in
> this case) /60 /60 = CPU hours to completion of phase.
>
> (This formula is only for linux users where the percent done is of the phase
> not the whole model.)
>
> Edit: That may well be too optimistic, it appears the models may have switched
> processors so you may need to use 3.2 rather than 2.45.

Ok thanks I will do that. Again thanks for your help and insights to this.
ID: 14436
EclipseHA
Joined: 28 Aug 04
Posts: 42
Credit: 1,443,857
RAC: 0
Message 14461 - Posted: 17 Jul 2005, 1:20:07 UTC

Anybody know when a cruncher that doesn't reset to 0% done for each phase (for Linux) might be released?

Anybody with Linux and the CC 4.43 could be experiencing very odd behaviour until then (if running more than one project).
ID: 14461
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14466 - Posted: 17 Jul 2005, 8:58:53 UTC

No I am not aware of that....would be nice though.

On my problem though the secs/TS has reduced massively now over 36 hours or so. Now at 13.39 s/TS and model projected completion is 120 days - down from 155000. It still seems to be falling slowly.....

I say this here now just to let folk know their advice was good, and so that others who might come along later with the same question have the benefit of the whole story, as it were. Saves them asking in a panic like I did, perhaps.

Again thanks to all you folks for helping me.
ID: 14466
old_user5994
Joined: 31 Aug 04
Posts: 239
Credit: 2,933,299
RAC: 0
Message 14468 - Posted: 17 Jul 2005, 12:47:54 UTC

Chris,

Is this something that can be summarized in the Wiki's CPDN FAQ? I am still too messed up to distill what went on here...
ID: 14468
old_user59948
Joined: 3 Mar 05
Posts: 76
Credit: 127,896
RAC: 0
Message 14471 - Posted: 17 Jul 2005, 13:46:08 UTC

Hmmmm. May be.

It's about projected finish times when new phases start. They start really big and then get smaller. Equally, the secs per timestep start big and gradually get more realistic. But as this was my first model I was not sure what was happening. If that's of interest, please help yourself.
ID: 14471
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14472 - Posted: 17 Jul 2005, 14:32:21 UTC
Last modified: 17 Jul 2005, 14:34:36 UTC

I have put a section in the Climateprediction FAQ:
<a href="http://boinc-doc.net/boinc-wiki/index.php?title=Climateprediction_FAQ#Why_have_I_got_a_crazy_time_to_completion.3F">crazy_time_to_completion</a>.

Also, category headings are done at long last. I would like to add something in the graphics compatibility section about ATI and Radeon cards, but I thought I would wait for the version number(s).

There are probably lots of other improvements that could be made. Let me know.
ID: 14472