climateprediction.net (CPDN) home page
Thread 'Very weird WU.'

Message boards : Number crunching : Very weird WU.

EclipseHA

Joined: 28 Aug 04
Posts: 42
Credit: 1,443,857
RAC: 0
Message 12303 - Posted: 4 May 2005, 22:21:46 UTC

I have a WU that's 77.25% complete and still in phase 1.

Its "estimated time" was only 10 days when it started, while other WUs normally take 30 days on that machine.

The box is running Red Hat 2.4 and version 4.13 of the cruncher.

Boinc version 4.27.

Anybody else seen one of these short ones?
ID: 12303
Purple Rabbit
Joined: 1 Sep 04
Posts: 23
Credit: 5,543,951
RAC: 2,317
Message 12305 - Posted: 5 May 2005, 2:56:28 UTC - in response to Message 12303.  

> The box is running redhat 2.4, and 4.13 of the cruncher.
>
> Boinc version 4.27.
>
> Anybody else seen one of these short ones?

I've seen this several times on BOINC 4.3, SUSE 9.3. It's not really a short WU. It looks like the initial estimated time is only for phase 1. It adjusts when phase 2 starts and the estimated time and s/TS are closer to normal.

I haven't seen this behavior in Windows so I assume it's a Linux thing.

Rick
ID: 12305
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 12306 - Posted: 5 May 2005, 2:58:00 UTC - in response to Message 12303.  

I have a few of those, like 48.7% complete with 10 Trickles and still in Phase 1.
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
ID: 12306
deprecated
Joined: 8 Aug 04
Posts: 21
Credit: 5,536,868
RAC: 2,725
Message 12329 - Posted: 6 May 2005, 4:53:21 UTC
Last modified: 6 May 2005, 4:59:28 UTC

Sounds very much like my scenario from here: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2482&PHPSESSID=e2e11cab561f92218b519073f9febbfe


ID: 12329
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 12330 - Posted: 6 May 2005, 8:57:17 UTC - in response to Message 12329.  

I've looked at my two models that are furthest along, running BOINC V4.19 and hadsm3 V4.13:

41.4% complete : Phase 1, Trickle 9
51.4% complete : Phase 1, Trickle 11

So do I understand correctly that the BOINC V4.19 progress report is mistakenly counting only Phase 1, and that BOINC will therefore download an additional model shortly before each one reaches Phase 2 (nearly 100% indicated, but only 33% actual completion)?
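
If that reading is right, the numbers in this thread fit together with simple arithmetic. Here is a minimal sketch in Python. It assumes the model has three phases of equal length, which is implied by the 33% figure and by the 10-day versus 30-day estimates but is not confirmed here. The function names are made up for illustration.

```python
def overall_fraction(phase, phase_fraction, n_phases=3):
    """Convert a per-phase progress reading into whole-model progress.

    Assumes n_phases equal-length phases (3 is an assumption, not a
    value confirmed in this thread).
    """
    if not 1 <= phase <= n_phases:
        raise ValueError("phase must be between 1 and n_phases")
    if not 0.0 <= phase_fraction <= 1.0:
        raise ValueError("phase_fraction must be between 0 and 1")
    return (phase - 1 + phase_fraction) / n_phases


def whole_model_estimate(phase_estimate_days, n_phases=3):
    """Scale a one-phase time estimate up to the whole model."""
    return phase_estimate_days * n_phases


# 51.4% shown while still in Phase 1 would be about 17.1% of the whole model.
print(f"{overall_fraction(1, 0.514):.1%}")

# A 10-day estimate that covers only Phase 1 scales to 30 days overall.
print(whole_model_estimate(10))
```

Under the same assumption, a reading of nearly 100% in Phase 1 corresponds to roughly one third of the model, which would explain why the client thinks it is time to fetch more work.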
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
ID: 12330
old_user5994

Joined: 31 Aug 04
Posts: 239
Credit: 2,933,299
RAC: 0
Message 12331 - Posted: 6 May 2005, 9:53:11 UTC - in response to Message 12330.  

> So do I understand correctly, that the BOINC V4.19 Progress report is
> accidentally counting only for Phase 1, and that BOINC will therefor download
> an additional Model each shortly prior Phase 2 (nearing 100% indicated, but
> only 33% actual completion) ?

Well, I have been using some of the later versions of BOINC and have not seen that at all. Nor did I see it while running 4.19. So maybe there is something else going on, I think ... :)

It does seem odd that it is happening to you now though.
ID: 12331
Purple Rabbit
Joined: 1 Sep 04
Posts: 23
Credit: 5,543,951
RAC: 2,317
Message 12334 - Posted: 6 May 2005, 13:36:53 UTC - in response to Message 12330.  
Last modified: 6 May 2005, 14:00:39 UTC

I can't tell if my models wanted to download another one after Phase 1. I have BOINC set to not download more CPDN models (depleting). I saw the strange behavior and didn't want to put more models at risk. Unfortunately you can't do this with BOINC 4.19.

The one model that made it to Phase 2 (I killed the others upgrading to SUSE 9.3) transitioned properly except the estimated time went to some time in 2016 and the s/TS were very slow. This slowly came back to normal in a few days. I can't say more because the upgrade got this one too (sigh). My backups didn't work.
ID: 12334
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 12340 - Posted: 6 May 2005, 15:50:12 UTC - in response to Message 12334.  
Last modified: 6 May 2005, 16:39:57 UTC

Another oddity :

I have noticed that Trickles are being granted considerably more credit than expected (anywhere between 150 and 190).

Newest example (easier to track, since this host was newly set up):
Host 163609 Details: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=163609
Host 163609 Results: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=163609

1 Trickle complete, should have 94.52 Credits.
But :
Total Credit reads : 189.04 (double)
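
The "double" reading can be checked with a quick calculation. This is only an illustrative sketch in Python: the function name is hypothetical, and the figures are the ones quoted in this post.

```python
import math


def credit_multiple(total_credit, credit_per_trickle, trickles):
    """Return how many times the expected credit the host total represents."""
    expected = credit_per_trickle * trickles
    return total_credit / expected


# One trickle at 94.52 credits against a displayed total of 189.04.
ratio = credit_multiple(189.04, 94.52, 1)

# A ratio of 2 means the single trickle was counted twice in the total.
print(math.isclose(ratio, 2.0, rel_tol=1e-9))
```

A ratio that comes out as a whole number points to the same trickle being credited more than once, rather than to a different per-trickle rate.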

-=-=-=-=-=-=-=-=-=-
Since I did not change any BOINC Clients (V4.19 for all Linux Systems), I can safely say that the recent, problematic behaviour started with hadsm3 V4.11.

I remember when my first Linux Box downloaded the New Application and I was wondering about the ~60% faster estimated completion time.
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
ID: 12340
[B^S] mavau

Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 12342 - Posted: 6 May 2005, 17:21:03 UTC

Just checked your links. Total credit is back to normal.
Just as a guess, your trickle might have been sent to two servers at the same time.
I wonder if the problem is related to the doubling of credits on model completion.
It would be nice if we had more info on server status, etc.


ID: 12342
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 12346 - Posted: 6 May 2005, 18:05:34 UTC - in response to Message 12342.  
Last modified: 7 May 2005, 10:22:25 UTC

(?)

I just re-checked, and on the Host and Computer Overview pages the surplus credit is still displayed unchanged (189.04), while only the single Trickle (94.52) has been accounted for on the Results page...
I reloaded both Pages to avoid getting a cached readout, no change.

By the way, two servers?
I was under the impression that there was only one primary Server the BOINC Client talks to.

--- edit ---
Checked again today, and now the Credit is back to normal :)
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
ID: 12346
EclipseHA

Joined: 28 Aug 04
Posts: 42
Credit: 1,443,857
RAC: 0
Message 12392 - Posted: 7 May 2005, 22:54:11 UTC
Last modified: 7 May 2005, 22:55:56 UTC

Well, phase 1 completed in a bit over 11 days.

The percentage went back to zero when phase 2 started, and the really funny thing was that the "to completion" time was 150,000 days when it started! (It's since dropped to under 1000.) I wonder if this is tied to v4.13 of the cruncher?
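
One possible explanation for the huge number, offered only as a guess: if the remaining time is extrapolated from elapsed time and fraction done, then resetting the fraction to near zero while the elapsed time carries over makes the estimate blow up. The sketch below in Python assumes that formula. It is not taken from BOINC's source, and the function name and sample fractions are made up for illustration.

```python
def naive_remaining_days(elapsed_days, fraction_done):
    """Extrapolate remaining time as elapsed * (1 - f) / f.

    This formula is an assumption about how the estimate might be
    computed, not something confirmed in this thread.
    """
    if fraction_done <= 0.0:
        return float("inf")  # no progress yet, so no finite estimate
    return elapsed_days * (1.0 - fraction_done) / fraction_done


# About 11 days elapsed (phase 1), with progress reset at the start of phase 2.
for fraction in (0.0001, 0.001, 0.01):
    print(fraction, round(naive_remaining_days(11, fraction)))
```

With this kind of extrapolation the estimate starts out enormous and shrinks quickly as the percentage climbs again, which at least matches the drop from 150,000 days to under 1000.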

It does seem to be something tied to the WU or the Linux cruncher, as I have a second box running the same base software (same Linux, same BOINC) but with v4.12 for crunching. It's running normally.

Oh.. The "weird WU" also had a new WU downloaded when it was about 98% done (with phase 1)
ID: 12392
