Message boards : Number crunching : Results being sent to multiple hosts now..?
Author | Message |
---|---|
Joined: 5 Aug 04 Posts: 186 Credit: 1,612,182 RAC: 0 |
Looking through the 'workunit' link for a lot of my recent results, e.g. <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=254987">#254987</a>, I noticed that they're all being sent to multiple hosts within a few minutes of each other. This doesn't seem to be normal behaviour for CPDN..? |
Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0 |
<blockquote>Looking through the 'workunit' link for a lot of my recent results, e.g. <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=254987">#254987</a>, I noticed that they're all being sent to multiple hosts within a few minutes of each other - this doesn't seem to be normal behaviour for CPDN..?</blockquote> Hi, well, it alternates upload servers: with BOINC you have to know "a priori", at the workunit generation stage, which upload server to go to, unlike "old CPDN", which assigned an upload server at the very end of a run. |
Joined: 17 Aug 04 Posts: 753 Credit: 9,804,700 RAC: 0 |
<blockquote>Hi well it alternates upload servers, i.e. with BOINC you have to know "a priori" which upload server to go to at the workunit generation stage. Unlike "old CPDN" which assigned an upload server at the very end of a run.</blockquote> It's a very recent change, though. Given the high rate of failure with WUs at the moment, it greatly increases the chance of them being completed, but there is a significant probability of some models being done three or four times. Is this wanted? |
Joined: 5 Aug 04 Posts: 186 Credit: 1,612,182 RAC: 0 |
As Andrew has written, it is a very recent change. Looking through my list of results, prior to September 20th only a single result was sent out, unless the work unit threw a 'computing error'. My last 'single send' work unit was #240740, result #248824. My next work unit, #252730, was sent as results #246909, #246910, #246911 and #246912 between 05:16 and 05:19 UTC on September 27th, all very close together. Looking through those last 10 results, I can't see a single one that is still being processed by multiple machines, though; i.e. they each have only one set of recent trickles, or none. |
Joined: 5 Aug 04 Posts: 178 Credit: 18,765,237 RAC: 43,911 |
This <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=254064">WU</a> is being actively processed by two different hosts ... Supporting <b>BOINC</b>, because it is really a <b>great concept!</b> |
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
<blockquote>As Andrew has written, it is a very recent change. Looking through my list of results, prior to September 20th only a single result was sent out, unless the work unit threw a 'computing error'. My last 'single send' work unit was #240740, result #248824. My next work unit, #252730, was sent as results #246909, #246910, #246911 and #246912 between 05:16 and 05:19 UTC on September 27th, all very close together. Looking through those last 10 results, I can't see a single one that is still being processed by multiple machines, though; i.e. they each have only one set of recent trickles, or none.</blockquote> Hi, Nick. My last five W/Us were each sent out at least four times; the first two of the five were also sent on 27 Sep. The most recent was 10 Oct (W/U #258382). A couple are currently being processed by two machines. Seems it was not a transient phenomenon. Jim. "We have met the enemy and he is us" -- Pogo |
Joined: 26 Aug 04 Posts: 6 Credit: 122,963 RAC: 0 |
I have read the reasons given for sending multiple copies of a work unit. I appreciate the difficulty of making sure that it does not happen, and I appreciate the need to make sure that SOMEONE finishes the WU. But I feel as though I have been wasting my time for the last 2 weeks, working on a WU that was completed by someone else on 13th January (http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=214522). David Hatherly <blockquote>Hi, Nick. My last five W/Us were each sent out at least four times; the first two of the five were also sent on 27 Sep. The most recent was 10 Oct (W/U #258382). A couple are currently being processed by two machines. Seems it was not a transient phenomenon. Jim</blockquote> |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
David, it wasn't an error; it was done deliberately. The CP team are planning to write a paper on the differences they get when sending out the same workunit to different computers. There has been a fair amount of discussion on why different computers produce different results. It seems to be to do with the different maths libraries being used. Your results may well be different from those of other people crunching the same work unit. This does not mean that one is wrong; both are useful. They can probably be considered members of an initial condition ensemble, and larger IC ensembles are wanted. |
Joined: 23 Aug 04 Posts: 49 Credit: 183,611 RAC: 0 |
<blockquote>It wasn't an error; it was done deliberately. The CP team are planning to write a paper on the differences they get when sending out the same workunit to different computers. There has been a fair amount of discussion on why different computers produce different results. It seems to be to do with the different maths libraries being used. Your results may well be different from those of other people crunching the same work unit. This does not mean that one is wrong; both are useful. They can probably be considered members of an initial condition ensemble, and larger IC ensembles are wanted.</blockquote> Exactly. Couldn't have said it better myself. Sylvia, Neil and Andrew Martin are the ones pushing this paper ahead. It's a good test case of using the database (trying to work towards a nice eSciencey sort of interface for scientists). It also tells us something about how machine/math library-dependent the model is, it acts as a sort of IC ensemble (where the results differ), and it allows us to address some fairly common questions we get at various seminars and conferences. It should be an interesting paper. Dave (still at work - trying to get some stuff done for the Exeter conference next week) |
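
For anyone wondering how a maths-library difference could make two hosts return different results for the same work unit, here is a minimal sketch. It is not the CPDN model: it uses the logistic map, a standard toy chaotic system, and the 1e-15 nudge to the starting value is only a hypothetical stand-in for a last-digit rounding difference between libraries. The function name and the `host_a`/`host_b` labels are made up for illustration.

```python
def logistic_trajectory(x0, steps, r=3.9):
    """Iterate the logistic map x -> r * x * (1 - x), a toy chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two "hosts" run the same job. The second start value is nudged by 1e-15,
# a hypothetical stand-in for a rounding difference between maths libraries.
host_a = logistic_trajectory(0.4, 100)
host_b = logistic_trajectory(0.4 + 1e-15, 100)

# Absolute difference between the two runs at every step.
gaps = [abs(p - q) for p, q in zip(host_a, host_b)]

# First step at which the two runs visibly disagree (None if they never do).
first_visible = next((i for i, g in enumerate(gaps) if g > 1e-3), None)

print("gap at step 0:", gaps[0])
print("largest gap over the run:", max(gaps))
print("first step with gap > 1e-3:", first_visible)
```

Both runs are valid solutions of the same equations. They simply drift apart, which is why results that differ between hosts can be treated as extra members of an initial condition ensemble rather than as errors.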