Announcement: Database residual problem - misallocated WUs
Author | Message |
---|---|
Joined: 29 Aug 04 Posts: 4 Credit: 125,007 RAC: 0 |
> > host id 165678, work id 47911, result id 719682, work unit 26sp_300123158_0. This is still running but result indicates done with client error. > That one's not a problem Allan. <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/trickle.php?resultid=719682">That result</a> is registered to your host and the other system that was running it no longer exists (I guess the owner must have merged it). And there's no need to worry about losing credits because of the first 46 trickles being sent by the other system as you get the credits appropriate for your most recent trickle. The other system was mine, I merged them when I discovered I had two identical hosts. |
Joined: 4 Nov 04 Posts: 16 Credit: 11,577,003 RAC: 0 |
Well, this is what happened with <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=69015">host 69015</a> (see posts above in this same thread): Calculations for result <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=786821">786821</a> were completed, and from the look of it here, the upload went up without any problem. However, the plots for phase 3 do not appear in the result page. No science information seems to be lost as the <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/trickle.php?resultid=786821">trickles</a> do point to the host doing the crunching (69015). If the science team ever wants to look at the result they can find the right host there. 69015 is now crunching the next WU allocated (but the previous one is still 'in progress' as I reported above, even if it is, in fact, completed). Interestingly enough, now I find myself also at the opposite end of the problem. My host <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=129362">129362</a> got result <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=856358">856358</a> last night, and if you look at the <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/trickle.php?resultid=856358">trickles for 856358</a> you can see how this result appears to have been sent already to host <a>55941</a>, which is not mine. I ended up with a trickle's worth of credit right away as that host had already uploaded one. So now I am considering temporarily suspending result 856358 to see if the other host keeps at it (my computer won't be idle meanwhile, as it is a multiprocessor machine). Does anyone know if sending a STOP signal to hadsm3um has any adverse effect? All this was running Linux, BOINC 4.19 and HADSM 4.13, by the way. Cheers, LS |
Joined: 31 Aug 04 Posts: 2 Credit: 225,332 RAC: 0 |
Hi guys, same problem here. Result: 858515 Assigned host: 169983 calculating host: 25273 (mine) Have done 4 trickles already and now suspended the WU. I'm wondering what to do now as I don't want to crunch 100% seti for too long ;) Regards, MrSpadge |
Joined: 6 Sep 04 Posts: 6 Credit: 195,123 RAC: 0 |
I think I have one of these: WU: 565036, Work Unit name: 2u1q_300153589_1, My Host ID: 22439, Host ID identified on results page: 48533. Completed 6 trickles; the computer was showing that it was working on TS 75001 when I checked seconds ago, so a 7th trickle will come through shortly. If the crunching my computer is doing will actually be valuable, that is more important than getting the credit, and I will just let it keep going. Thanks for all your work. Lornix |
Joined: 4 Sep 04 Posts: 14 Credit: 468,276 RAC: 0 |
Here is my solution to the problem: do a reset of the project... and all is well, WUs are registered and trickling... case closed. Why, you may ask? Well, I'm using 4.19, so I deleted the unregistered WU as instructed in the forum, only for it to be replaced by another unregistered WU, compounding the problem... Since resources to fix the problem are scarce and there is no guarantee that the completed WU will be uploaded and saved, I took my losses and reset the project :) |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
I added some more information to the first post. The last suggestion is very messy and people may well prefer to follow 'bosh's last post which is what I was hinting at by saying "If you have done less than a couple of hours of work on the WU then it is easiest and safest to just abort the run." (Didn't want to sound too dismissive of people's work.) Sorry I cannot give an estimate of when Tolu may be able to look at and consider the possibility of fixing the problem. |
Joined: 28 Aug 04 Posts: 90 Credit: 2,736,552 RAC: 0 |
> If there is work done but not by you, please report your host id, the ResultID and name, and the host number that has done the work. Hi, WU-Name: 2mpa_300143975_1 (<a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=555422">555422</a>) Result-ID: <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=857093">857093</a> Working on this unit: <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=5957">Host 5957</a> Shown in resultlist of (assigned to): <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=86867">Host 86867</a> Ciao |
Joined: 26 Aug 04 Posts: 2 Credit: 327,277 RAC: 0 |
Result ID: 817970 Workunit ID: 532100 Host ID: 163416 Workunit is currently 66% complete and not in my list of results for this host. |
Joined: 14 Aug 04 Posts: 13 Credit: 1,231,931 RAC: 0 |
name: 3puk_300195210, my Host ID: 161068 It had not trickled yet and I deleted it. |
Joined: 4 Sep 04 Posts: 14 Credit: 468,276 RAC: 0 |
Just to provide a bit of an update to my previous post… A reset on my second PC, v4.19, Host ID 21990, did not produce the same results: the WU was initially reported as "unsent" under "Server State" and then changed to Host ID 166993 (not mine), so this time I deleted the WU, and finally CPDN gave me a WU registered to me in the "Results for Hosts". So in conclusion, it seems to be a random hit and miss... but the same result can be achieved by deleting the WU repeatedly (a less drastic measure) until a satisfactory outcome is achieved. And perhaps most of you "geeks" already knew this, but I sure did not, so with apology… :) PS. On the bright side, after the reset, my Host ID remained the same... |
Joined: 31 Aug 04 Posts: 2 Credit: 225,332 RAC: 0 |
Guys, this starts to suck a bit. Aborted the wrong WU and got another wrong one: Result: 875418 Assigned host: 171513 Calculating host: 25273 Regards, MrS |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
As 'bosh said you may have to abort/reset a few times to get an unaffected WU. |
Joined: 11 Sep 04 Posts: 12 Credit: 74,234 RAC: 0 |
Host 30823: ResultID 870178 (WorkUnitID 572867), state: aborted; ResultID 841692 (WorkUnitID 559944), state: not downloaded; ResultID 841691 (WorkUnitID 559943), state: not downloaded |
Joined: 28 Aug 04 Posts: 69 Credit: 260,395 RAC: 0 |
|
Joined: 9 Aug 04 Posts: 25 Credit: 4,756,979 RAC: 0 |
I just got some WU's on a couple of machines, but I don't see them listed in my "results" page (yet). Is there a delay before they show up? ----- Actually, two results showed up for one of my machines, but they are different than the ones I got. The other machine's WU's did not show up. I'm thinking I will have to abort them all and try again. ----- These are the problem WU's: Host: 6415 - Result ID: 880949 - Name: 3y7o_100206157 (not on machine) Host: 6415 - Result ID: 880941 - Name: 3y7g_100206149_0 (not on machine) Host: 6415 - Name: 3zvo_100208339_0 (not in results page) - aborting Host: 6415 - Name: 3zx7_100208394_0 (not in results page) - aborting Host: 1113 - Name: 3zx1_100208388_0 (not in results page) - aborting Host: 1113 - Name: 3xv6_100205703_0 (not in results page) - aborting ----- Interesting note: Results 880941 and 880949 are listed as being sent to me at an earlier time than I got the other four units. ----- Update: After aborting the 4 unlisted WU's, I got 1 WU per machine that DID show up on my results page. Whew! |
Joined: 21 Oct 04 Posts: 24 Credit: 207,633 RAC: 0 |
Has anybody noticed that somebody else got the credits for the trickles? For example, Legoman: did you obtain the credits for the 10 trickles from <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=553146">WU 553146</a>, also named <b>2kyo_300141699</b>? Sorry, but I got the 900 credits [EDIT] meanwhile they are at 1600 credits [/EDIT]. I don't know why, and I don't want to flame anybody, so no offense! greetz from Switzerland littleBouncer |
Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0 |
Now I've got one of those too : 2y76_300159022_0 with resultid=853050 should be attached to hostid=142111 but it is somehow attached to hostid=131745 too. I guess, it has been crashed by hostid=131745 (BOINC 4.25), then it has been delivered to my hostid=142111 but the server "forgot" to create a second ResultID from that WU with wuid=571281 for me. So actually resultid=891806 would be mine I guess - but that's in "unsent" state. The other host hostid=131745 (the one with 4.25) crashes everything anyway, it has no trickles (except for the ones from my host), 161 results and 94 credits. edit : although the trickles appear for the foreign host, they show up in my trickle list too |
Joined: 28 Aug 04 Posts: 90 Credit: 2,736,552 RAC: 0 |
> Has somebody noted: that somebody else got the credits for the trickles? Yes... <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=857093">there is a workunit</a> in my resultlist that someone else is computing for but the credit is added to my account. Ciao |
Joined: 7 Aug 04 Posts: 2185 Credit: 64,822,615 RAC: 5,275 |
Result ID: <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=871755">871755</a> Result Name: 3raw_200197113_0 Problem: Not listed under results for my computer <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=53552">53553</a>, although I've completed 26 trickles. Aaargh. I wasn't paying attention. This is my first one like this. And I wondered why my BOINCSTATS stats weren't increasing for this computer. Did a reset. |
Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0 |
Here's one more : http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=173538 15 results, no trickles but 283.55 credits. I wonder if it will ever be possible to recover all dependencies between model, host and trickles. |
©2024 cpdn.org