Message boards : Number crunching : Orphaned Work Units...
Author | Message |
---|---|
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
I have 3 work units which CPDN has as "in process" status but which are no longer visible in my BOINC CPDN task list. These have apparently resulted from processing failures on computer ID 1266353. (However, they are not listed as CPDN processing errors.) All three of these failed after sending three trickles sometime after Dec 17, 2016. Wondering if there is any way to determine how/why these kinds of processing errors occur this way -- where the failure does not result in a CPDN error so that the work units can be resent. This seems to be a BOINC processing problem, possibly caused by a CPU failure, power failure or computer restart. With the current status, these units will remain in "In Process" status until Nov 22, 2017 (one year after I received them) -- but obviously they can no longer be processed in the meantime. The three work units in this status are: hadam3p_eu_lxif_201611_3_482_010809922_1 hadam3p_eu_lygr_201611_3_482_010811158_0 hadam3p_eu_lum4_201611_3_481_010802543_1 I realize there have been discussions about these and Les has suggested we just don't worry about them...but if others are seeing this, perhaps we can somehow find the cause.... Thoughts, anyone? Art Masson |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I sent another email a day or so back, and it's being discussed. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,706,848 RAC: 5,644 |
I have 3 work units which CPDN has as "in process" status but which are no longer visible in my BOINC CPDN task list. These have apparently resulted from processing failures on computer ID 1266353. Hi Art, Did you see any errors in the log file? Are you sure it was a CPU failure? The ones reported in the other thread are mostly hadam3p_eu like yours (3-month models), but we haven't noticed any (obvious) errors with them, and these have been reported by Les. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
No..no error messages at all...they just "disappeared" after some failure on my machine...with no update/error status to let CPDN know... |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,706,848 RAC: 5,644 |
No..no error messages at all...they just "disappeared" after some failure on my machine...with no update/error status to let CPDN know... As far as I understand it, yours did not finish successfully but failed. You may try detaching and reattaching CPDN and then check the orphaned WUs (just noticed reattaching to CPDN with https://www.cpdn.org did not work for me, but http://climateprediction.net |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
I think I'll wait to hear back from Les -- since he sent another note to the CPDN folks and perhaps they will have another suggestion. Would be great just to have a way on the web site to advise CPDN to reissue a work unit that has failed this way.... |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It would be best if not too many reposts of the new url are made. Too many hackers about. Using an Account manager was/is intended to make it easier to join multiple projects. If it doesn't work with BAM, then it's not doing its job. ********* Manually, I'd suggest logging into the new address first. On the old account page, replace the url up to and including UK with the new part, and then try. It should ask you to log in. The only things that should stop you are not having cookies enabled, and perhaps some anti-malware settings. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,706,848 RAC: 5,644 |
If it doesn't work with BAM, then it's not doing its job. I tried on all 3 machines not via BAM! (deliberately switched it off), but via the Add project function of BOINC, and it did not work with SSL. Did not try to attach via BAM! After reattaching I switched back to using BAM! I guess the manual instructions are for the site only, not for attaching via BOINC, and I'm properly logged in. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I think that the BOINC sign up may still have the old url. I'll ask. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Sorry if I'm being dumb...what is "BAM" ? |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,884,997 RAC: 4,577 |
Sorry if I'm being dumb...what is "BAM" ? It's an "account manager", which consolidates various interactions with a set of BOINC (and possibly other) projects. The Web site is here. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
OK, the new secure site is undergoing "stress testing". The BOINC sign up for it won't become active until enough people are running on the new site without problems. But probably few have found out about it yet. Certainly not the set-and-forget people. In the meantime, if anyone wants to Disconnect / re-attach to get rid of uncompleted tasks, and they also want to be on the new secure site, it's DIY time. Releasing old uncompleted tasks will not necessarily mean they'll get run by someone else, or be useful if they do. As I've said before, if a researcher doesn't get all of their data back in a reasonable time, there's nothing to stop them from putting out another batch with the missing bits in it, and then ignoring any of the original tasks if/when they eventually show up. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,706,848 RAC: 5,644 |
Thanks Les, DIY reattaching for the moment will still be under the old URL, won't it? It might be useful, if possible, to use BOINC Notices for very important messages to users (i.e., when the stress tests are over, a detach-and-reattach message could make more people switch to SSL and clear up the In Progress queue a bit). |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
DIY reattaching for the moment will still be under the old URL, won't it? Yes. Then you'll need to change the url to the new one again. (Which is the DIY part.) I don't use detach/reattach, so I'm guessing with a lot of this. The "testing" is waiting for people to break it. Or complain about something. Which means that people need to be using the new url. So far, the only problem has been with this detach business. |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
I did wonder about editing account_climateprediction.net.xml <account><master_url>http://climateprediction.net/</master_url><authenticator>blankedoutforobviousreasons</authenticator><project_name>climateprediction.net</project_name> But thought probably best to try it out first on a machine which is out of work. Though could try it on resurrected ageing net-book which only has one six month old task running. Not sure if there are other places where it might need changing as well though. Edit: I see looking further down that file there are a lot of places to replace http with https in the file. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Hmmm...I think I'll hold off detaching/reattaching until the new secure site is fully up/running. I'm not into DIY much and wouldn't want to lose the processing on CPDN WUs in process. Does this make sense, Les? Art |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I don't really know, as I don't need to do that. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Has anyone been having problems with the new site/url? |
Send message Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
These tasks were reported on 21 January but are still marked as in progress on the website: https://www.cpdn.org/cpdnboinc/result.php?resultid=20118807 (wah2_nawa25_a27i_209912_13_491_010823332_0) https://www.cpdn.org/cpdnboinc/result.php?resultid=20117448 (wah2_nawa25_a15r_209512_13_491_010821973_0) https://www.cpdn.org/cpdnboinc/result.php?resultid=20118957 (wah2_nawa25_a1hs_209612_13_491_010822406_1) What's the new url? http://www.climateprediction.net/getting-started/ and BOINC still show the old url. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
What's the new url? The first part is https://www.cpdn.org See my post way down near the start of this thread. |
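
For anyone weighing up the account-file idea raised earlier in the thread (swapping http for https throughout account_climateprediction.net.xml), here is a minimal Python sketch of the text substitution only. It works on a string rather than on the real file, and several things in it are assumptions: the helper name `rewrite_urls` is made up for illustration, the sample text is a shortened stand-in for the real file, and the https form of the old address is only a placeholder, since the thread gives the first part of the new url as https://www.cpdn.org without the full path. Whether the BOINC client accepts a hand-edited account file is untested here, so try it on a copy first.

```python
from typing import Tuple


def rewrite_urls(text: str, old_prefix: str, new_prefix: str) -> Tuple[str, int]:
    """Replace every occurrence of old_prefix with new_prefix.

    Returns the new text and the number of occurrences that were replaced,
    so you can see how many places in the file would change.
    """
    count = text.count(old_prefix)
    return text.replace(old_prefix, new_prefix), count


# Shortened stand-in for the account file contents (not the real file).
sample = (
    "<account>"
    "<master_url>http://climateprediction.net/</master_url>"
    "<project_name>climateprediction.net</project_name>"
    "</account>"
)

# Placeholder target: the real new prefix may differ (assumption).
updated, changed = rewrite_urls(
    sample, "http://climateprediction.net", "https://climateprediction.net"
)

print(changed)
print(updated)
```

Only strings starting with the old prefix are touched, so the bare project name is left alone, and running it a second time on the result changes nothing.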
©2024 cpdn.org