Message boards : Number crunching : Problems after SAP merger into CPDN
Author | Message |
---|---|
Joined: 7 Aug 04 Posts: 83 Credit: 410,895 RAC: 0 |
"Chinook, that CPDN task is a HADCM and it successfully sent a trickle to a CPDN server less than 24 hours ago..." Yes, my problem is that the SAP task was transmogrified from reporting to SAP to reporting to CPDN, which now gives me 2 CPDN tasks which seem to be, partially, interlinked. The CPDN (HADCM) seems to update properly. The ex-SAP (HADAM) seems not to. As I previously posted... updating either one results in only the ex-SAP (HADAM) issuing the update - to the CPDN site, not the SAP site from where it was issued. The ex-SAP (HADAM) task is not shown on my CPDN account, only on my SAP account. Bizarrely, today, both tasks only go to the SAP site whereas, yesterday, both were going to the CPDN site - even though only climateprediction.net shows on the web-site button for both tasks. If no-one has anything against it, I'll keep the ex-SAP (HADAM) suspended until someone figures out how to remove the cross-links or informs me I should abort it, and resume the CPDN (HADCM) task. At least the trickles are working and the issued credit is being reported to statistics sites, such as BOINCstats, even though no updated credit is showing up on this BOINC Manager any more (115871 here, 116182 BOINCstats). I have not checked on the other computer. Hopefully, before the CPDN task is completed, the cross-linkage will be resolved or the SAP unit can be aborted so I can issue a No New Tasks command (only applies to the ex-SAP at the moment) and then reset the project once the task is finished. I also will not upgrade the BOINC Manager until that point as I can well imagine that would screw things up royally - presently using v5.10.45. If I read nothing to the contrary, I'll resume the CPDN (HADCM) task tomorrow (Saturday) or earlier if you concur. -ChinookFöhn |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Milo has done all that he can for this week. The department housing the SAP server has shut down for the weekend and there's no access until Monday. You're not the only one that received a model in the last couple of months, and the "top" 4 crunchers have dozens of models on their machines, which possibly don't get looked at often, so they might not be aware that there's a problem. Best that I can say is: Good Luck. |
Joined: 7 Aug 04 Posts: 83 Credit: 410,895 RAC: 0 |
"Milo has done all that he can for this week... Good Luck." Thanks for the thought but luck, I believe, won't have much to do with it, just analytical reasoning. The ex-SAP unit isn't due until November so there are a few months of time to try to solve it. I don't know if it is worth it other than as an intellectual exercise and for the knowledge it would bring, for if no solution becomes available next week, then aborting and re-issuing the units is likely a faster method of obtaining the results. As for me, I am content to hold the ex-SAP task until the end of July, at which time, if a solution is found, I will crunch it alongside something like Milky Way tasks until it completes, then the CPDN task with Milky Way until it completes, and then reset both CPDN and SAP on this computer. Have a nice weekend. -ChinookFöhn |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The November "due date" is artificial; something HAS to be put into that field for the other projects, and this project uses one a long time into the future. The REAL due date was last year, before the project person's thesis was written. The models now slowly being completed will go into the collection with the others, for future researchers to use. The only thing that will happen if you go past November will be a message saying that the model is overdue, and to consider aborting it. Which isn't necessary. This applies to ALL climate models. |
Joined: 9 Jan 05 Posts: 30 Credit: 434,469 RAC: 0 |
My cpdn model has been suspended since before SAP broke, due to cpdn server problems and non-cpdn circs. SAP is now suspended on all my hosts, but it's still named climateprediction.net so if I force an update I still get the you're-already-attached-to-cpdn-please-detach messages. So I wasn't clear if the redirect has truly been fixed... Does it appear safe to let normal cpdn models continue to run and contact the server, or do I need to wait for SAP to recover and the duplicate identity to go away first? |
Joined: 7 Aug 04 Posts: 83 Credit: 410,895 RAC: 0 |
"...So I wasn't clear if the redirect has truly been fixed..." Nope. "Does it appear safe to let normal cpdn models continue to run and contact the server, or do I need to wait for SAP to recover and the duplicate identity to go away first?" All I can recommend is that you do as I did and look at your CPDN task under your computer and see if it did a trickle since the attempt to merge SAP into CPDN. If it did/does, I would say yes. If it hasn't, and it is a HADAM unit... I'd recommend you wait to see what is accomplished next week. Of course my advice could be totally in error. As I've read nothing to the contrary, I shall re-start my CPDN (HADCM) in the morning. -ChinookFöhn |
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0 |
1) I was issued a WU from the SAP server this month. 2) Both my projects' WUs have been affected. What I could do, though I don't have a spare machine, is split the two projects from the combined folder into two different folders (it can be done) and then check, but alas, that will have to wait a month. Stuck, I suppose :'( 5 WUs of SAP and 3 160-year models? What a shame; I watch them grow like I watch my kids grow. Regards Masud. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Thyme Lawn is experimenting with small edits to the project account XML file and client_state.xml to put things back where they should be. It seems to be fixing the problem on my affected PC, in any case... I'm a volunteer and my views are my own. News and Announcements and FAQ |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
The redirect caused the project name and scheduler URL for SAP to be changed to those for the main CPDN project. After testing some ideas out with MikeMarsUK, the following sequence will sort out the problem. Edit: the instructions have been changed because they relied on forcing BOINC to do a master file fetch before sending another scheduler request. This didn't always happen, causing the request to go to the CPDN server instead of the SAP server and undo all of the changes. The new instructions are here. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0 |
Thyme, I did as you suggested. It started off OK but reverted back. 5/16/2008 7:03:22 PM|CPDN Seasonal Attribution Project|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 0 completed tasks 5/16/2008 7:03:27 PM|climateprediction.net|Scheduler request succeeded: got 0 new tasks 5/16/2008 7:03:27 PM|climateprediction.net|You used the wrong URL for this project 5/16/2008 7:03:27 PM|climateprediction.net|The correct URL is http://climateprediction.net/ 5/16/2008 7:03:27 PM|climateprediction.net|You seem to be attached to this project twice 5/16/2008 7:03:27 PM|climateprediction.net|We suggest that you detach projects named climateprediction.net, 5/16/2008 7:03:27 PM|climateprediction.net|then reattach to http://climateprediction.net/ 5/16/2008 7:03:27 PM|climateprediction.net|Already attached to a project named climateprediction.net (possibly with wrong URL) 5/16/2008 7:03:27 PM|climateprediction.net|Consider detaching this project, then trying again 5/16/2008 7:03:27 PM|climateprediction.net|Message from server: Invalid or missing account key. Visit this project's web site to get an account key. Just to keep life simple, what if we consider these as crashed WUs and re-run from backup? Regards Masud. |
Joined: 7 Aug 04 Posts: 83 Credit: 410,895 RAC: 0 |
The same occurred with me other than... WARNING! Do not restart any tasks if you have SAP tasks merged into CPDN, as my original CPDN (HADCM) started up, started looking for HADAM data and errored out. Of course I cannot update the task to rid BOINC Manager of it, as only the ex-SAP [HADAM] task issues updates. I do not think it is worth trying to find a fix as the data must be corrupted. I question whether there is any value in obtaining the knowledge of how to correct this error when it seems obvious that the data in the intermingled work units is highly suspect. That my CPDN task issued a correct trickle, once, after the fiasco must have been an anomaly. I too vote for aborting all affected units and issuing a project reset to both CPDN and SAP. A shame, but if the data is important, then it seems to me that it should be re-issued. -Chinookföhn |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi sTrey. It would be safer also to keep your CPDN model(s) suspended for the time being and crunch something else instead. If I were in your position I'd back up the contents of the BOINC folder now, or certainly before restarting any climate models. And I wouldn't restart any of them until Milo tells us what the situation is on Monday. Cpdn news |
Joined: 3 Mar 06 Posts: 96 Credit: 353,185 RAC: 0 |
I think any manual change to client_state.xml reverts back unless you delete client_state_previous.xml before restarting BOINC. Could that be the reason Thyme Lawn's procedure doesn't seem to work? I would wait for Thyme Lawn to verify my theory. I may be wrong. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Last year when my BBC model crashed with a 'Max CPU time exceeded' message and I increased the fpops_bound figure in the xml file to give it more time to complete, I watched to see whether the figure reverted later. It didn't revert, so the client_state_previous file didn't need to be edited. The procedure I used for editing the xml file is described here for Windows: http://www.climateprediction.net/board/viewtopic.php?t=7215 The edit procedure has now been tested by several members (for that xml file of course) and it definitely works. I wonder whether members are omitting some step of the procedure and this is causing the edit to revert later? Or perhaps some edits revert and some don't. Cpdn news |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
"I think any manual change to client_state.xml reverts back unless you delete client_state_previous.xml before restarting BOINC. Could that be the reason Thyme Lawn's procedure doesn't seem to work?" client_state_prev.xml only comes into play if client_state.xml is corrupt. The fix failed for KAMasud (and probably chinookfoehn) because BOINC didn't issue the expected master file fetch before doing the scheduler update. I've asked them to try a modification to the fix and have posted a modified set of instructions here. When I have confirmation that the modified fix works I'll modify the instructions here and on the SAP forum. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Joined: 7 Aug 04 Posts: 83 Credit: 410,895 RAC: 0 |
... The instructions did work, CPDN updated and SAP is back where it is supposed to be. The only difference I had was that there was no change in my account_attribution_.cpdn.org.xml file. Until I am informed otherwise, I am leaving the SAP task suspended. Alas, the CPDN task errored out and was lost. A rather novel experience but had I my druthers... Thank you, Thyme Lawn, for the correction and knowledge. -ChinookFöhn |
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0 |
Hi Thyme, again I did as you suggested, but the project name in BOINC Manager is still pointing towards CPDN Main, i.e. climateprediction.net, even though I have edited the name in the account file as per the advice. I back up twice a day; what if I replace the BOINC folder with a clean backup folder? That should do the trick. I have not tried it as yet due to WUs from LHC and RS. Hello Dagorath/Seinfeld, at last you found peace at some project. LoL. I wonder what magic the Mods did on you :). To the Mods: you all are the real driving force behind these climate projects. Regards Masud. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
I think you need to edit the account name in both the project account file and also the client_state.xml file (in the attribution section). The same XML tag appears in both. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0 |
Mike, I did make the changes in client_state.xml. The change in name occurs and everything is OK until BOINC contacts the server. Then the name changes back. I open client_state.xml and it has reverted. It seems that somehow the genuine climateprediction.net folder is controlling the ex-SAP folder. I suspend climate and it suspends the ex-SAP project, while it has forgotten all about its own WUs. I have to suspend them individually. Maybe BOINC is recording it somewhere? Point me towards it and I can copy those contents, if that helps. I still have a week of WUs from other projects, so no hurry. Regards Masud. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Thyme Lawn has modified his original instructions here: http://www.climateprediction.net/board/viewtopic.php?p=76424#76424 Cpdn news |
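
For anyone wanting to check whether the project name agrees in both files before and after a scheduler contact, as discussed above, a read-only check along these lines might help. This is only a sketch and not part of Thyme Lawn's instructions: it assumes that client_state.xml holds one `<project>` element per attached project with `<master_url>` and `<project_name>` children, that each account file has the same two tags directly under its root, and that both files parse as well-formed XML, which may not always hold. The function names are made up for illustration. It changes nothing on disk.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def projects_in_client_state(xml_text):
    """Return {master_url: project_name} for every <project> element.

    Assumes the layout described in the lead-in; not verified against
    every BOINC version.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for project in root.iter("project"):
        url = (project.findtext("master_url") or "").strip()
        name = (project.findtext("project_name") or "").strip()
        if url:
            found[url] = name
    return found


def project_in_account_file(xml_text):
    """Return (master_url, project_name) from one account file's text."""
    root = ET.fromstring(xml_text)
    url = (root.findtext("master_url") or "").strip()
    name = (root.findtext("project_name") or "").strip()
    return url, name


def duplicate_names(projects):
    """Return {project_name: [urls]} for names shared by several URLs.

    Two attachments sharing one name is the symptom reported in this
    thread ("You seem to be attached to this project twice").
    """
    by_name = defaultdict(list)
    for url, name in projects.items():
        by_name[name].append(url)
    return {name: sorted(urls) for name, urls in by_name.items() if len(urls) > 1}


def name_mismatches(state_projects, account_entries):
    """List (url, account_name, state_name) where the two files disagree."""
    return [
        (url, account_name, state_projects[url])
        for url, account_name in account_entries
        if url in state_projects and state_projects[url] != account_name
    ]
```

To use it, stop BOINC first, read the text of client_state.xml and of each account file from a copy of the BOINC folder, and pass the strings to the functions. An empty result from both `duplicate_names` and `name_mismatches` would suggest the names are consistent at that moment; it says nothing about whether the next scheduler request will undo the edit.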