Thread '-161 at 93% completion'

Questions and Answers : Windows : -161 at 93% completion

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22686 - Posted: 10 May 2006, 3:57:28 UTC

My last sulfur model WU 1103108 crashed this morning with a -161 error, at 93% completion. The last two lines in yabsd.out read

QT_POS : Mass weighted QT summed over level 16
was negative. WARNING: QT not conserved
1518 points were -ve and the scaling factor has been reset to 1
QT_POS : Mass weighted QT summed over level 16
was negative. WARNING: QT not conserved
1767 points were -ve and the scaling factor has been reset to 1

Not sure if this helps in any way to diagnose the problem.
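For anyone comparing runs, the warning counts quoted above can be pulled out of the log text automatically. This is only a minimal sketch: the function name `negative_point_counts` is hypothetical, the regular expression assumes the exact wording shown in the excerpt, and the sample text is just the lines quoted in this post.

```python
import re

# Assumption: the count always appears as "<number> points were -ve",
# as in the excerpt quoted in this post.
NEGATIVE_POINTS = re.compile(r"^\s*(\d+) points were -ve")


def negative_point_counts(log_text):
    """Return the negative-point counts in the order they appear."""
    counts = []
    for line in log_text.splitlines():
        match = NEGATIVE_POINTS.match(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


# The six lines quoted from yabsd.out in this post.
sample_log = """\
QT_POS : Mass weighted QT summed over level 16
was negative. WARNING: QT not conserved
1518 points were -ve and the scaling factor has been reset to 1
QT_POS : Mass weighted QT summed over level 16
was negative. WARNING: QT not conserved
1767 points were -ve and the scaling factor has been reset to 1
"""

print(negative_point_counts(sample_log))
```

A rising sequence of counts before the crash would at least show that the warnings were getting worse, though it says nothing about the cause.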

Anyway, I have a backup from two days ago, but I am not sure whether it is worth running it again. Will it crash again at the same point?

I also read in one of the sticky topics that a newer version of BOINC may complete the WU - is there anything to it? And if so, where can I download a 5.4 version?

Finally, is it worth completing this WU at all (it would take only about a week), or is it already obsolete?
ID: 22686

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22687 - Posted: 10 May 2006, 4:04:09 UTC


... where can I download a 5.4 version?

Found it at http://boinc.berkeley.edu/download.php?dev=1
ID: 22687

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 22689 - Posted: 10 May 2006, 5:42:02 UTC

BOINC has nothing to do with the running of the science app., no matter what the version. If it's going to crash, then it will. The only way to find out is to run the model a few times from a backup and see whether it keeps crashing at the same point.

5.4 versions of BOINC are 'experimental' versions for testing. Use at your own risk.

Sulphur models are obsolete, unless close to finishing.
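
Restarting from a backup presupposes that a copy of the BOINC data directory exists. A minimal sketch of taking one follows, assuming BOINC has been shut down first so that no files change mid-copy. The function name `backup_data_dir` and the example paths are hypothetical; this is not CPDN or BOINC tooling.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy data_dir into a new timestamped folder under backup_root.

    Assumption: BOINC is not running while the copy is made.
    """
    source = Path(data_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"no such directory: {source}")
    # A timestamp in the folder name keeps successive backups apart.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    shutil.copytree(source, target)
    return target


# Example call with hypothetical paths; adjust to your own installation.
# backup_data_dir(r"C:\Program Files\BOINC", r"D:\boinc-backups")
```

Restoring is the reverse: with BOINC stopped, replace the live directory with the chosen backup copy and start BOINC again.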

ID: 22689

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22690 - Posted: 10 May 2006, 6:49:41 UTC - in response to Message 22689.  

BOINC has nothing to do with the running of the science app., no matter what the version.

I have been wondering about this too, but I found that information in this topic.

Anyway, I am going to give it a try since it is just a few days to completion.
ID: 22690

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 22691 - Posted: 10 May 2006, 7:45:18 UTC

I think that topic was about a model that couldn't be recovered from a backup. And I'm a bit sceptical that what was posted is the full story. There may have been some other 'hidden' reason for the problem and the fix, of which the original poster was unaware.

ID: 22691

old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 22700 - Posted: 10 May 2006, 16:23:04 UTC

It looks like the energy balance of the model is messed up ("QT not conserved"), so that would be a science problem rather than a "BOINC interaction" sort of problem. It would be interesting to look at, i.e. whether the model was unstable at some point or it just happened (unfortunately the graphs aren't showing up right now).
ID: 22700

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22725 - Posted: 12 May 2006, 5:26:33 UTC
Last modified: 12 May 2006, 5:39:44 UTC

Looks like this time it got successfully past the point where it crashed last time. Just sent up a new trickle.

P.S. BOINC 5.4.9 is now officially out (recommended version).
ID: 22725

old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 22727 - Posted: 12 May 2006, 9:58:32 UTC - in response to Message 22725.  

A minor thing to "beware" of with the new BOINC 5.4.9: if you are attached to CPDN with a URL other than "climateprediction.net", you will have trouble trickling and uploading, as BOINC now rigorously matches the project URL (to prevent spoofing of projects etc). This would usually apply if you are attached to "www.climateprediction.net" or directly to "climateapps2.oucs.ox.ac.uk/cpdnboinc" etc.
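
To illustrate why the alternative URLs stop working under strict matching, here is a minimal sketch. It is not BOINC's actual code: the function name `urls_match_strictly` is hypothetical, and ignoring letter case and a trailing slash is an assumption made for the example.

```python
def urls_match_strictly(attached_url, project_url):
    """Return True only if the two URLs are the same string.

    Assumption for this sketch: letter case and a trailing slash are
    ignored; any other difference counts as a mismatch.
    """
    def normalise(url):
        return url.strip().lower().rstrip("/")

    return normalise(attached_url) == normalise(project_url)


# The three URLs named in this post, checked against the expected one.
expected = "climateprediction.net"
for attached in (
    "climateprediction.net",
    "www.climateprediction.net",
    "climateapps2.oucs.ox.ac.uk/cpdnboinc",
):
    print(attached, urls_match_strictly(attached, expected))
```

Under a comparison like this, "www.climateprediction.net" fails even though it may reach the same server, which matches the symptom described above.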
ID: 22727

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22900 - Posted: 25 May 2006, 8:25:24 UTC
Last modified: 25 May 2006, 8:26:21 UTC

Just to report the outcome of this issue: success! The WU has now been completed (after several restarts) and delivered a complete result 1704450.

However, the outcome still reads 'Client error'. Shouldn't this be corrected once a complete and correct result is uploaded?
ID: 22900

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 22901 - Posted: 25 May 2006, 10:06:19 UTC

Congrats on the success! <applause>

Unfortunately, the server labels on models get 'stuck' on the first message to reach them. But the science data is safely stored away, and the researchers use other indicators to find the data they want. Probably xml queries, or something equally techy.

ID: 22901

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 22914 - Posted: 26 May 2006, 2:18:54 UTC

Thanks for the positive reply!

I wasn't so much concerned about that single WU as about the process in general. I am one of the people in this forum who constantly nag others to make backups, so it is important to me to know that recovering from a backup will eventually lead to a usable result.
ID: 22914
