Message boards : Number crunching : Exited with zero status- reason to worry?
Author | Message |
---|---|
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Hi guys, I'm getting repeated "exited with zero status" messages for my model. The first one, which happened this afternoon (CET), is probably due to the fact that my computer synchronized its clock and went back three minutes. I remember the model reacting the same way to daylight saving time, without any further problems. But I've had three more of those messages since then. Do I have to worry? I do have backups, but as the newest one is a couple of days old I don't want to reset if I don't have to. The model seems to have trickled normally despite the error messages. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
You can do one of two things: 1) Follow the instructions, and lose the model. 2) Leave it alone and maybe lose the model. And maybe not. The message is an advisory, not a warning, and is a leftover from a very early version of BOINC, which is now being triggered by Windows for whatever reason. Or perhaps no reason at all. And you can always make another backup RIGHT NOW. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Okay, thanks a lot. This means I'll go into the normal ignore mode for flaming Windoze apps ;-) and hope for the best. Maybe changing system time can trigger multiple messages of that kind instead of just one... |
Send message Joined: 23 Nov 05 Posts: 18 Credit: 407,491 RAC: 0 |
I have the same zero status msg and the time for completion is now ticking up!! What to do? Thanks DP
1/10/2007 3:21:50 PM|climateprediction.net|Task hadcm3lbm_91jc_25191163_0 exited with zero status but no 'finished' file
1/10/2007 3:21:50 PM|climateprediction.net|If this happens repeatedly you may need to reset the project.
1/10/2007 3:21:50 PM||Rescheduling CPU: application exited
1/10/2007 3:21:50 PM|climateprediction.net|Restarting task hadcm3lbm_91jc_25191163_0 using hadcm3lb version 508
1/10/2007 3:31:36 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/10/2007 3:31:36 PM|climateprediction.net|Reason: To send trickle-up message
1/10/2007 3:31:36 PM|climateprediction.net|(not requesting new work or reporting completed tasks)
1/10/2007 3:31:45 PM|climateprediction.net|Scheduler request succeeded
1/10/2007 3:43:50 PM|climateprediction.net|Task hadcm3lbm_91jc_25191163_0 exited with zero status but no 'finished' file
1/10/2007 3:43:50 PM|climateprediction.net|If this happens repeatedly you may need to reset the project.
1/10/2007 3:43:50 PM||Rescheduling CPU: application exited
1/10/2007 3:43:50 PM|climateprediction.net|Restarting task hadcm3lbm_91jc_25191163_0 using hadcm3lb version 508
1/10/2007 3:43:51 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/10/2007 3:43:51 PM|climateprediction.net|Reason: To send trickle-up message
1/10/2007 3:43:51 PM|climateprediction.net|(not requesting new work or reporting completed tasks)
1/10/2007 3:43:56 PM|climateprediction.net|Scheduler request succeeded
1/10/2007 3:49:51 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/10/2007 3:49:51 PM|climateprediction.net|Reason: To send trickle-up message
1/10/2007 3:49:51 PM|climateprediction.net|(not requesting new work or reporting completed tasks)
1/10/2007 3:49:56 PM|climateprediction.net|Scheduler request succeeded |
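For anyone wanting to check how often the message recurs in a saved copy of the message log, here is a minimal Python sketch. It assumes the `timestamp|project|message` layout and the US-style month/day/year dates seen in the log above; the function names are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

ZERO_STATUS = "exited with zero status"

def parse_line(line):
    """Split one 'timestamp|project|message' line into its three fields."""
    stamp, project, message = line.split("|", 2)
    # Assumption: month/day/year with a 12-hour clock, as in the log above.
    when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
    return when, project, message.strip()

def zero_status_times(lines):
    """Return the timestamps of every 'exited with zero status' entry."""
    hits = []
    for line in lines:
        if line.count("|") < 2:
            continue  # skip blank or malformed lines
        when, _project, message = parse_line(line)
        if ZERO_STATUS in message:
            hits.append(when)
    return hits

sample = [
    "1/10/2007 3:21:50 PM|climateprediction.net|Task hadcm3lbm_91jc_25191163_0 exited with zero status but no 'finished' file",
    "1/10/2007 3:21:50 PM||Rescheduling CPU: application exited",
    "1/10/2007 3:43:50 PM|climateprediction.net|Task hadcm3lbm_91jc_25191163_0 exited with zero status but no 'finished' file",
]
times = zero_status_times(sample)
gap = times[1] - times[0]
print(len(times), "zero-status exits,", gap, "apart")
```

Comparing the gaps between hits with the times of clock synchronisations would be one way to test the time-sync theory discussed in this thread.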
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Make a backup right now, and then keep your fingers crossed. It'll either crash, or it won't crash. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Just to get back: Mine didn't. Has been running like a dream ever since. dp, do you synchronize your clock over the Internet? It's standard in Win XP I think, if you're using that (like most people). From my experience that is really likely to trigger this error message: small cause, big effect. Just a guess of course, but I have experience that makes it seem quite probable to me. EDIT: Yep I just checked ;-) XP on both boxes... you can disable the "sync clock" feature by using XP AntiSpy, if you think it is the cause... I didn't, 'cause my system time tends to go completely wrong if I do, and besides, the error messages never did any serious harm to my model. |
Send message Joined: 23 Nov 05 Posts: 18 Credit: 407,491 RAC: 0 |
[quote]EDIT: Yep I just checked ;-) XP on both boxes... you can disable the "sync clock" feature by using XP AntiSpy, if you think it is the cause... I didn't, 'cause my system time tends to go completely wrong if I do, and besides, the error messages never did any serious harm to my model.[/quote] Thanks, I'll try it. DP |
Send message Joined: 23 Nov 05 Posts: 18 Credit: 407,491 RAC: 0 |
So am I wasting time and resources? I have another host (Centrino XP) that seems to work fine. On this Athlon host I shut off the clock sync and started CPDN again, and this is the result. Thanks in advance, DP
1/12/2007 2:30:12 PM|climateprediction.net|Resuming task hadcm3lbm_91jc_25191163_0 using hadcm3lb version 508
1/12/2007 4:37:30 PM||Rescheduling CPU: application exited
1/12/2007 4:37:30 PM|climateprediction.net|Computation for task hadcm3lbm_91jc_25191163_0 finished
1/12/2007 4:37:30 PM|rosetta@home|Resuming task 1fvv_1_NMRREF_1_1fvv_1_yyidrenum_14IGNORE_THE_REST_0001_1475_9479_0 using rosetta version 543
1/12/2007 4:37:31 PM|climateprediction.net|Unrecoverable error for result hadcm3lbm_91jc_25191163_0 (<file_xfer_error> <file_name>hadcm3lbm_91jc_25191163_0_15.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3lbm_91jc_25191163_0_16.zip</file_name> <error_code>-161</error_code></file_xfer_error>)
1/12/2007 4:37:31 PM|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
1/12/2007 4:38:31 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/12/2007 4:38:31 PM|climateprediction.net|Reason: To fetch work
1/12/2007 4:38:31 PM|climateprediction.net|Requesting 8640 seconds of new work, and reporting 1 completed tasks
1/12/2007 4:38:36 PM|climateprediction.net|Scheduler request succeeded
1/12/2007 4:38:36 PM|climateprediction.net|No work from project
1/12/2007 4:38:36 PM|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
1/12/2007 4:39:36 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/12/2007 4:39:36 PM|climateprediction.net|Reason: To fetch work
1/12/2007 4:39:36 PM|climateprediction.net|Requesting 8640 seconds of new work
1/12/2007 4:39:41 PM|climateprediction.net|Scheduler request succeeded
1/12/2007 4:39:41 PM|climateprediction.net|No work from project
1/12/2007 4:39:41 PM|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
1/12/2007 4:40:42 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1/12/2007 4:40:42 PM|climateprediction.net|Reason: To fetch work
1/12/2007 4:40:42 PM|climateprediction.net|Requesting 8640 seconds of new work
1/12/2007 4:40:47 PM|climateprediction.net|Scheduler request succeeded
1/12/2007 4:40:47 PM|climateprediction.net|No work from project |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Well, an "unrecoverable error" definitely sounds much worse. I fear in your case the model wasn't just having an allergic reaction to time sync :-( It might have been, but now it looks to me like you lost it... |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
... unless you have a full backup of the boinc folder. There are no other error messages logged in the model's record, such as Negative Pressure or Negative Theta. So, it should be a good candidate to get through on the backup copy. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
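Since a full backup of the BOINC folder keeps coming up, here is a rough Python sketch of one way to take a timestamped copy. It is only an illustration: the function name and the example paths are hypothetical, and it assumes BOINC has been shut down first so that no files change during the copy.

```python
import shutil
import time
from pathlib import Path

def backup_boinc_folder(boinc_dir, backup_root):
    """Copy the whole BOINC folder to a new timestamped folder and return its path."""
    source = Path(boinc_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"boinc-backup-{stamp}"
    # copytree refuses to overwrite an existing folder, so older backups are never clobbered.
    shutil.copytree(source, target)
    return target

# Hypothetical usage (adjust both paths to your own machine):
# backup_boinc_folder(r"C:\Program Files\BOINC", r"D:\backups")
```

Keeping each copy in its own dated folder means a bad backup never replaces a good one.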
Send message Joined: 23 Nov 05 Posts: 18 Credit: 407,491 RAC: 0 |
FWIW I noticed the last successful upload was at 1800 on 31 Dec 2006, then nothing after the new year??? I did make a backup of the cpdn directory 2 days ago and will try to restore it. Seems a shame to lose a model after 2500 hrs of CPU time. DP |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There are no guarantees that say a model will always complete the full 160 years. A parameter set may go wonky at any point in the model. This includes at 161 years, or 200 years, etc. Just as the real weather can suddenly 'turn nasty', and produce a tornado in the middle of a nice day, while people are out having a picnic. Lots have been lost in the last few years of the modelling, but the info is still useful. |
Send message Joined: 23 Nov 05 Posts: 18 Credit: 407,491 RAC: 0 |
[quote]Lots have been lost in the last few years of the modelling, but the info is still useful.[/quote] Thanks for the encouragement, I'll start a new one. DP |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
I got the 'No heartbeat from core client - exiting' message on a brand new machine, and traced it to the clock synchronisation - the clock had jumped forward by about two minutes, so of course that was longer than the 30 seconds that BOINC is prepared to wait. But it re-started automatically, and I didn't lose any work on any of the projects I had running. You can get an idea of how good your computer is at keeping time by looking at the 'Internet time' tab in the Date and Time Properties control panel (double-click on the system clock to open it). The 'Next Synchronisation' line tells you what the clock said before it was changed: the 'successful synchronisation' line tells you what it changed to after synchronisation. If it's jumped forward, then you may have this problem. Because mine drifts by about two minutes per week, I'm thinking of installing one of those clock tools that can check every 24 hours. That would keep the incremental changes down below 30 seconds, so BOINC should be happier. Does that seem to make sense to anyone? [NB to corporate users: if your Windows XP Pro computer is part of a Windows Active Directory domain, its time setting is held constant by a local domain controller, and you don't get a chance to synchronise it yourself: there is no 'Internet time' tab in the control panel. You just have to hope that your system administrator has remembered to set up the server time service to synch the whole network from a reliable source....] |
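The arithmetic behind the 24-hour idea can be checked with a few lines of Python. This is a back-of-the-envelope sketch only: it assumes the drift is steady, and it takes the two-minutes-per-week and 30-second figures straight from the post above.

```python
HEARTBEAT_LIMIT_S = 30          # wait time quoted in the post above
DRIFT_S_PER_WEEK = 120          # "about two minutes per week"
SECONDS_PER_WEEK = 7 * 24 * 3600

def step_size(sync_interval_s, drift_s_per_week=DRIFT_S_PER_WEEK):
    """Size in seconds of the correction applied at each sync, assuming steady drift."""
    return drift_s_per_week * sync_interval_s / SECONDS_PER_WEEK

def longest_safe_interval(limit_s=HEARTBEAT_LIMIT_S, drift_s_per_week=DRIFT_S_PER_WEEK):
    """Longest sync interval in seconds whose correction stays under the limit."""
    return limit_s * SECONDS_PER_WEEK / drift_s_per_week

weekly = step_size(SECONDS_PER_WEEK)   # 120.0 s, well over the limit
daily = step_size(24 * 3600)           # about 17.1 s, under the limit
print(f"weekly sync: {weekly:.1f} s, daily sync: {daily:.1f} s")
print(f"longest safe interval: {longest_safe_interval() / 3600:.0f} h")
```

With these numbers a daily sync corrects roughly 17 seconds at a time, and anything up to about 42 hours between syncs would stay under the 30-second limit.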
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
As long as it jumps forward less than 30 seconds that should help. A worse problem is if it jumps back in time: even 1 second will cause a 'zero status exit' type message. I'm a volunteer and my views are my own. |
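The two failure cases described so far can be summed up as a small decision rule. The Python below only restates what posters in this thread report; it is not taken from BOINC's source, the function name is hypothetical, and treating a jump of exactly 30 seconds as a failure is an assumption.

```python
def classify_clock_step(delta_s, heartbeat_limit_s=30):
    """Map a clock correction in seconds (negative = backward) to the symptom described above."""
    if delta_s < 0:
        return "zero status exit"      # any backward step, even 1 second
    if delta_s >= heartbeat_limit_s:
        return "no heartbeat"          # forward jump at or beyond the wait time (boundary assumed)
    return "ok"

# Three minutes back, one second back, a small forward step, two minutes forward.
for delta in (-180, -1, 17, 120):
    print(delta, classify_clock_step(delta))
```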
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
In the end, it didn't need an external tool (sucking up those precious cycles...): MS KB314054 has all the gory details. I just changed one value in the registry: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\ In the right pane, right-click SpecialPollInterval, and then click Modify. Then re-start the Windows Time Service. WARNING: Don't mess around with the Windows registry unless you already know what you're doing: a single slip can cause problems much worse than the one you're trying to solve. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Well, my box is just too unreliable in keeping time for deactivating that... though I definitely long for a nice NTP service... Somehow I have the feeling I'm not as content with Windows as I used to be ;-) Well, anyway, I ask you to keep your fingers crossed, as I didn't crash a model but instead my whole Windows installation... no idea how and why, guess it just went on strike after 5 months of too much experimentation ;-) The only thing that helped was doing a complete reinstall, and that's the problem. I have a brand-new backup of my BOINC folder, but I never actually had to use one of those so I'm a bit uneasy about it. If someone has advice about it, it would be very much appreciated. I guess I'm too tired for something like this now anyway (it's almost 3 am here), so I'll wait until tomorrow and do it then... |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Halfway down Les' post: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=4890&nowrap=true#25344 "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
[quote]Well, my box is just too unreliable in keeping time for deactivating that...[/quote] The registry hack is for changing the time interval between updates - making it more active! SpecialPollInterval is in seconds, so you can set it to correct the clock as often as you like - though I think the NTP admins would get very annoyed with you if you connected too often. I'm trying once per day. Good luck with the reinstall. |
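For reference, once per day works out to 86400 seconds. The Python sketch below builds the text of a .reg file carrying that value for the key named earlier in the thread. It only generates text and does not touch the registry itself; the function name is hypothetical, and the warning about registry edits given earlier applies in full.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient"

def poll_interval_reg(seconds):
    """Build .reg file text that sets SpecialPollInterval to the given number of seconds."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("interval must fit in a 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY}]\n"
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"SpecialPollInterval"=dword:{seconds:08x}\n'
    )

once_per_day = 24 * 60 * 60      # 86400 seconds
print(poll_interval_reg(once_per_day))
```

After the value is changed, the Windows Time service needs a restart, for example with `net stop w32time` followed by `net start w32time` from an administrator command prompt.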
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
@astro: Thanks a lot for the information :-) I'll read the post behind the link and then do my best. Richard: My bad, it really was a bit late I guess ;-) In that case I'll definitely try that hack as soon as everything else is up and running. Thanks for keeping your fingers crossed, that can never hurt. |
©2024 cpdn.org