Message boards : Number crunching : Result exited with zero status?
Author | Message |
---|---|
Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
Thanks to LHC having no work, this box is now only doing CPDN. Just yesterday I finally got to phase 2. Today I noticed some strange error messages that I have seen before with seti (or was it lhc?) but never before with CPDN. Anyone else seen this?<br>================================================<br>climateprediction.net - 2004-10-10 21:11:58 - Result 2vqa_000155790_0 exited with zero status but no 'finished' file<br>climateprediction.net - 2004-10-10 21:11:58 - If this happens repeatedly you may need to reset the project.<br>climateprediction.net - 2004-10-10 21:11:58 - Restarting result 2vqa_000155790_0 using hadsm3 version 4.03<br>climateprediction.net - 2004-10-10 21:30:01 - Result 2vqa_000155790_0 exited with zero status but no 'finished' file<br>climateprediction.net - 2004-10-10 21:30:01 - If this happens repeatedly you may need to reset the project.<br>climateprediction.net - 2004-10-10 21:30:01 - Restarting result 2vqa_000155790_0 using hadsm3 version 4.03<br>================================================<br>The model appears to be still crunching away and making progress (at 27,000 in phase 2 so far) but this does not look promising. I have seen it at least 5 or 6 times now. May have missed a couple too.<br>AMD 2400+ with a gig of RAM, attached to LHC and CPDN, BOINC v4.09<br>----------------------------<br>A member of The Knights Who Say Ni!<br>My BOINC stats site: http://boinc-kwsn.no-ip.info |
Joined: 10 Aug 04 Posts: 94 Credit: 309,849 RAC: 0 |
I had to junk one WU for the same reason. Got the same message.<br>Team Site Link: http://mysite.wanadoo-members.co.uk/thefinalfrontear/index.html<br>"The world is a progressively realized community of interpretation." |
Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
hmm... Well that isn't encouraging. Since it is still making progress (now up to 29,000 TS) I'm hoping I won't have to reset. I have already lost 3 models due to CPU/motherboard problems. I really want to finish one for once! :) |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
The error is generated whenever BOINC detects an abnormal termination of a project program (when a program terminates normally it should create a boinc_finish_called file in its slots directory). Is there anything in the stdout or stderr files in the BOINC or climateprediction.net/{result_id} directories to indicate what caused the CPDN model to stop running?<br>http://www.teampicard.net<br>Join us here: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/team_display.php?teamid=3 |
Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
The only thing in the boinc stderr.txt is some DNS errors thanks to major network problems my ISP had yesterday (down for about 9 hours). stderr_um.txt in projects\climateprediction.net\2vqa_000155790 has a bunch of lines that read<br>"OPEN: File dataout/2vqaba.da27bs0 Created on Unit 22"<br>Then there are these:<br>CLOSE: WARNING: Unit 60 Not Opened<br>OPEN: File dataout/2vqaba.pa28c10 Created on Unit 60<br>CLOSE: WARNING: Unit 63 Not Opened<br>OPEN: File dataout/2vqaba.pd28c10 Created on Unit 63<br>and a few more like it. That is the only oddity I see anywhere. Up to TS 40,000 and I don't think the error came up at all last night. Very random it seems. |
Joined: 7 Aug 04 Posts: 187 Credit: 44,163 RAC: 0 |
I see the "no finished file" error sometimes, but the WU always continues processing. |
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Hi, Toby, I had a similar experience to Heffed on one machine early in Phase 1 of the current pair of Models. Since then, the machine seems to behave itself. I don't recall which machine, whether M$/XP or Linux, but all eight of my Models (four HT machines) are in Phase 2 or 3. So, I think the odds are in your favor for a successful run. (My fingers are crossed for all of us!)<br>We have met the enemy and he is us -- Pogo |
Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
> (My fingers are crossed for all of us!)<br>Sweet! I kind of need my fingers uncrossed for work so thanks for doing that for us. Maybe I'll knock on wood instead :) But it is good to hear that others have seen this and that it doesn't seem to be detrimental to the project. This is happening on my Windows XP SP1 box (haven't had the balls to try SP2 yet :) although I have seen the same error on my gentoo linux box (2.6.8) while running seti. |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
> The only thing in the boinc stderr.txt is some DNS errors thanks to major network problems my ISP had yesterday (down for about 9 hours). stderr_um.txt in projects\climateprediction.net\2vqa_000155790 has a bunch of lines that read<br>> "OPEN: File dataout/2vqaba.da27bs0 Created on Unit 22"<br>> Then there are these:<br>> CLOSE: WARNING: Unit 60 Not Opened<br>> OPEN: File dataout/2vqaba.pa28c10 Created on Unit 60<br>> CLOSE: WARNING: Unit 63 Not Opened<br>> OPEN: File dataout/2vqaba.pd28c10 Created on Unit 63<br>There's nothing there to worry about, Toby. It's just a warning that hadsm3um is trying to close a file that it hasn't created yet. |
Joined: 5 Nov 04 Posts: 19 Credit: 88,724 RAC: 0 |
I'm having the same error on all my machines... :-(<br>-<br>2004-11-17 12:32:52 [climateprediction.net] Result 3ev9_100180838_0 exited with zero status but no 'finished' file<br>2004-11-17 12:32:52 [climateprediction.net] If this happens repeatedly you may need to reset the project.<br>2004-11-17 12:32:52 [climateprediction.net] Restarting result 3ev9_100180838_0 using hadsm3 version 4.04<br>2004-11-17 14:52:43 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi<br>2004-11-17 14:52:46 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded<br>2004-11-17 17:26:49 [climateprediction.net] Result 3ev9_100180838_0 exited with zero status but no 'finished' file<br>2004-11-17 17:26:49 [climateprediction.net] If this happens repeatedly you may need to reset the project.<br>2004-11-17 17:26:49 [climateprediction.net] Restarting result 3ev9_100180838_0 using hadsm3 version 4.04<br>2004-11-17 22:50:15 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi<br>2004-11-17 22:50:19 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded<br>2004-11-18 06:21:31 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi<br>2004-11-18 06:21:34 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded<br>-<br>This is the log from my first machine. The error has occurred repeatedly (4 times) in the last couple of days. My second machine is running 4 units at the same time (dual Xeon with hyperthreading). Yesterday, it gave the same error for all units it is working on (after having spent 140 hours on each unit). On both machines, the stderr.txt is empty... Any suggestions?<br>Jörg |
Joined: 12 Nov 04 Posts: 3 Credit: 3,374 RAC: 0 |
I've seen it happen with my SETI WUs. It's just a BOINC thing, I believe, and it only happens when you start it up after it has been shut down. All I did was suspend the work, then change it back to run. |
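
For anyone who wants to check how often this is happening on their own box, here is a rough Python sketch, not anything taken from BOINC itself. `classify_exit` and `count_restarts` are hypothetical helper names made up for illustration. The first mirrors the check described earlier in the thread (zero exit status, but no `boinc_finish_called` file in the slot directory); the marker file name is the one quoted in this thread and may differ between client versions. The second just counts the "no 'finished' file" messages per result in lines copied from the message log.

```python
import re
from collections import Counter
from pathlib import Path

# Marker file name as quoted in the thread; treat it as an assumption,
# since it may differ between BOINC client versions.
FINISH_MARKER = "boinc_finish_called"

# Matches the message text shown in both log excerpts in the thread.
NO_FINISH = re.compile(
    r"Result (\S+) exited with zero status but no 'finished' file"
)


def classify_exit(exit_status, slot_dir):
    """Hypothetical helper mirroring the check described in the thread."""
    if exit_status != 0:
        return "failed"
    if (Path(slot_dir) / FINISH_MARKER).exists():
        return "finished"
    # Zero status but no marker file: the case that triggers the message.
    return "restart"


def count_restarts(log_lines):
    """Count 'no finished file' messages per result name."""
    counts = Counter()
    for line in log_lines:
        match = NO_FINISH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the log posted above.
    sample = [
        "2004-11-17 12:32:52 [climateprediction.net] Result 3ev9_100180838_0 "
        "exited with zero status but no 'finished' file",
        "2004-11-17 12:32:52 [climateprediction.net] Restarting result "
        "3ev9_100180838_0 using hadsm3 version 4.04",
        "2004-11-17 17:26:49 [climateprediction.net] Result 3ev9_100180838_0 "
        "exited with zero status but no 'finished' file",
    ]
    for result, n in count_restarts(sample).items():
        print(result, n)
```

The regular expression only looks at the message text, so it should work with both log layouts posted in this thread (project name first, or timestamp first). A count that keeps climbing for the same result would be the "happens repeatedly" case the client warns about.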