Questions and Answers : Windows : More exit code -5 and status messages
Author | Message |
---|---|
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Well, halfway through the last trickle of phase 2 (while I was asleep), my AMD64 3200+ 512 MB DDR400 WinXP Home system uploaded its model with exit code -5. According to this thread http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=187 that is the catchall error for "model crashed". This surprised me no end, as before going into BOINC with this computer I had run Prime95 and memtest86+ each for 24 hrs straight without errors (as well as several classic CPDN runs). Temperature in the room at the time of the crash was about 24C. The status messages that are recoverable by the user are very limited. See below. It would be nice to actually see the status messages with the details of the problem, the attempts to rewind (and to what point they rewound), etc. The status messages viewable by the user from the "classic" model were more verbose. 2004-08-21 01:37:26 - Unrecoverable error for result 03qe_000029823_0 ( - exit code -5 (0xfffffffb)) 2004-08-21 01:37:26 - Deferring communication with project for 1 minutes and 0 seconds 2004-08-21 01:37:26 - Computation for result 03qe_000029823 finished 2004-08-21 01:37:26 - Started upload of 03qe_000029823_0_1.zip ...more uploading messages... |
Send message Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0 |
That's too bad. It can't be an AMD64 thing, because mine just went into phase 3 today while I was out on a hike. The "fun error messages" are in yabsd.out, which is part of the upload on a crash; I checked yours on the server and it looks like what honza got: NEGATIVE PRESSURE AT POINT 2780 NEGATIVE PRESSURE AT POINT 2781 NEGATIVE PRESSURE AT POINT 2782 NEGATIVE PRESSURE AT POINT 2783 ********************************************************************************* Model aborted with error code - 1 Routine and message:- P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. ********************************************************************************* I'm going to cross-reference to see whether this comes from a parameter set that may have caused the crash, or from something else. The bad thing is that it doesn't seem to have rewound first the day, then the month, then the year as it should (and I don't believe honza's run did either). I don't suppose you have a backup of this run from before it crashed and uploaded, do you? |
Send message Joined: 5 Aug 04 Posts: 390 Credit: 2,475,242 RAC: 0 |
> I'm going to cross-reference to see if I can validate this is from a parameter set that may have caused the crash, or something else. > > The bad thing is it doesn't seem to have rewound first the day, then month, then year as it should (and I don't believe honza's run did either). I suppose you wouldn't have a backup of this run before it crashed & uploaded do you? > Hi all, I'm trying another BOINC model on my main machine so I can better monitor its progress and behaviour. I also don't think that my recent 'exit code -5' models performed any rewind. I wish Martin's CPFarmView worked under BOINC; then we would have been clear about whether it was rewinding (or an extreme climate). So far so good (only the first trickle), but this WU 8681 has strange cold cells over Africa. I'm going back to a regular backup routine like in the early classic beta last year. Thanks, Carl, for the reminder. <IMG src="http://cpdn.tuxie.org/honzacholt/CPDN_BOINC/BOINC_8681_Cold.png"> |
Send message Joined: 7 Aug 04 Posts: 187 Credit: 44,163 RAC: 0 |
> that's too bad, it can't be an AMD64 thing because mine just went into phase 3 > today when I was out on a hike. Another AMD64 verification. My machine is 11 trickles into phase 3. (64 hours to completion) |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
> today when I was out on a hike. The "fun error messages" are in the yabsd.out > which is part of the upload on a crash; I checked yours out on the server and Thanks Carl. Could the yabsd.out be part of the archived data on the user's PC? Maybe it's cryptic and most people wouldn't want to see it, but I'm sure there are a few that would. Perhaps (not likely) something could be figured out by the users looking at these things. > The bad thing is it doesn't seem to have rewound first the day, then month, > then year as it should (and I don't believe honza's run did either). That's what I figured, but I wasn't sure. > suppose you wouldn't have a backup of this run before it crashed & > uploaded do you? > Unfortunately no. You get used to stability and take it for granted. Time for a reality check. ;) |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
> > that's too bad, it can't be an AMD64 thing because mine just went into > phase 3 > > today when I was out on a hike. > > Another AMD64 verification. My machine is 11 trickles into phase 3. (64 hours > to completion) I didn't figure it was an AMD64 thing, because in the thread I linked, the victim/offending PCs were P4s. |
Send message Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0 |
OK, I discovered the silly error I made that stopped you guys from rewinding, so that shouldn't happen again. It will do the model-day/month/year rewind on a crash, provided you were far enough along for the month/year rewind, of course. So my apologies for that; it should have given you "another chance", although I'm not sure it wouldn't have just hit that timestep with the "negative pressure" and crashed anyway. I'm a little mixed up though, because in your original post it was the AMD64 that crashed, right? And I think that's what Honza is using. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
> OK, I discovered the silly error I made that stopped you guys from rewinding, > so that shouldn't happen again. It will do the model-day/month/year rewind on > a crash provided you were far enough for month/year of course. So my > apologies for that, it should haven given "another chance" although I'm not > sure if it wouldn't have just hit that timestep with the "negative pressure" > and crashed anyway. > Good to hear that the PC will be given another chance if errors occur. If it was a machine error, the run might have gotten through and continued. If it was a model parameter instability problem, then it would likely have just repeated the crash at the same point? > I'm a little mixed up though because in your original post it was the AMD64 > that crashed, right? And I think that's what Honza is using. It wasn't clear from Honza's first post which PC it was (since he has both P4s and an AMD64), but in his post in that thread from 17 Aug 2004 7:25:00 UTC it had to be a P4, since he was talking about downclocking to 3 GHz. |
Send message Joined: 5 Aug 04 Posts: 390 Credit: 2,475,242 RAC: 0 |
> > I'm a little mixed up though because in your original post it was the > AMD64 > > that crashed, right? And I think that's what Honza is using. > > It wasn't clear from Honza's first post what PC it was (since he has both P4s > and an AMD64), but in his post in that thread from 17 Aug 2004 7:25:00 UTC > it had to be a P4 since he was talking about downclocking to 3 GHz. > Hi guys, both machines running BOINC were P4s, 3 GHz, Win XP. My AMD64 is crunching classic THC: now in phase 4, during winter with an average under 5C, ETA for the final upload in 24 hours. I guess I will start another classic model there. |
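For anyone combing their own message log for the same failure, here is a minimal Python sketch that picks the "Unrecoverable error" line out of a log and checks that the hex value is simply the signed exit code viewed as a 32-bit two's complement number (-5 becomes 0xfffffffb). The line layout is an assumption inferred only from the single line quoted in the first post, and the names `ERROR_LINE`, `to_unsigned32`, and `parse_error_line` are hypothetical helpers, not part of BOINC or CPDN.

```python
import re

# Layout assumed from the one "Unrecoverable error" line quoted in this thread;
# other client versions may format the message differently.
ERROR_LINE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def to_unsigned32(code: int) -> str:
    """Render a signed exit code as 32-bit two's complement hex."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def parse_error_line(line: str):
    """Return (result_name, exit_code, hex_matches), or None if the line does not match."""
    m = ERROR_LINE.search(line)
    if m is None:
        return None
    code = int(m.group("code"))
    hex_matches = to_unsigned32(code) == m.group("hex").lower()
    return m.group("result"), code, hex_matches


line = ("2004-08-21 01:37:26 - Unrecoverable error for result "
        "03qe_000029823_0 ( - exit code -5 (0xfffffffb))")
print(parse_error_line(line))
```

This only tells you which result failed and with what code; the actual cause (such as the NEGATIVE PRESSURE messages) is in yabsd.out, as described above.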