daily quota exceeded on one result (HT)
Author | Message |
---|---|
Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0 |
One of my results can't download any more work. It has been like this for several days now, while the other one continues quite happily. According to the messages, the server has refused to send any work (because the daily quota was exceeded) on each of those days, so the message would appear to be untrue. The work tab now only shows progress on one result, the one that's not producing error messages, which I assume is consistent. What's the best way to untwist its knickers? BTW it's BOINC beta 4.05 on a P4 3.02GHz (HT) with WinXP SP1 |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
What error messages are you getting on the download attempts before the quota is exceeded, hUJe John? Did anything happen on your system when the model crashed (e.g. power failure, system reboot or user logout while BOINC was running)? |
Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0 |
> What error messages are you getting on the download attempts before the quota is exceeded, hUJe John? Did anything happen on your system when the model crashed (e.g. power failure, system reboot or user logout while BOINC was running)? No error msgs (in fact no download attempts) before the quota was exceeded on any given day. I just had a repetition of the same set of msgs (insufficient work; request; no work available; deferring) over a period of 40 hours, interspersed only with trickles for the other model. I first noticed there was a problem about three days after it had begun; there was a string of msgs, one per hour, going back. I didn't notice a different msg at the beginning but didn't look properly. At that time I needed to restart the system, and during the shutdown an error msg was revealed behind other windows as they closed. I only saw it briefly but it was a Visual Fortran error. The download problem now seems to have resolved itself and it has got another (new) result to work on. It appears to have abandoned the model it was working on before. After a bit of digging around I found its directory, including stderr_um.txt with contents: forrtl: The requested operation cannot be performed on a file with a user-mapped section open. and stdout_um.txt with contents: Starting hadsm3 model for ID# 260l_100122136... Changing to slots directory C:\Program Files\BOINC\slots\1 Model abandoned: UM has aborted the model Detaching shared memory, closing model... As it was about 30% through, it seems a shame to abandon it if it can be restarted. Is there a way of doing that? |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
> I first noticed there was a problem about three days after it had begun; there was a string of msgs - one per hour - going back. I didn't notice a different msg at the beginning but didn't look properly. At that time I needed to restart the system and during the shutdown an error msg was revealed behind other windows as they closed - I only saw it briefly but it was a Visual Fortran error. > It appears to have abandoned the model it was working on before... a bit of digging around and I found its directory, including: stderr_um.txt with contents: forrtl: The requested operation cannot be performed on a file with a user-mapped section open. Okay, by going to result id 221798 from your account page I can see that it finished with an exit code of -5. This is a catch-all computation error in the model. Your stderr_um.txt file indicates what the error was, but only members of the project team would be able to shed some light on what happened. > As it was about 30% through it seems a shame to abandon it if it can be restarted - is there a way of doing that? There's no need for you to worry about restarting the model. You can, in fact, safely get rid of the 260l_100122136 directory, the 260l_100122136.xml file and the 260l_100122136.zip file from your boinc/projects/climateprediction.net directory, as the workunit has already been scheduled to be sent out again, as you can see <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=213714">here</a>. |
Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0 |
I looked up this run on the upload server and didn't see anything glaringly obvious as to why it crashed. Like a lot of other "-5" errors it's hard to pin down; my guess is that a file somewhere got corrupted and the model flipped out (even 1 bad file out of 300 can cause a crash). I didn't see one of the error text files, so maybe that was the one, and since it couldn't create the error file it crashed. Also check disk space: I've seen -5 when people run out of space and the model can't create any more files and just exits. |
Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0 |
Thanks for the feedback - hopefully inexplicable errors won't become the norm for me. Cheers |
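
For anyone hitting the same thing, the two practical suggestions in this thread (clearing out the leftover workunit files and checking free disk space) can be sketched roughly as below. This is only an illustration, not anything from the project team: the helper names `find_leftover_files` and `free_space_mb` are made up here, and the full project path is an assumption pieced together from the slots path and the boinc/projects/climateprediction.net directory mentioned above. The script only lists files; it does not delete anything.

```python
import shutil
from pathlib import Path


def find_leftover_files(project_dir, workunit_name):
    """Return the workunit's directory, .xml and .zip file if they exist.

    Hypothetical helper: it only reports what it finds, nothing is removed.
    """
    project_dir = Path(project_dir)
    candidates = [
        project_dir / workunit_name,            # model directory
        project_dir / f"{workunit_name}.xml",   # workunit description
        project_dir / f"{workunit_name}.zip",   # downloaded archive
    ]
    return [p for p in candidates if p.exists()]


def free_space_mb(path):
    """Free space, in megabytes, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


if __name__ == "__main__":
    # Assumed install location; adjust to wherever BOINC lives on your machine.
    project_dir = Path(r"C:\Program Files\BOINC\projects\climateprediction.net")
    if project_dir.is_dir():
        for leftover in find_leftover_files(project_dir, "260l_100122136"):
            print("leftover:", leftover)
        print(f"free space: {free_space_mb(project_dir):.0f} MB")
    else:
        print("project directory not found:", project_dir)
```

If the listing looks right, the reported files can then be removed by hand as described above, ideally while BOINC is not running.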
©2024 cpdn.org