Thread 'daily quota exceeded on one result (HT)'

Author	Message
old_user17283 Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0	Message 5039 - Posted: 4 Oct 2004, 12:41:44 UTC Last modified: 4 Oct 2004, 17:22:49 UTC One of my results can't download any more work. It has been like this for several days now, while the other one continues quite happily. According to the messages, the server has refused to send any work (because the daily quota was exceeded) on each of those days, so the message would appear to be untrue. The work tab now shows progress on only one result, the one that's not producing error messages; I assume this is consistent. What's the best way to untwist its knickers? BTW, it's BOINC beta 4.05 on a P4 3.02GHz (HT) with WinXP SP1.

Thyme Lawn Volunteer moderator Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0	Message 5053 - Posted: 4 Oct 2004, 17:29:23 UTC What error messages are you getting on the download attempts before the quota is exceeded, hUJe John? Did anything happen on your system when the model crashed (e.g. power failure, system reboot or user logout while BOINC was running)?

old_user17283 Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0	Message 5069 - Posted: 5 Oct 2004, 10:54:45 UTC - in response to Message 5053.
> What error messages are you getting on the download attempts before the quota is exceeded, hUJe John? Did anything happen on your system when the model crashed (e.g. power failure, system reboot or user logout while BOINC was running)?
No error msgs, and in fact no download attempts, before the quota was exceeded on any given day. I just had a repetition of the same set of msgs (insufficient work; request; no work available; deferring) over a period of 40 hours, interspersed only with trickles for the other model.
I first noticed there was a problem about three days after it had begun; there was a string of msgs, one per hour, going back. I didn't notice a different msg at the beginning, but I didn't look properly. At that time I needed to restart the system, and during the shutdown an error msg was revealed behind other windows as they closed. I only saw it briefly, but it was a Visual Fortran error.
The download problem now seems to have resolved itself and it has got another (new) result to work on. It appears to have abandoned the model it was working on before. After a bit of digging around I found its directory, including:
stderr_um.txt with contents:
forrtl: The requested operation cannot be performed on a file with a user-mapped section open.
and stdout_um.txt with contents:
Starting hadsm3 model for ID# 260l_100122136...
Changing to slots directory C:\Program Files\BOINC\slots\1
Model abandoned: UM has aborted the model
Detaching shared memory, closing model...
As it was about 30% through, it seems a shame to abandon it if it can be restarted. Is there a way of doing that?

Thyme Lawn Volunteer moderator Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0	Message 5070 - Posted: 5 Oct 2004, 11:32:10 UTC - in response to Message 5069.
> I first noticed there was a problem about three days after it had begun; there was a string of msgs - one per hour - going back. I didn't notice a different msg at the beginning but didn't look properly. At that time I needed to restart the system and during the shutdown an error msg was revealed behind other windows as they closed - I only saw it briefly but it was a Visual Fortran error.
>
> It appears to have abandoned the model it was working on before... a bit of digging around and I found its directory, including:
> stderr_um.txt with contents:
> forrtl: The requested operation cannot be performed on a file with a user-mapped section open.
Okay, by going to result id 221798 from your account page I can see that it finished with an exit code of -5. This is a catch-all computation error in the model. Your stderr_um.txt file indicates what the error was, but only members of the project team would be able to shed some light on what happened.
> As it was about 30% through it seems a shame to abandon it if it can be restarted - is there a way of doing that?
There's no need for you to worry about restarting the model. You can, in fact, safely delete the 260l_100122136 directory, the 260l_100122136.xml file and the 260l_100122136.zip file from your boinc/projects/climateprediction.net directory, because the workunit has already been scheduled to be sent out again, as you can see here: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=213714
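For anyone who would rather script the cleanup described in the post above than do it by hand, here is a minimal Python sketch. It assumes the leftovers are exactly the three items named in that post (a directory, an .xml file and a .zip file, all named after the workunit) and that you pass in the path to your own boinc/projects/climateprediction.net directory. The function names `leftover_paths` and `remove_leftovers` are made up for illustration and are not part of BOINC. It defaults to a dry run, so nothing is deleted unless you ask for it.

```python
from pathlib import Path
import shutil


def leftover_paths(project_dir, workunit_name):
    """Return whichever of the three leftovers for one workunit exist.

    Assumption: the leftovers are a directory, an .xml file and a .zip
    file, each named after the workunit, as described in the thread.
    """
    base = Path(project_dir)
    candidates = [
        base / workunit_name,
        base / f"{workunit_name}.xml",
        base / f"{workunit_name}.zip",
    ]
    return [path for path in candidates if path.exists()]


def remove_leftovers(project_dir, workunit_name, dry_run=True):
    """List the leftovers and, unless dry_run is True, delete them.

    Returns the paths that were (or would have been) removed.
    """
    handled = []
    for path in leftover_paths(project_dir, workunit_name):
        if not dry_run:
            if path.is_dir():
                shutil.rmtree(path)  # remove the model directory and its contents
            else:
                path.unlink()  # remove the single .xml or .zip file
        handled.append(path)
    return handled
```

A cautious way to use it: call `remove_leftovers(your_project_dir, "260l_100122136")` first, check the returned list, and only then repeat the call with `dry_run=False`. Stopping BOINC beforehand is probably wise, although the thread does not say whether that is required.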

old_user1 Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0	Message 5071 - Posted: 5 Oct 2004, 12:00:05 UTC I looked up this run on the upload server and didn't see anything glaringly obvious as to why it crashed. Like a lot of other "-5" errors, it's hard to pin down; my guess is that a file somewhere got corrupted and the model flipped out (even 1 bad file out of 300 can cause a crash). I didn't see one of the error text files, so maybe that was the one, and since the model couldn't make the error file, it crashed. Also check disk space: I've seen -5 when people run out of space and the model can't create any more files and just exits.
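The disk space check suggested above can be done from the operating system's own tools, but a small Python sketch may be handy if you want to check the drive that holds your BOINC directory. It uses the standard library's `shutil.disk_usage`. The thread does not say how much space a model needs, so the `required_mb` threshold is a placeholder you have to choose yourself, and the helper names are hypothetical.

```python
import shutil


def free_space_mb(path):
    """Return the free space, in megabytes, on the drive that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


def has_enough_space(path, required_mb):
    """Return True if at least `required_mb` megabytes are free.

    `required_mb` is an assumption supplied by the caller; the thread
    gives no figure for how much space the model actually needs.
    """
    return free_space_mb(path) >= required_mb
```

For example, `free_space_mb(r"C:\Program Files\BOINC")` would report the free space on the drive used in the original poster's setup, assuming BOINC is installed at the path shown in the stdout_um.txt output.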

old_user17283 Joined: 13 Sep 04 Posts: 3 Credit: 40,250 RAC: 0	Message 5116 - Posted: 6 Oct 2004, 19:28:07 UTC - in response to Message 5053. Thanks for the feedback. Hopefully inexplicable errors won't become the norm for me. Cheers