Message boards : Number crunching : HadCM3N - Error Messages on Completion
Author | Message |
---|---|
Joined: 4 Oct 09 Posts: 73 Credit: 7,242,427 RAC: 0 |
4 of my 5 HadCM3N full-resolution ocean models apparently "completed" earlier today: task IDs 12758267, 12740235, 12740230 and 12740229. Each uploaded its final trickle at time step 1,036,800. The first 3 have their status marked as "completed", but the 4th (12740229) is marked as "error while computing". All 4 tasks ran continuously 24/7, and none "failed" and then restored from backup, which is the usual reason for an error status that does not get corrected when the model eventually finishes. All have identical stderr messages. Link to the host's summary page - http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=1057969 My assumption is that all 4 are OK, so the scientists will get clean results; otherwise it was a waste of resources and time (they started 21 days ago)! All those stderr messages are, if irrelevant, confusing to say the least. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
If you're talking about the "can't delete ..." parts, then that was/is just a permissions problem near the start of the installation of the models, or possibly during the backup restore. They're all font files for the graphics displays. As for "a waste of resource and time", the messages to look at are right near the top of each page: Over Backups: Here |
Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Task 12740229, the one that failed, has "Signal 11 received, exiting..." at the end of its stderr. The others don't. The BOINC FAQ doesn't seem very helpful. Other pages that Google brought up suggest it could be a permissions problem, a buffer overflow bug, or even overheating! Signal 11 seems to be a catch-all. |
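For anyone wanting to check their own stderr output the same way, here is a rough sketch in Python. It only looks for the two kinds of line mentioned in this thread: the "Signal 11 received" line and the "can't delete" lines. The function name `classify_stderr` and the sample text are made up for illustration, and the exact wording of real stderr output may differ, so treat the matching strings as assumptions. On Linux, signal 11 is SIGSEGV (a segmentation fault), which is why it shows up for such a wide range of underlying causes.

```python
import signal

# On Linux, signal number 11 is SIGSEGV (segmentation fault).
SIGSEGV_NUMBER = int(signal.SIGSEGV)


def classify_stderr(text: str) -> str:
    """Roughly classify a task's stderr text.

    Hypothetical helper, not part of BOINC or CPDN. The matching
    strings are taken from the wording quoted in this thread.
    """
    lowered = text.lower()
    if "signal 11 received" in lowered:
        # The model process was killed by a fatal signal.
        return "crash"
    if "can't delete" in lowered:
        # Only the file-deletion complaints described above as a
        # permissions problem with font files.
        return "cosmetic"
    return "clean"


# Made-up sample text, not copied from a real result page.
sample_ok = "can't delete some_font_file\ncan't delete another_font_file\n"
sample_bad = sample_ok + "Signal 11 received, exiting...\n"

print(classify_stderr(sample_ok))
print(classify_stderr(sample_bad))
```

The check for the signal line comes first on purpose: a task that crashed will usually also contain the harmless "can't delete" lines, so those alone should not mask the crash.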
©2024 cpdn.org