So what the hell caused Error -161?
Author | Message |
---|---|
Joined: 29 Aug 04 Posts: 6 Credit: 49,394 RAC: 0 |
So after 44 hours of my first ever sulphur model I get error -161? Any particular reason why this should occur? 21/12/2005 05:21:32|climateprediction.net|Unrecoverable error for result sulphur_dp30_000639036_0 (<file_xfer_error> <file_name>sulphur_dp30_000639036_0_1.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_dp30_000639036_0_2.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_dp30_000639036_0_3.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_dp30_000639036_0_4.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_dp30_000639036_0_5.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error>) 21/12/2005 05:21:36|climateprediction.net|Deferring communication with project for 56 seconds |
Joined: 23 Feb 05 Posts: 55 Credit: 240,119 RAC: 0 |
"So after 44 hours of my first ever sulphur model I get error -161? Any particular reason why this should occur?" Off the top of my head, I would say it has to do with 'file transfer'. It looks like your model crashed for some reason, then tried to send the files but reported a failure in doing so. Either the files were not generated after the crash or they were corrupt. Count yourself lucky that this did not happen 44 hours before completion! |
Joined: 30 Aug 04 Posts: 77 Credit: 1,785,934 RAC: 0 |
I have also already lost a number of Sulphur models (using Linux BOINC V5.2.13). Three were from one host: process got signal 11 <stderr_txt> free(): invalid pointer 0xbffff8d8! = sigsegv - segmentation violation. It happened twice after 3,597.02s and 3,595.87s respectively, which is almost exactly at their first 60-minute project cycle. The third time it happened after 32,511.49s, which is again almost exactly the 9th cycle for CPDN in the multi-project environment. The fourth time (same host) it just stated process got signal 11 (no other stderr details) after 360,344.25s, which is (again) at cycle time, the 100h cycle in this case. Another host had one with process got signal 11 as well, after 35,364.90s (closing in on the 10h cycle). Since the times are almost always a multiple of the 1h project cycle, I'll have a close look at the systems. Both have run other projects flawlessly so far, and I never had specific problems with them (?). It's extremely annoying seeing those long-running models crash out, taking down a considerable amount of CPU time with them :( Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There is some hope for better times in Dave Frame's post [here](http://www.climateprediction.net/board/viewtopic.php?t=3412&start=15). |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
FalconFly, did these signal 11 crashes occur either 1. near a benchmark or 2. as the climateprediction WU was preempted for another project? If so, do you have "Leave application in memory when pre-empted" set to Yes or No? If No, try Yes and see if that helps. |
Joined: 30 Aug 04 Posts: 77 Credit: 1,785,934 RAC: 0 |
I think they all happened while being preempted. Due to bad experiences in the past (projects continuing to run despite being paused) with the "Keep Applications in Memory" setting, it is currently disabled. I'll go ahead and enable it for testing though; maybe it works now. (I lost another two models overnight, again on different systems.) Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
Joined: 30 Aug 04 Posts: 77 Credit: 1,785,934 RAC: 0 |
That really did the trick; I haven't had any such errors since changing that setting :) Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
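
For anyone wanting to check the timing pattern described in the thread, here is a minimal sketch that measures how far each reported crash time lies from the nearest multiple of the 1h project cycle. The crash times are the ones quoted above; the 3600-second cycle length and the helper name `nearest_cycle` are assumptions made for illustration, not anything taken from BOINC itself.

```python
CYCLE_SECONDS = 3600  # assumed: the 60-minute project switch interval mentioned in the thread

# Elapsed times (in seconds) at which the signal 11 crashes were reported
crash_times = [3597.02, 3595.87, 32511.49, 360344.25, 35364.90]


def nearest_cycle(elapsed, cycle=CYCLE_SECONDS):
    """Return (cycle index, offset in seconds) for the cycle boundary closest to `elapsed`."""
    n = round(elapsed / cycle)
    return n, elapsed - n * cycle


for t in crash_times:
    n, offset = nearest_cycle(t)
    # A small offset relative to 3600s suggests the crash coincided with a project switch
    print(f"{t:>12,.2f}s  nearest cycle {n:>3}  offset {offset:+9.2f}s")
```

The first two crashes land within a few seconds of the first boundary, while the later ones drift by a few minutes, which is at least consistent with the preemption explanation given in the replies.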