Message boards : Number crunching : Error while computing???
Author | Message |
---|---|
Joined: 16 Jan 10 Posts: 1084 Credit: 7,824,485 RAC: 4,956 |
Got segment violation errors on tasks from batches 777 and 780. Both appear to have failed after the 9th zip file, as zips from 10 onwards were not generated. I have finished multiple 777/780 tasks, so maybe there are some duff ones or there was some local difficulty; those batches look fine, as far as I know. |
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
After aborting the 781s I received three more of them. Hopefully that will be it. Sarah is closing this batch with the abort function now. At least one moderator has a couple still running and so will be able to let Sarah know that it has worked. |
Joined: 13 Jul 18 Posts: 38 Credit: 62,933,508 RAC: 84,702 |
Some crunchers are unknowingly crashing their WUs by doing too many Suspends. |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
I just got a 781 and aborted it. |
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"I just got a 781 and aborted it." I think the abort signal may have now gone out, as the number of tasks in progress has just dropped by about 8,000: https://www.cpdn.org/cpdnboinc/server_status.php |
Joined: 15 Jan 11 Posts: 175 Credit: 6,242,691 RAC: 699 |
"Some crunchers are unknowingly crashing their WUs by doing too many Suspends." In my experience, suspending is not a problem if the option "Leave non-GPU tasks in memory while suspended" is set in the account preferences. I've never had a problem with that setting, and I suspend one computer quite regularly. With this option set, the tasks also survive shutdown. I guess it's a case of how to get the message across. |
Joined: 9 Dec 05 Posts: 116 Credit: 12,547,934 RAC: 2,738 |
"I just got a 781 and aborted it." The number of different task types listed as in progress was reduced from 9 to just 4. See the graphs: http://ob.cakebox.net/cpdn_status/server_status.html Maybe they just cleaned the database of obsolete applications. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, that's something that was talked about last year. About time it happened. And I think that the link to that page needs renaming on the Main page, from Server status to Project status, seeing as how the servers haven't been listed there for several years. However, batch 781 is now listed as closed, so that should put a stop to those problems, provided everyone's computers contact the server to get the kill message. |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,824,485 RAC: 4,956 |
My two running 781 models have uploaded Zips and not been killed, so I've now aborted them both ... |
Joined: 28 Dec 17 Posts: 18 Credit: 1,097,261 RAC: 147 |
Hi all. What does this error message mean? This WU was functioning perfectly for a long time and suddenly ended with a computation error. "Signal 11 received: Segment violation Signal 11 received: Software termination signal from kill Signal 11 received: Abnormal termination triggered by abort call Signal 11 received, exiting... 15:24:03 (9176): called boinc_finish(193) Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8928, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6988, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_ain::Monitor... 15:25:09 (6988): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_11.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_12.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_13.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_14.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_15.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_16.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_restart.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]>" |
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Seg faults are a problem with the task rather than your computer. |
Joined: 28 Dec 17 Posts: 18 Credit: 1,097,261 RAC: 147 |
Thanks for the quick reply. Glad to hear the computer seems to be functioning properly in BOINC. |
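
For anyone who wants to check a failed task's stderr for the same pattern, here is a rough Python sketch. It looks for the "Signal 11 received" line and lists each `<file_xfer_error>` entry with its file name and error code. The sample text is a shortened copy of the log quoted above. The function names `failed_uploads` and `has_segfault` are made up for illustration and are not part of BOINC or CPDN. The regular expression assumes the entries look exactly like the ones in that post.

```python
import re

# Shortened excerpt of the <message> block quoted in the thread
# (two of the seven entries), used here only as sample input.
SAMPLE = (
    "Signal 11 received: Segment violation "
    "<message> upload failure: "
    "<file_xfer_error> "
    "<file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_11.zip</file_name> "
    "<error_code>-240 (stat() failed)</error_code> "
    "</file_xfer_error> "
    "<file_xfer_error> "
    "<file_name>wah2_safr50_a0ax_201412_16_779_011704417_1_r358526244_restart.zip</file_name> "
    "<error_code>-240 (stat() failed)</error_code> "
    "</file_xfer_error> </message>"
)

# One <file_xfer_error> entry: file name, numeric code, and the text in brackets.
# Assumption: entries are formatted as in the log quoted in this thread.
ENTRY = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>\s*"
    r"</file_xfer_error>"
)


def failed_uploads(text):
    """Return a list of (file_name, error_code, reason) tuples found in text."""
    return [
        (m.group("name"), int(m.group("code")), m.group("reason"))
        for m in ENTRY.finditer(text)
    ]


def has_segfault(text):
    """True if the stderr text contains the 'Signal 11 received' marker."""
    return "Signal 11 received" in text


if __name__ == "__main__":
    if has_segfault(SAMPLE):
        print("Model reported signal 11 (segment violation)")
    for name, code, reason in failed_uploads(SAMPLE):
        print(f"{name}: error {code} ({reason})")
```

In the log above, every listed zip (11 to 16 plus the restart zip) carries the same "stat() failed" code. That fits the earlier observation in this thread that zips after a certain point were never generated once the model crashed, so there was nothing for the client to upload.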