climateprediction.net (CPDN) home page
Thread 'process got signal 11, whats that?'

Helmer Bryd

Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 12073 - Posted: 24 Apr 2005, 19:06:01 UTC
Last modified: 24 Apr 2005, 19:11:35 UTC

The model crashed with the output below. Does anyone know what it means?
This is the third time I have run it from backups:

2527_200120886 - PH 2 TS 259247 - 01/12/1840 23:30 - H:M:S=0308:54:01 AVG= 2.14 DLT= 1.00
2527_200120886 - PH 2 TS 259248 - 02/12/1840 00:00 - H:M:S=0308:54:02 AVG= 2.14 DLT= 1.00
Phase over, going into post_processing()
In pre_initialise_phase (part 1 of 3)
In initialise_phase (part 2 of 3)
Calculating global means for files .pa|.x2|.nc
Calculating regional means for .pa|.x2|.nc
Calculating global means for files .pd|.x2|.nc
Calculating regional means for .pd|.x2|.nc
Calculating global means for files .pe|.x2|.nc
Calculating regional means for .pe|.x2|.nc
2005-04-24 21:09:55 [climateprediction.net] Unrecoverable error for result 2527_200120886_1 (process got signal 11)
2005-04-24 21:09:55 [climateprediction.net] Unrecoverable error for result 2527_200120886_1 (process got signal 11)
2005-04-24 21:09:55 [climateprediction.net] Deferring communication with project for 1 days, 4 hours, 19 minutes, and 30 seconds
2005-04-24 21:09:55 [climateprediction.net] Deferring communication with project for 1 days, 4 hours, 19 minutes, and 30 seconds
2005-04-24 21:09:55 [climateprediction.net] Computation for result 2527_200120886 finished

(slab 4.04 boinc 4.13)
Ananas
Volunteer moderator

Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 12076 - Posted: 24 Apr 2005, 22:39:38 UTC
Last modified: 24 Apr 2005, 23:03:12 UTC

Signal 11 is SIGSEGV, a segmentation violation. You can look up the meaning of a signal number with the command

kill -l

which lists the signal names. A SIGSEGV is usually a problem in the program, not your PC's fault.
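
For anyone who prefers a script to reading the kill -l table, here is a minimal Python sketch of the same lookup. It pulls the signal number out of the BOINC log line quoted earlier in this thread and asks the standard signal module for its name. The function name signal_name_from_log is made up for this example, and the regular expression only assumes the "process got signal N" wording seen in that log.

```python
import re
import signal

# Log line copied from the BOINC output quoted earlier in this thread.
log_line = ("2005-04-24 21:09:55 [climateprediction.net] Unrecoverable error "
            "for result 2527_200120886_1 (process got signal 11)")


def signal_name_from_log(line):
    """Return the symbolic name of the signal mentioned in a log line, or None.

    Hypothetical helper for illustration; it only understands the
    'process got signal N' wording shown in the log above.
    """
    match = re.search(r"process got signal (\d+)", line)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        # signal.Signals maps a signal number to its enum member on this platform.
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"


print(signal_name_from_log(log_line))
```

On a Linux box this prints SIGSEGV for the line above. Signal numbers other than a few classic ones can differ between platforms, so treat the result as valid only for the machine the script runs on.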


But I have seen the same kind of crash myself: on one of my machines it happened every time the model tried to create the upload data for trickle 24. Mine was on Win2k, but it is basically the same type of error :-/

If yours created a file called "core" when it crashed, that would be a good chance to track down the bug. The core dump contains all kinds of information about the crashed process, down to source-code level (provided the symbol information has not been stripped, that is).
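
Before digging into a file named "core", it can be worth confirming that it really is a core dump. The Python sketch below is a rough check, assuming the file is in ELF format as on Linux: it reads the first 18 bytes and tests whether the ELF header's type field is ET_CORE (value 4 in the ELF specification). The function name is_elf_core is invented for this example, and the check says nothing about whether symbol information is available.

```python
import struct

ET_CORE = 4  # e_type value that marks a core file in the ELF specification


def is_elf_core(path):
    """Return True if the file at `path` starts with an ELF header of type ET_CORE.

    Hypothetical helper for illustration; it inspects only the header,
    not the contents of the dump.
    """
    with open(path, "rb") as handle:
        header = handle.read(18)  # e_ident (16 bytes) followed by e_type (2 bytes)
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack(byte_order + "H", header[16:18])
    return e_type == ET_CORE
```

Calling is_elf_core("core") from the directory where the model crashed should return True for a genuine core dump and False for anything else that happens to carry that name.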
Helmer Bryd

Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 12084 - Posted: 25 Apr 2005, 7:10:11 UTC

Hi Ananas:-)

Yeah, I have a core dump; it's 511.8 MB.
This box is an A64 running 64-bit Fedora Core 3. It has problems with SETI Classic too.
Maybe I should try another kernel; it's 2.6.10-1.770-FC3 now.

Here is what stderr_um.txt contains:

forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
OPEN: File dataout/2527ba.da40b70 Created on Unit 22
OPEN: File dataout/2527ba.da40ba0 Created on Unit 22
OPEN: File dataout/2527ba.da40bd0 Created on Unit 22
OPEN: File dataout/2527ba.da40bg0 Created on Unit 22
OPEN: File dataout/2527ba.da40bj0 Created on Unit 22
OPEN: File dataout/2527ba.da40bm0 Created on Unit 22
OPEN: File dataout/2527ba.da40bp0 Created on Unit 22
OPEN: File dataout/2527ba.da40bs0 Created on Unit 22
OPEN: File dataout/2527ba.da40c10 Created on Unit 22
CLOSE: WARNING: Unit 60 Not Opened
OPEN: File dataout/2527ba.pa41c10 Created on Unit 60
CLOSE: WARNING: Unit 63 Not Opened
OPEN: File dataout/2527ba.pd41c10 Created on Unit 63
CLOSE: WARNING: Unit 64 Not Opened
OPEN: File dataout/2527ba.pe41c10 Created on Unit 64
CLOSE: WARNING: Unit 65 Not Opened
OPEN: File dataout/2527ba.pf41c10 Created on Unit 65
CLOSE: WARNING: Unit 66 Not Opened
OPEN: File dataout/2527ba.pg41c10 Created on Unit 66
CLOSE: WARNING: Unit 67 Not Opened
OPEN: File dataout/2527ba.ph41c10 Created on Unit 67

Thanks a lot:)
/me has to rush now, back on Friday

Helmer Bryd

Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 12163 - Posted: 30 Apr 2005, 11:58:31 UTC
Last modified: 30 Apr 2005, 12:11:03 UTC

I tried it a fourth time with the box severely downclocked.
Same problems.
So I reset the project (-reset_project).

It also crashed on the alien hunting (SETI Classic), but so did two other boxen at the same time, and they had never crashed with the Linux 3.03 version before, sooo...

Well... you know you're alive :-)
