climateprediction.net (CPDN) home page
Thread 'CM3 Errors At Start'

Questions and Answers : Macintosh : CM3 Errors At Start

old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35234 - Posted: 14 Oct 2008, 0:00:02 UTC

I can no longer run CM3 models. They error out immediately with these messages:
Sun Oct 12 17:05:15 2008|climateprediction.net|Starting hadcm3ivolc_l1i8_2000_80_16001512_3
Sun Oct 12 17:05:15 2008|climateprediction.net|Starting task hadcm3ivolc_l1i8_2000_80_16001512_3 using hadcm3i version 602
Sun Oct 12 17:06:06 2008|climateprediction.net|Computation for task hadcm3ivolc_l1i8_2000_80_16001512_3 finished
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_1.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_2.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_3.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_4.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_5.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_6.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_7.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent
Sun Oct 12 17:06:06 2008|climateprediction.net|Output file hadcm3ivolc_l1i8_2000_80_16001512_3_8.zip for task hadcm3ivolc_l1i8_2000_80_16001512_3 absent

I previously had no problems. I have tried restarting BOINC Manager, resetting the CPDN project, and rolling back to BOINC 5.10.45, all with no luck.
SM3 models run fine. Any suggestions?

P.S. There is a corresponding .zip file present in the data directory, just without _1 - _7
ID: 35234
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 35235 - Posted: 14 Oct 2008, 7:56:02 UTC - in response to Message 35234.  
Last modified: 14 Oct 2008, 8:24:15 UTC

I previously had no problems. I have tried restarting BOINC Manager, resetting the CPDN project, and rolling back to BOINC 5.10.45, all with no luck.
SM3 models run fine. Any suggestions?

Your recent tasks are all generating "Insufficient Memory/Stack Space Available!" errors (e.g. 8139263; click on the '+' by stderr out). Instructions to fix the Mac shared memory problem are available here. It's possible that the problem is with shmseg (the maximum number of shared memory segments per process), as v6 applications require an additional shared memory segment for the graphics (which now run as a separate program).

You might like to warn your teammates, as some of them seem to have hit the same problem.
P.S. There is a corresponding .zip file present in the data directory, just without _1 - _7

That's the model parameters file downloaded from the server.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 35235
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35237 - Posted: 14 Oct 2008, 14:16:45 UTC

I made those changes a long time ago to get CPDN to run.
(I just tried increasing some of them further.)

Last login: Tue Oct 14 06:58:12 on console
BdyGlv-Pro8:~ Ed$ sysctl -A | grep shm
kern.sysv.shmall: 4096
kern.sysv.shmseg: 64 <---- I doubled this one
kern.sysv.shmmni: 128
kern.sysv.shmmin: 1
kern.sysv.shmmax: 16777216

Tried a new CM3, and it died the same way.

<core_client_version>6.2.18</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
Insufficient Memory/Stack Space Available!
called boinc_finish
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=506, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/yafbg.ihist
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/yafbg.namelists
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/dataout/atmos_restart.day
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/dataout/ocean_restart.day
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/atmos_dump.start
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/ocean_dump.start
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/specsw
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3ivolc_l2f5_2000_80_16001739/jobs/speclw
Insufficient Memory/Stack Space Available!
called boinc_finish

Here's my system info as well:

Hardware Overview:
Model Name: Mac Pro
Model Identifier: MacPro3,1
Processor Name: Quad-Core Intel Xeon
Processor Speed: 2.8 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 4 GB
Bus Speed: 1.6 GHz
Boot ROM Version: MP31.006C.B05
SMC Version: 1.25f4
Serial Number: G88081B6XYK
OS X 10.5.5

Please, someone help!




ID: 35237
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 35238 - Posted: 14 Oct 2008, 16:24:29 UTC

I'm not a Mac guy, but in the example on the help webpage that Thyme linked to, he quadrupled the number of shared memory segments. To do that he multiplied every default value, except kern.sysv.shmmin, by 4. So if you double one of the parameters, maybe it would be good to try doubling all the others as well, except kern.sysv.shmmin?

Just a shot in the dark.
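
For anyone who wants to try this, the "multiply everything except kern.sysv.shmmin" idea can be sketched as a small shell function. This is only an illustration: scale_shm and its factor argument are names made up for this example, it reads "name: value" lines of the kind printed by "sysctl -A | grep shm", and it prints "name=value" lines. The sample numbers are the ones posted earlier in this thread, not recommended settings.

```shell
#!/bin/sh
# Hypothetical helper: multiply every kern.sysv.shm* value by a factor,
# leaving kern.sysv.shmmin untouched. Reads "name: value" lines on stdin
# and writes "name=value" lines on stdout.
scale_shm() {
    factor="$1"
    awk -v f="$factor" -F': *' '
        /^kern\.sysv\.shm/ {
            value = $2 + 0          # numeric prefix only, ignores trailing notes
            if ($1 != "kern.sysv.shmmin") value = value * f
            printf "%s=%d\n", $1, value
        }'
}

# Example: double the values of the kind shown earlier in this thread.
printf '%s\n' \
    'kern.sysv.shmall: 4096' \
    'kern.sysv.shmseg: 64' \
    'kern.sysv.shmmni: 128' \
    'kern.sysv.shmmin: 1' \
    'kern.sysv.shmmax: 16777216' | scale_shm 2
```

The output is only a list of candidate values; how and where they are applied (and whether a reboot is needed) depends on the system.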
ID: 35238
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 35239 - Posted: 14 Oct 2008, 20:39:46 UTC
Last modified: 14 Oct 2008, 21:12:33 UTC

Is there something like "ulimit" on Darwin?

If so, try to set "ulimit unlimited" before you start BOINC.


(Maybe you could post the current limits too, i.e. before you've set "unlimited".)

P.S.: ulimit without options queries or sets only the file size limit, so you need to specify "-s", "-d", "-l" or similar (the memory-related options), depending on what Darwin offers.

P.P.S.: I found the manual entry for "ulimit"; it's a shell builtin (sh, not csh). So, for example, "ulimit -a" shows all limits.
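
As a rough sketch of the queries mentioned above, the following prints the two soft limits most likely to matter for a memory or stack error. The "-s" and "-d" options are the standard bash ones; show_mem_limits is just a name made up for this example.

```shell
#!/bin/bash
# Print the soft limits most relevant to an out-of-memory/stack error.
# Each value is in kilobytes, or the word "unlimited".
show_mem_limits() {
    echo "stack size (kbytes): $(ulimit -S -s)"
    echo "data seg size (kbytes): $(ulimit -S -d)"
}

show_mem_limits
# "ulimit -a" lists every limit the shell knows about.
```

Run it as the same user that runs BOINC, since limits can differ between accounts.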
ID: 35239
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35242 - Posted: 15 Oct 2008, 0:13:01 UTC

I'm starting to think the shared memory segments aren't the problem.
I have the quadrupled settings, which should be more than adequate.
CM3s used to run fine when their names started like this: "hadcm3istd....".
Now they start like this: "hadcm3ivolc...".
These are different and failing. Nothing else on my end has changed.
If anyone is running these models successfully, please let me know what you did.


ID: 35242
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35243 - Posted: 15 Oct 2008, 0:23:00 UTC

I see that over on the main boards (which I can't log into or post on for some reason)
they are saying the same thing about memory segments. I have the same values
as the example and it still crashes. How high dare I raise these settings?
ID: 35243
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 35245 - Posted: 15 Oct 2008, 1:25:51 UTC
Last modified: 15 Oct 2008, 1:28:41 UTC

The discussion board requires a separate signup; it has no access to the BOINC signup data. The signup link is at the top of the page there ("Register").

Have you tried "ulimit -a" from a bash or sh command line (using the user ID that runs BOINC)? What does it say?


Until a solution is found, you could choose a different model type here; just uncheck the checkbox near "Application UK Met Office HADCM3".
ID: 35245
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35249 - Posted: 15 Oct 2008, 11:04:34 UTC

Here are my current ulimit results.

BdyGlv-Pro8:~ Ed$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) 6144
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 266
virtual memory (kbytes, -v) unlimited


ID: 35249
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 35251 - Posted: 15 Oct 2008, 12:03:48 UTC - in response to Message 35249.  
Last modified: 15 Oct 2008, 12:04:40 UTC

Here are my current ulimit results.

...
data seg size (kbytes, -d) 6144
...
stack size (kbytes, -s) 8192
...


I'm not familiar with Macs and I have no idea how much of which resource those volcanic models need, but going by the error message, the next thing to try would be

ulimit -s 16384
and/or
ulimit -d 16384

in the startup settings (.profile ?) of the user that runs the BOINC client

(Sorry, it's all a bit experimental as long as no Mac people jump in.)
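
Since a ulimit change only applies to the shell it is run in and to processes started from that shell, one way to experiment before touching any startup file is a subshell wrapper. The following is a minimal sketch under that assumption; run_with_limit is a made-up name, and the command to run is passed in as arguments because the path of the BOINC client is not given in this thread.

```shell
#!/bin/bash
# Hypothetical wrapper: raise one soft limit (value in kbytes) inside a
# subshell, then run the given command with that limit in effect.
# The parentheses keep the change from leaking into the calling shell.
run_with_limit() {
    local opt="$1" kbytes="$2"
    shift 2
    (
        # A soft limit cannot exceed the hard limit; warn instead of aborting.
        ulimit -S "-$opt" "$kbytes" 2>/dev/null ||
            echo "could not set limit -$opt to $kbytes" >&2
        "$@"
    )
}

# Example: show the stack limit a child process would actually see.
run_with_limit s 16384 sh -c 'ulimit -s'
```

If the printed value is still the old one, the hard limit is probably lower than the requested value; "ulimit -H -s" shows it.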
ID: 35251
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35305 - Posted: 18 Oct 2008, 14:26:04 UTC

I downloaded too many SM3 WUs to get more work right now.
Still waiting to try the ulimit changes.

The registration process is broken on the other boards, and the
moderators won't respond to my requests for help.

Can someone try getting this thread going over there to try
and stir up some other Mac users, or project people?
The stated fix for this problem isn't working. The app is clearly
broken and no one seems to care..... Thanks
ID: 35305
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35306 - Posted: 18 Oct 2008, 15:31:46 UTC
Last modified: 18 Oct 2008, 15:32:35 UTC

Hi BdyGlv

I (and I expect several other CPDN moderators) have been following this thread. I haven't responded and tried to help because I know nothing about Macs. Unfortunately I don't think any of the CPDN mods are Mac specialists.

You must have reached your daily model download quota, which means you'll have to wait until midnight (your time, I think) to get a new quota. See what happens when you try Ananas's suggestion about the ulimit.

Which other forum are you referring to? I don't think you mean the CPDN independent forum. Even if you're already registered on it, don't start a new thread there because the same people would probably be available to help here as there. It's best to keep the whole discussion in one place.

Did you mean you couldn't register on the Boinc_dev forum?
Cpdn news
ID: 35306
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35308 - Posted: 18 Oct 2008, 16:11:02 UTC
Last modified: 18 Oct 2008, 16:14:00 UTC

Just a suggestion so that, if the ulimit tactic isn't successful, you don't use up your entire task quota for Sunday at once in an orgy of failures.

You must now have no running CPDN models. Your daily quota is one task per core and your computer has 8 cores. You could temporarily limit the number of new model downloads to 2. That should be enough to see whether the ulimit idea works. If it doesn't work you'd still be able to try 6 more model downloads on Sunday.

Boinc Manager > Advanced menu > Preferences > Processor usage tab > On multiprocessor systems use at most > edit the number (which probably says 7 or 8) to 2 > Click OK > Close that window.

If Ananas thinks I've made a bad suggestion I'm sure he'll tell us!
Cpdn news
ID: 35308
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35309 - Posted: 18 Oct 2008, 16:26:42 UTC

This is the error message I get when trying to register
at the CPDN independent forum. I've tried a couple of my e-mail addresses,
and even tried using Internet Explorer on a PC at work in case it didn't like Safari
(the Mac's browser).

Could not find email template file :: textual_confirmation

DEBUG MODE

Line : 111
File : emailer.php


Also, I don't know how to make the ulimit change permanent.
When I do it from a terminal window, it reverts to the default
when I close the shell. Unix is not my strong suit. Someone else had better try this one.
Thanks again .... Ed



ID: 35309
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35311 - Posted: 18 Oct 2008, 17:03:11 UTC

Try registering again on the independent forum. I've just registered there using a different name and it worked. I created a second new account using Safari and it also worked. You won't see my new accounts in the memberlist though because I've cancelled them.
Cpdn news
ID: 35311
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35312 - Posted: 18 Oct 2008, 18:55:51 UTC

I'm still getting that same error message.
I've used my Yahoo and Comcast email addresses.
I can't figure out what I'm doing wrong.
It's just not my day....
ID: 35312
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 35314 - Posted: 18 Oct 2008, 20:58:44 UTC

Make sure you give the correct answer to the question under the confirmation code. I just tested and got the same error when I gave the wrong answer or left the box empty.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 35314
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 35318 - Posted: 19 Oct 2008, 2:47:00 UTC - in response to Message 35309.  
Last modified: 19 Oct 2008, 2:57:14 UTC

... Also, I don't know how to make the ulimit change permanent.
When I do it from a terminal window, it reverts to the default
when I close the shell. ...


If it works like Unix, there should be a file named ".profile" (without the quotes, but including the leading period) in the user's home directory. This file usually contains environment and shell settings that take effect every time you start a login shell.

The leading period might make that file invisible in your GUI, but you can see it with the "a" option of the "ls" command (e.g. "ls -la").

But it might work differently on a Mac.
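
A minimal sketch of adding such a setting to a startup file without duplicating it on repeated runs is shown below. add_line_once is a made-up helper and the demo writes to a scratch file rather than a real profile. Whether "~/.profile" is read at all depends on the shell and on how BOINC is started: bash prefers "~/.bash_profile" over "~/.profile" for login shells, and a client launched from the BOINC Manager GUI rather than from a terminal may not pick up limits set in a shell startup file.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a startup file only if it is not
# already there, so running it twice does not duplicate the setting.
add_line_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example against a scratch file; point it at "$HOME/.profile" for real use.
demo_file="${TMPDIR:-/tmp}/profile_demo.$$"
add_line_once "$demo_file" 'ulimit -s 16384'
add_line_once "$demo_file" 'ulimit -s 16384'
cat "$demo_file"    # the line appears once
rm -f "$demo_file"
```

Open a new terminal window afterwards and check with "ulimit -a" that the change was picked up.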


There are Mac-specific BOINC teams, by the way; one that comes to mind is MacNN, so maybe it would be a good idea to join their forum. I guess they will do better than us "mac-illiterate" moderators ;-)

Edit: I found several more Mac BOINC groups using BOINCstats' team search.
ID: 35318
Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 35319 - Posted: 19 Oct 2008, 3:09:46 UTC
Last modified: 19 Oct 2008, 3:24:18 UTC

Here is one more host that shows the same error message. It is now running 4 HadSM3 models, so the problem appears to be specific to those HadCM3volc models.


P.S.: I did some more searching and found much higher tweaking values for kern.sysv.shmall; some even recommend 65536 there.


One more idea (just in case ...): did you reboot your machine after changing those kernel parameters? A reboot is required before the new values can take effect.
ID: 35319
old_user183856
Joined: 22 Apr 06
Posts: 13
Credit: 1,033,659
RAC: 0
Message 35355 - Posted: 23 Oct 2008, 23:52:15 UTC
Last modified: 23 Oct 2008, 23:56:20 UTC

Finally got another CM3 to try out, and it still crashes.
Yes, I did a reboot after editing sysctl; here are my current values.
kern.sysv.shmall: 65536
kern.sysv.shmseg: 32
kern.sysv.shmmni: 128
kern.sysv.shmmin: 1
kern.sysv.shmmax: 16777216
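
As a quick arithmetic check on values like these, the sketch below converts them to megabytes. It assumes kern.sysv.shmall is counted in 4096-byte pages and kern.sysv.shmmax in bytes, which is the usual interpretation on Mac OS X but is not confirmed anywhere in this thread; shm_summary and PAGE_SIZE are names made up for the example.

```shell
#!/bin/sh
# Sketch: convert shmall (pages) to bytes and show it next to shmmax (bytes).
# PAGE_SIZE=4096 is an assumption; "getconf PAGESIZE" reports the real value.
PAGE_SIZE=4096

shm_summary() {
    shmall_pages="$1"
    shmmax_bytes="$2"
    total_bytes=$((shmall_pages * PAGE_SIZE))
    echo "total shared memory: $((total_bytes / 1048576)) MiB"
    echo "largest single segment: $((shmmax_bytes / 1048576)) MiB"
}

# Values as posted above.
shm_summary 65536 16777216
```

Under that assumption, raising shmall on its own increases the total but leaves the largest single segment unchanged.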

Here are my BOINC startup messages as well:
Thu Oct 23 16:13:38 2008||Starting BOINC client version 6.2.18 for x86_64-apple-darwin
Thu Oct 23 16:13:38 2008||log flags: task, file_xfer, sched_ops
Thu Oct 23 16:13:39 2008||Libraries: libcurl/7.18.0 OpenSSL/0.9.7l zlib/1.2.3 c-ares/1.5.1
Thu Oct 23 16:13:39 2008||Data directory: /Library/Application Support/BOINC Data
Thu Oct 23 16:13:39 2008||Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5462 @ 2.80GHz [x86 Family 6 Model 23 Stepping 6]
Thu Oct 23 16:13:39 2008||Processor features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM SSE3 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1
Thu Oct 23 16:13:39 2008||OS: Darwin: 9.5.0
Thu Oct 23 16:13:39 2008||Memory: 4.00 GB physical, 178.93 GB virtual
Thu Oct 23 16:13:39 2008||Disk: 265.88 GB total, 178.69 GB free
Thu Oct 23 16:13:39 2008||Local time is UTC -7 hours
Thu Oct 23 16:13:39 2008||No coprocessors
Thu Oct 23 16:13:40 2008|CPDN Seasonal Attribution Project|URL: http://attribution.cpdn.org/; Computer ID: 30626; location: (none); project prefs: default
Thu Oct 23 16:13:40 2008|rosetta@home|URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 761509; location: home; project prefs: default
Thu Oct 23 16:13:40 2008|climateprediction.net|URL: http://climateprediction.net/; Computer ID: 847241; location: home; project prefs: default
Thu Oct 23 16:13:40 2008|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 1131179; location: home; project prefs: default
Thu Oct 23 16:13:40 2008|Milkyway@home|URL: http://milkyway.cs.rpi.edu/milkyway/; Computer ID: 8516; location: home; project prefs: default
Thu Oct 23 16:13:40 2008|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 4249000; location: home; project prefs: default
Thu Oct 23 16:13:40 2008|Artificial Intelligence System|URL: http://www.intelligencerealm.com/aisystem/; Computer ID: 9406; location: (none); project prefs: default
Thu Oct 23 16:13:40 2008||General prefs: from http://bam.boincstats.com/ (last modified 18-Oct-2008 15:07:10)
Thu Oct 23 16:13:40 2008||Computer location: home
Thu Oct 23 16:13:40 2008||General prefs: no separate prefs for home; using your defaults
Thu Oct 23 16:13:40 2008||Reading preferences override file
Thu Oct 23 16:13:40 2008||Preferences limit memory usage when active to 2048.00MB
Thu Oct 23 16:13:40 2008||Preferences limit memory usage when idle to 3276.80MB
Thu Oct 23 16:13:40 2008||Preferences limit disk usage to 9.31GB
ID: 35355