Questions and Answers : Macintosh : misconfigured BOINC crashing work units
Author | Message |
---|---|
Send message Joined: 20 Jun 05 Posts: 1 Credit: 2,463,117 RAC: 0 |
I just received this email and do not understand it. Way out of my league. Help? mike climateprediction.net notification: Dear mike armstrong Your machine (host # 878606) described below appears to have a misconfigured BOINC installation resulting in it crashing workunits. Would you please have a look at it? Sincerely, The climateprediction.net team This is the content of our database: ID: 878606 Created: 13 Jun 2008 8:29:05 UTC Venue: home Total credit: 111352.320967913 Average credit: 797.616802835029 Average update time: 14 Nov 2008 7:53:45 UTC IP address: 192.168.1.100 (same the last 395 times) Domain name: mike-armstrongs-imac.local Local Time = UTC +0 hours Number of CPUs: 2 CPU: GenuineIntel Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz [x86 Family 6 Model 15 Stepping 11] FP ops/sec: 2055598762.29746 Int ops/sec: 5435676450.93856 memory bandwidth: 1000000000 Operating System: Darwin 9.5.0 Memory: 1024 MB Cache: 976.56 KB Swap Space: 71421.82 MB Total Disk Space: 200.88 GB Free Disk Space: 69.5 GB Avg network bandwidth (upstream): 20872.354709 bytes/sec Avg network bandwidth (downstream): 95866.167839 bytes/sec Average turnaround: 0 days Number of RPCs: 975 Last RPC: 14 Nov 2008 0:32:49 UTC % of time client on: 95.725 % % of time host connected: -100 % % of time user active: 99.9233 % # of results today: 2 For further information and assistance with climateprediction.net go to |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
If you look at the list of tasks for your Mac you'll see that 37 tasks have crashed immediately since 27th October. Click on any of the task ID links and then click on the '+' button after stderr out and you'll see that they've all failed with the error Insufficient Memory/Stack Space Available! That's the error discussed here. The problem seems to be restricted to HADCM3 version 6.* tasks on Mac, so your best option until the cause is identified would be to change your project preferences to avoid HADCM3 models. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 29 Sep 08 Posts: 5 Credit: 4,330,352 RAC: 0 |
I got this email today as well. However, all my work units appear to be fine: scanning through the stderr_um files, they're all 0 bytes. I'm not sure what to do. Here's my client ID: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=924187 |
Send message Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
"I got this email today as well, however all my work units appear to be fine, scanning through the stderr_um files, they're all 0 bytes." 16 models downloaded to an 8-CPU computer does not sound like too many, especially when 8 of the models have credit granted. The email was supposed to have been sent to 80 troublesome hosts, but my first reaction is that you shouldn't have been sent the email in respect of that computer. But it could easily be me misunderstanding. I'll try and find out. Visit BOINC WIKI for help And join BOINC Synergy for all the news in one place. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Toby, I've looked at what your three computers are doing and they seem to be fine. I'd just set the 8-core machine to No new tasks for CPDN in the Projects tab of BOINC Manager. It has 16 models which will keep it busy for a while! Another moderator and I think you've received this email in error. (Mike who posted above did need the advice given.) We've asked an administrator to check how the server selects members for this email. Sorry about that - it does seem to be a mistake on the part of the project. Cpdn news |
Send message Joined: 29 Sep 08 Posts: 5 Credit: 4,330,352 RAC: 0 |
Hi Toby I did reset the project because I was getting: 08-Nov-2008 19:21:42 [climateprediction.net] Task hadsm3mh_kl2v_006005487_7 exited with zero status but no 'finished' file 08-Nov-2008 19:21:42 [climateprediction.net] If this happens repeatedly you may need to reset the project. So the first 8 tasks won't ever run. Can you reassign them from your end? Thanks for your help, and no problem about the mistake, we're all human. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
OK, that explains why you apparently have 16 tasks running. When the server updates that computer's tasks page, 8 of the models will be shown as aborted. I'd come back to ask you how you'd managed to get 16 models! You can see all this info for yourself by clicking on your name here on the forum then going through the links. The zero status message is usually benign. If you get it again, don't reset the project please. That causes you to lose every model you're running. To tell the truth, this message from BOINC is a terrible nuisance. I think it should instead advise members who get the message repeatedly to ask for advice on the project forum. If you go to the CPDN READMEs linked in my signature and in the collection about Crashes and Problems look at item #6 by MikeMars, you'll find info about this and all the other common things that can go wrong, and what we need to do to keep these long models going to the end. The criterion for the email turns out to be more than 10 model downloads in a week. Anyway, if the email leads you to the project READMEs it will have been worthwhile. Cpdn news |
Send message Joined: 29 Sep 08 Posts: 5 Credit: 4,330,352 RAC: 0 |
snip It got 8, then I had the nuisance message. Since that computer is brand new, I thought it was legit, so I did the reset. Then I got another 8; I assume that's the default for an 8-core machine. I agree with you that it implies you have a serious problem when it's probably nothing to worry about. I have read the READMEs, so probably some added value :) I suppose >10 in a week is a good indication of problems for 99% of people; not many people have an 8-core machine at home. Thanks for all your help, I know where to look before being rash next time. |
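
For anyone wondering how a healthy 8-core host ends up on the list: the only criterion stated in this thread is "more than 10 model downloads in a week". The project's actual server query is not shown anywhere here, so the following is just a rough sketch of that rule. The function name, the constants and the example timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: threshold and window come from the moderator's description
# ("more than 10 model downloads in a week"). The real server logic is not shown.
DOWNLOAD_THRESHOLD = 10
WINDOW = timedelta(days=7)


def exceeds_download_criterion(download_times, now):
    """Return True if more than DOWNLOAD_THRESHOLD downloads fall in the WINDOW ending at `now`."""
    recent = [t for t in download_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) > DOWNLOAD_THRESHOLD


# Hypothetical timeline for an 8-core host: 8 models, a project reset, then 8 more.
now = datetime(2008, 11, 14)
first_batch = [datetime(2008, 11, 8, 12)] * 8
second_batch = [datetime(2008, 11, 8, 20)] * 8

flagged = exceeds_download_criterion(first_batch + second_batch, now)  # 16 downloads
not_flagged = exceeds_download_criterion(first_batch, now)             # 8 downloads
```

Under these assumptions a single fill of 8 models stays below the threshold, while one reset on the same machine pushes it to 16 and trips the rule, which matches what happened above.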
©2024 cpdn.org