Message boards : Number crunching : hadcm3n failed at 1%
Author | Message |
---|---|
Joined: 18 Aug 12 Posts: 3 Credit: 0 RAC: 0 |
Is hadcm3n compatible with a Mac mini with OS X 10.5.8 and an Intel Core 2 Duo? I can't find any error message, but I think the task I was running had a computation error at ~1%. The only trace I can find is in /Library/Application Support/BOINC Data/stdoutdae.txt: 18-Sep-2013 08:23:31 [climateprediction.net] Restarting task hadcm3n_83eu_1980_40_008462361_1 using hadcm3n version 607 in slot 0 18-Sep-2013 08:23:35 [climateprediction.net] Computation for task hadcm3n_83eu_1980_40_008462361_1 finished 18-Sep-2013 08:23:35 [climateprediction.net] Output file hadcm3n_83eu_1980_40_008462361_1_1.zip for task hadcm3n_83eu_1980_40_008462361_1 absent 18-Sep-2013 08:23:35 [climateprediction.net] Output file hadcm3n_83eu_1980_40_008462361_1_2.zip for task hadcm3n_83eu_1980_40_008462361_1 absent 18-Sep-2013 08:23:35 [climateprediction.net] Output file hadcm3n_83eu_1980_40_008462361_1_3.zip for task hadcm3n_83eu_1980_40_008462361_1 absent 18-Sep-2013 08:23:35 [climateprediction.net] Output file hadcm3n_83eu_1980_40_008462361_1_4.zip for task hadcm3n_83eu_1980_40_008462361_1 absent Is there a way that I can find out what happened to the process? This is not the first time a task from climateprediction.net has failed on this computer, and they have always failed before the first trickle. PrimeGrid and Constellation work fine for me. Asteroids fails immediately. Any educated guesses as to whether the problem is with the task, or with my computer? Should I give up running Climateprediction.net tasks on this computer? Thanks for your help. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Jane, Your computers are hidden so we can't look at the tasks page, and the stderr listing on it. If you could link to that task/result or the computer, we could look at it in more depth. |
Joined: 18 Aug 12 Posts: 3 Credit: 0 RAC: 0 |
Jane, Sorry about that. This is a link to the tasks on this computer: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1288963 Jane |
Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Hi Jane. Code 193 is a bit of a catch-all error. If you have upgraded BOINC from an older version, the upgrade may have caused a permissions problem. Probably not that, though. The problem may be caused by not having selected the processing option "Leave tasks in memory while suspended". (BOINC's default is unselected, but that doesn't work well for CPDN.) It's also best to set the limit on CPU use by other programs quite high, i.e. to not suspend BOINC too frequently. (The work runs at low priority and Macs are good at prioritising work, so leaving BOINC running mostly has no effect on other work. The exceptions are recording sound or editing movies, and some games.) Both of those options are in the "Computing preferences..." menu option, available from BOINC Manager's "advanced" view. It's also best to exclude the BOINC folder from backups, as the CPDN programs are "touchy" about other programs trying to access their files. The Mac section of this Board may be a source of other things to try if those don't fix the problem, and it has a post detailing how to fix the permission problem. |
Joined: 18 Aug 12 Posts: 3 Credit: 0 RAC: 0 |
Greg, Thank you for the advice. I did have "Leave tasks in memory while suspended?" checked, but I will raise "Suspend work if CPU usage is above" to 75%. My computer tends to get hot, so I have set my processor usage to run all the time, but use at most 10% of the CPU. Do you think that may have contributed to this failure? Jane |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I don't think that limiting CPU use to 10% could affect the success of your climate models. If people join CPDN through the Weather@home website or through Progress Thru Processors, their CPU usage will be set by default to 60%. That's in case people are running the project on laptops and don't realise that they need to take action to avoid overheating. So limiting CPU usage is frequently used and BOINC is designed for this. When you shut down your computer do you first suspend your tasks in BOINC Manager and then exit completely from BOINC? You can exit by right-clicking on the BOINC icon in the system tray, then selecting Exit. Not exiting from BOINC before computer shutdown will sooner or later cause the occasional model to crash. |
Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
As Mo says, BOINC is designed to allow frequent suspension of work. I'm not so sure about all of CPDN's models, though. In my experience (this is of course anecdotal), Weather At Home models and the now retired FAMOUS models are quite robust to frequent suspension; but the older HadCM3Ps, and now HadCM3Ns: not so much. It does seem to vary a lot between machines, though. But HadCM3N seems to have the most trouble with disk contention: its files being in use by other software when HadCM3N wants to write to them. Antivirus or backup software, usually. In the case of Macs, I guess that means Time Machine. And I second Mo's advice about exiting from BOINC before shutting down the computer. |
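
For anyone who wants to answer the "what happened to the process?" question from the event log alone, here is a rough Python sketch that scans stdoutdae.txt for tasks that finished with absent output files and reports how long each ran after its last (re)start. The line patterns are taken only from the excerpt quoted in the first post; the "Starting task" wording, the function names and the dictionary layout are assumptions made for illustration, and other BOINC versions may word these messages differently. The log will not say why the model exited; that detail is in the stderr listing on the task's result page.

```python
import re
from datetime import datetime

# Shape of the event-log lines quoted in the first post:
#   18-Sep-2013 08:23:31 [climateprediction.net] <message>
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]*)\] (?P<message>.*)$"
)
# "Restarting task ..." appears in the post; "Starting task ..." is assumed
# to have the same shape.
START_RE = re.compile(r"^(?:Starting|Restarting) task (?P<task>\S+)")
FINISH_RE = re.compile(r"^Computation for task (?P<task>\S+) finished")
ABSENT_RE = re.compile(r"^Output file (?P<file>\S+) for task (?P<task>\S+) absent")


def summarize(lines):
    """Collect the last start time, finish time and absent files per task."""
    tasks = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not an event-log line in the expected shape
        when = datetime.strptime(m.group("stamp"), "%d-%b-%Y %H:%M:%S")
        message = m.group("message")
        for kind, pattern in (("start", START_RE),
                              ("finish", FINISH_RE),
                              ("absent", ABSENT_RE)):
            hit = pattern.match(message)
            if hit is None:
                continue
            info = tasks.setdefault(
                hit.group("task"),
                {"last_start": None, "finished": None, "absent": []},
            )
            if kind == "start":
                info["last_start"] = when
            elif kind == "finish":
                info["finished"] = when
            else:
                info["absent"].append(hit.group("file"))
            break
    return tasks


def seconds_before_exit(info):
    """Seconds between the last (re)start and the finish, if both were seen."""
    if info["last_start"] is None or info["finished"] is None:
        return None
    return (info["finished"] - info["last_start"]).total_seconds()


def report(path):
    """Print one line per task that finished with absent output files."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        tasks = summarize(handle)
    for name, info in tasks.items():
        if info["absent"]:
            print(name, "ran for", seconds_before_exit(info), "s;",
                  len(info["absent"]), "output file(s) absent")
```

Calling `report("/Library/Application Support/BOINC Data/stdoutdae.txt")` would list the affected tasks. A task that exits within a few seconds of a restart, as in the excerpt above, suggests the model could not resume from its saved state, which fits the suspension and file-access issues discussed in this thread but does not prove either one.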
©2024 cpdn.org