Compute Errors of Full Model work Units

old_user648268

Joined: 15 Feb 11
Posts: 6
Credit: 1,016,516
RAC: 0
Message 45008 - Posted: 3 Oct 2012, 15:25:39 UTC

Hi,

I'm hoping some kind person can tell me why this otherwise rock-solid Windows XP SP3 PC is having trouble with the Full Model.
The machine is host 1236291: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1236291

The last two Full Model errors are here:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15311234
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15272383

Other models complete successfully.
Have I got a hard disk on the way out, an incorrect local setting, or something else?
I don't like causing CPDN extra hassle, so I will pull this PC off the Full Models if it won't run them. But it normally runs everything BOINC can throw at it, and it was not running other BOINC apps during these sessions.
(Unfortunately this is a remote PC and I won't be able to get to it until next Monday 8th Oct).

Thank you for your time and help.
Rob
Profile Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,728,292
RAC: 3,041
Message 45009 - Posted: 3 Oct 2012, 16:04:23 UTC

The two models actually tell different stories, though there is something in common.

The thing in common is that both stderr logs (click on the 'Stderr' plus sign to expand the text) show numerous lines such as:

Suspended CPDN Monitor - Suspend request from BOINC...

The frequency of these messages suggests that BOINC is running in its default configuration, in which the science application is suspended whenever CPU use by other (non-BOINC) programs exceeds 25%. That setting is intended to make BOINC a good cyber-citizen by not interfering with your use of the PC. However, the longer HADCM3N models don't like being constantly suspended and are likely to crash, particularly at the decade upload points (25%, 50%, 75% and 100%). If you find the model intrusive, then just don't run the longer ones; if the model running in the background is not a problem, then get rid of the suspension threshold. To do that, select 'Tools | Computing preferences' in BOINC Manager, then select the 'processor usage' tab, then set 'While processor usage is less than ...' to zero, which means no restriction.
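
For a machine that is easier to reach by file copy than through BOINC Manager, the same threshold can in principle be set through BOINC's local preferences override file. The Python sketch below only builds the text of such a file. It assumes the file is named global_prefs_override.xml, that it lives in the BOINC data directory, and that the installed client honours the suspend_cpu_usage element with 0 meaning no restriction. The function name build_prefs_override is made up for illustration, so check the BOINC documentation for your client version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(suspend_cpu_usage: float = 0.0) -> str:
    """Return the text of a minimal preferences override file.

    Assumption: the client reads <suspend_cpu_usage> inside
    <global_preferences>, and a value of 0 disables the threshold.
    """
    root = ET.Element("global_preferences")
    # Format without a trailing ".0" so that 0.0 is written as "0".
    ET.SubElement(root, "suspend_cpu_usage").text = f"{suspend_cpu_usage:g}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the fragment; writing it into the BOINC data directory
    # is deliberately left to the reader.
    print(build_prefs_override(0))
```

The sketch stops short of writing the file because the data directory differs between installs; the client would also need to be told to re-read its local preferences (or be restarted) before the change takes effect.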

The thing that is different is that the second model reported:

Model crashed: ATM_DYN : INVALID THETA DETECTED.

This message is produced when the physics becomes unrealistic and does not (necessarily) mean that there is anything wrong with your PC. If, however, your PC is overclocked or failing, then it may produce numerical errors of this sort. Start with the suspension threshold, though.
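
The check described above (how often the model was suspended, and whether it reported a crash) can be roughed out in a few lines of Python. This is only a sketch: it assumes the stderr text has been copied from the result page into a string, it matches the two message formats quoted in this reply, and the names summarize_stderr, SUSPEND_PATTERN and CRASH_PATTERN are hypothetical.

```python
import re

# Message formats as quoted in this thread; other wordings are not matched.
SUSPEND_PATTERN = re.compile(r"Suspended CPDN Monitor - Suspend request from BOINC")
CRASH_PATTERN = re.compile(r"Model crashed:\s*(.+)")


def summarize_stderr(text: str) -> dict:
    """Count suspend requests and collect any 'Model crashed' messages."""
    suspend_requests = len(SUSPEND_PATTERN.findall(text))
    crash_messages = [m.group(1).strip() for m in CRASH_PATTERN.finditer(text)]
    return {"suspend_requests": suspend_requests, "crash_messages": crash_messages}


if __name__ == "__main__":
    sample = (
        "Suspended CPDN Monitor - Suspend request from BOINC...\n"
        "Suspended CPDN Monitor - Suspend request from BOINC...\n"
        "Model crashed: ATM_DYN : INVALID THETA DETECTED.\n"
    )
    print(summarize_stderr(sample))
```

A high suspend count alongside a crash message points towards the suspension threshold rather than the hardware, which is the order of investigation suggested here.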
old_user648268

Joined: 15 Feb 11
Posts: 6
Credit: 1,016,516
RAC: 0
Message 45010 - Posted: 3 Oct 2012, 17:25:28 UTC - in response to Message 45009.  

Excellent information Iain, thank you.

This is a standard install with a standard PC config, no overclocking or the like. The only change in the last month has been the AV program, which could be taking up PC resources at times. I'll look into the services currently running on the box. (This PC's only function is to display the current NOAA weather at Spokane, WA and Jacksonville, FL, so it's not exactly being stressed!)

I'll change the settings when I get back and see how we go.

Thanks again for the extremely quick reply.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 45012 - Posted: 3 Oct 2012, 22:51:33 UTC

The suspension threshold can be changed on your Account page on the project server, so you can do this at any time, from any of your computers. There is no need to wait until you get to the affected computer.
The long models can be de-selected the same way.


old_user648268

Joined: 15 Feb 11
Posts: 6
Credit: 1,016,516
RAC: 0
Message 45026 - Posted: 4 Oct 2012, 12:06:57 UTC - in response to Message 45012.  

Thanks Les,
We are limited to default, home, school and work for these profile preferences, are we not?
I'm one of those users who would need about 8 profiles to manage their PCs effectively. However, I've managed to give it a unique profile and will now wait for it to contact the CPDN server(s).
Thanks for the help.