climateprediction.net (CPDN) home page
Thread 'RAC too low?'

SmilingMoon

Joined: 29 Nov 05
Posts: 5
Credit: 6,359,893
RAC: 0
Message 31365 - Posted: 14 Nov 2007, 13:24:23 UTC
Last modified: 14 Nov 2007, 13:48:53 UTC

Hi to all experts out there!

I have two identical machines running CPDN, but today I realized that computer_id 385269 gets only about half the credit of computer_id 565448. I am sure something is wrong (two months ago, when I last checked, both machines earned roughly the same credit per day), but I don't want to abort WUs unnecessarily. Could someone check this and tell me whether there is reason for concern?

Any help mightily appreciated,

SmilingMoon

PS: It seems my computers are hidden. What can I do to show them?

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 31366 - Posted: 14 Nov 2007, 13:52:46 UTC
Last modified: 14 Nov 2007, 14:01:15 UTC

Hi,

I'll address your second question, since without seeing the computers we can't help with the first:

* Taking part in CPDN (blue menu bar on this website) / Your account / Climateprediction.net preferences / Edit

Then change the setting "Should climateprediction.net show your computers on its web site?"

Once that's done we should be able to look at the computers (there may be a delay of an hour or so first). I can see one of them via the host ID you supplied, but not the other - are both the host IDs correct?

Things which can cause an unexpected slowdown:

* Overheating (thermal throttling)
* Something else running on the PC and taking CPU time
* Insufficient memory
* A model rewind (the model goes back a bit to retry a month or a year or whatever if an error was found). Running a stress test such as Prime95's torture test for 24 hours will indicate whether a problem on the PC is causing rewinds.
* Frequent model crashes
I'm a volunteer and my views are my own.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 31370 - Posted: 14 Nov 2007, 15:51:28 UTC

Were you by chance running the model graphics as a screensaver on one computer but not the other? It really slows the crunching down, so most of us disable it and use the BOINC Manager's View graphics button instead.
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 31371 - Posted: 14 Nov 2007, 16:13:16 UTC


If you look at the BOINC Manager on the slower PC, are there two climate models running, and if so, is the following task showing as 'running'?

hadsm3fub_0574_005897557_9

The last recorded trickle for this was in mid-September. My guess (and it is only a guess) is that it rewound to the start.
SmilingMoon

Joined: 29 Nov 05
Posts: 5
Credit: 6,359,893
RAC: 0
Message 31378 - Posted: 15 Nov 2007, 6:35:18 UTC

First I want to say thank you for your helpful answers! I had in fact already edited my preferences concerning the visibility of my computers, but didn't know about the delay before the new settings take effect.

mo.v: No, I'm not using the screensaver.

MikeMarsUK: Yes, both models are showing as "running". I have no reason to believe there is anything wrong with my hardware, though I am seriously considering running the Prime95 torture test you mentioned, just to be sure. It seems more likely to me that hadsm3fub_0574_005897557_9 has gone astray. I wonder if it makes sense to continue crunching that particular model? Following your statement that the last recorded trickle was in September (where did you get that information btw?) maybe I should abort it?
Conan
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 31379 - Posted: 15 Nov 2007, 6:44:06 UTC

They are probably not running the same model types.
The short ones give 94.52 cobblestones per trickle (on my 2.6GHz Opteron that happens every 6 hours or so, and a model will complete in less than 4 months).
The non-optimised long ones give 259.20 cobblestones per trickle (every 13 hours for me).
The optimised long ones give 310.80 cobblestones per trickle (about 13 hours or a bit more per trickle).

So if one has a long model and the other computer has a short model then overall the one with the longer model will get a higher credit output.

A short model will give 94.52 x 2 = 189.04 cobblestones in 12 hours.
A long model will give either 259.20 or 310.80 cobblestones in 13 hours.

So this could be the reason the RAC has dropped on one computer compared to the other. I also noticed that you have had a couple of models crash, which will affect RAC as well.
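
To compare the model types on an equal footing, the figures above can be converted to credit per day. This is only a rough sketch: the credit values and trickle intervals are the ones quoted in this post for a single 2.6GHz Opteron, the intervals are approximate, and the names `MODELS` and `credit_per_day` are made up for the illustration.

```python
# Credit per trickle and approximate hours between trickles, as quoted in
# this post for one 2.6GHz Opteron. Other hardware will give other intervals.
MODELS = {
    "short": (94.52, 6.0),
    "long (non-optimised)": (259.20, 13.0),
    "long (optimised)": (310.80, 13.0),
}


def credit_per_day(credit_per_trickle, hours_per_trickle):
    """Scale one trickle's credit to a 24-hour rate."""
    return credit_per_trickle * 24.0 / hours_per_trickle


for name, (credit, hours) in MODELS.items():
    rate = credit_per_day(credit, hours)
    print(f"{name}: {rate:.1f} cobblestones/day")
```

On these numbers a short model earns roughly two thirds of what an optimised long model earns per day, so a difference in model type could account for part of a gap between two hosts, though probably not all of a halving.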

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 31383 - Posted: 15 Nov 2007, 7:57:14 UTC
Last modified: 15 Nov 2007, 7:58:51 UTC

"Following your statement that the last recorded trickle was in September (where did you get that information btw?)"


It's on the model's page, linked to from your Account page.
There's lots of info there, which is why it's helpful to be able to see it, in order to provide help.

We can get there by clicking on your name to the left of your posts. This is the "publicly viewable" info.

"You" can get there the same way, or you can see the extra "private data" by going to your Account page in any of several ways:
There's a link at the left side of your BOINC Manager, and there's also the Account option in the blue menu to the left of here.

You can then either click on computers to see the info about them, and then click on Results to get to a page of all your models, or go directly to the Results page from your main Account page.


MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 31386 - Posted: 15 Nov 2007, 8:25:55 UTC
Last modified: 15 Nov 2007, 8:27:41 UTC


When you drill down as Les describes, you'll eventually get to this page for the climate model I highlighted:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6728892

If this is the one which is still 'running', then yes, I'd abort it.

Given the amount of time it has been running since the last trickle, and that it's a phase-3 slab model, I suspect that it has not gone back to the beginning, but turned into an iceworld instead (a rewound model would probably have been automatically aborted by this time).

When a slab model turns into an iceworld, it will continue running forward, but in some cases starts going incredibly slowly. The first symptom of either a rewound model or a slow iceworld is that you stop receiving 'trickles'.
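
The "no trickles" symptom can be turned into a simple check: compare the time since the last recorded trickle with the interval you would normally expect. The sketch below is only an illustration. The function name `looks_stalled` and the tolerance factor of 3 are made up, the 15 September date stands in for "mid-September" (the exact date is not given in this thread), and the 13-hour interval is borrowed from the long-model figure quoted earlier for different hardware, so treat every value as an assumption.

```python
from datetime import datetime, timedelta


def looks_stalled(last_trickle, now, expected_interval, tolerance=3.0):
    """Return True if the gap since the last trickle is much longer than usual.

    'Much longer' means more than `tolerance` times the expected interval;
    the default factor of 3 is an arbitrary choice for this sketch.
    """
    return (now - last_trickle) > expected_interval * tolerance


# Hypothetical values, not taken from the result page:
last_trickle = datetime(2007, 9, 15)    # assumed stand-in for "mid-September"
now = datetime(2007, 11, 14)
expected_interval = timedelta(hours=13)  # assumed; real interval depends on model and PC

print(looks_stalled(last_trickle, now, expected_interval))
```

A gap of about two months against an expected interval measured in hours is flagged whichever tolerance is chosen; the check says nothing about whether the cause is a rewind or an iceworld.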

That PC has been successfully running climate models for a long time, so this bad model is probably a one-off. But if it happens again on the same PC, then I'd run the torture test (I run it on my PCs just to check they're OK, around once per year).
SmilingMoon

Joined: 29 Nov 05
Posts: 5
Credit: 6,359,893
RAC: 0
Message 31390 - Posted: 15 Nov 2007, 11:44:40 UTC

Les Bayliss: Thanks for your helpful advice. I usually don't have enough time to delve into the details of the projects I'm participating in, so when I run into a serious problem (seldom enough, though) I depend entirely on help from the community. That worked perfectly again, and with your information I will be better able to track down the causes of future problems without bothering the real experts.

MikeMarsUK: I've aborted the strange model and will check over the next few days what happens to my overall RAC. Incidentally, your idea about a possible iceworld could be true, since the remaining time for the project actually went up by 4 hours compared to yesterday.

Conan: Thanks for your reply too. It's always nice to see somebody taking an interest in someone else's trouble.