Message boards : Number crunching : RAC too low?
Author | Message |
---|---|
Send message Joined: 29 Nov 05 Posts: 5 Credit: 6,359,893 RAC: 0 |
Hi to all experts out there! I have two identical machines running CPDN, but today I realized that computer_id 385269 gets only about half the credit of computer_id 565448. Something must be wrong (two months ago, when I last checked, both machines earned roughly the same amount of credit per day), but I don't want to abort WUs unnecessarily. I wonder if someone could check this and tell me whether there is reason for concern. Any help mightily appreciated, SmilingMoon PS: It seems my computers are hidden. What can I do to show them? |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Hi, I'll address your second question since without seeing the computers we can't help with the first: * Taking part in CPDN (blue menu bar on this website) / Your account / Climateprediction.net preferences / Edit. Then change the setting "Should climateprediction.net show your computers on its web site?" Once that's done we should be able to look at the computers (there may be a delay of an hour or so first). I can see one of them via the host ID you supplied, but not the other. Are both host IDs correct? Things which can cause an unexpected slowdown: * Overheating (thermal throttling) * Something else running on the PC and taking CPU time * Insufficient memory * A model rewind (the model goes back a bit to retry a month or a year or whatever if an error was found). Running a stress test such as Prime95's torture test for 24 hours will indicate if there is a problem on the PC causing rewinds. * Frequent model crashes I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Were you by chance running the model graphics as a screensaver on one computer but not the other? It really slows the crunching down, so most of us disable it and use the BOINC Manager "View graphics" button instead. Cpdn news |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
If you look at the BOINC Manager on the slower PC, are there two climate models running, and if so, is the following task showing as 'running'? hadsm3fub_0574_005897557_9 The last recorded trickle for this was in mid-September. My guess (and it is only a guess) is that it rewound to the start. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 29 Nov 05 Posts: 5 Credit: 6,359,893 RAC: 0 |
First I want to say thank you for your helpful answers! I had indeed already edited my preferences concerning the visibility of my computers, but didn't know about the delay before the new settings take effect. mo.v: No, I'm not using the screensaver. MikeMarsUK: Yes, both models are showing as "running". I have no reason to believe there is anything wrong with my hardware, although I am seriously considering running the Prime95 torture test you mentioned, just to be sure. To me it seems more likely that hadsm3fub_0574_005897557_9 has gone astray. I wonder if it makes sense to continue crunching that particular model? Following your statement that the last recorded trickle was in September (where did you get that information btw?), maybe I should abort it? |
Send message Joined: 6 Jul 06 Posts: 147 Credit: 3,615,496 RAC: 420 |
They are probably not running the same model types. The short ones give 94.52 cobblestones per trickle (on my 2.6GHz Opteron that happens every 6 hours or so and will complete in less than 4 months). The non-optimised long ones give 259.20 cobblestones per trickle (every 13 hours for me). The optimised long ones give 310.80 cobblestones per trickle (about 13 hours or a bit more for each trickle). So if one computer has a long model and the other has a short model, then overall the one with the longer model will get a higher credit output. A short model will give 94.52 x 2 = 189.04 cobblestones in 12 hours. A long model will give either 259.20 or 310.80 cobblestones in 13 hours. So this could be the reason the RAC has dropped on one computer compared to the other. I also noticed that you have had a couple of models crash, which will affect RAC as well. |
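For comparison, the figures quoted above can be put on a common per-day basis. This is only a rough sketch of the arithmetic: the credit values and trickle intervals are the ones from the post (one 2.6GHz Opteron), the function name `credit_per_day` is made up for illustration, and it assumes the machine crunches a single model around the clock.

```python
# Credit per trickle and approximate hours between trickles, as quoted in the
# post above (one poster's 2.6GHz Opteron; other hosts will differ).
MODELS = {
    "short": (94.52, 6.0),
    "long (non-optimised)": (259.20, 13.0),
    "long (optimised)": (310.80, 13.0),
}


def credit_per_day(credit_per_trickle: float, hours_per_trickle: float) -> float:
    """Average cobblestones per 24 hours, assuming continuous crunching."""
    return credit_per_trickle * 24.0 / hours_per_trickle


if __name__ == "__main__":
    for name, (credit, hours) in MODELS.items():
        print(f"{name}: {credit_per_day(credit, hours):.1f} cobblestones/day")
```

With these numbers a short model earns roughly 378 cobblestones per day, against roughly 479 or 574 for the long ones.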
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"Following your statement that the last recorded trickle was in September (where did you get that information btw?)" It's on the model's page, linked to from your Account page. There's LOTS of info there, which is why it's helpful to be able to see it, in order to provide help. We can get there by clicking on your name to the left of your posts. This is the "publicly viewable" info. "You" can get there the same way, or you can see the extra "private data" by going to your Account page in any of several ways: There's a link at the left side of your BOINC manager, and there's also the Account option in the blue menu to the left of here. You can then either click on computers to see the info about them, and then click on Results to get to a page of all your models, or go directly to the Results page from your main Account page. Backups: Here |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
When you drill down as Les describes, you'll eventually get to this page for the climate model I highlighted: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6728892 If this is the one which is still 'running', then yes, I'd abort it. Given the amount of time it has been running since the last trickle, and that it's a phase-3 slab model, I suspect that it has not gone back to the beginning, but turned into an iceworld instead (a rewound model would probably have been automatically aborted by this time). When a slab model turns into an iceworld, it will continue running forward, but in some cases starts going incredibly slowly. The first symptom of either a rewound model or a slow iceworld is that you stop receiving 'trickles'. That PC has been successfully running climate models for a long time, so this bad model is probably a one-off. But if it happens again on the same PC, then I'd run the torture test (I run it on my PCs just to check they're OK, around once per year). I'm a volunteer and my views are my own. News and Announcements and FAQ |
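The "no more trickles" symptom can be checked by hand by comparing the last trickle time on the result page with the usual trickle interval for that model. A minimal sketch of that comparison follows; the function name `trickle_overdue` and the `slack` factor of 3 are assumptions made for illustration, not anything CPDN or BOINC defines.

```python
from datetime import datetime, timedelta


def trickle_overdue(last_trickle: datetime, now: datetime,
                    expected_hours: float, slack: float = 3.0) -> bool:
    """True if no trickle has arrived for more than `slack` usual intervals.

    `expected_hours` is the normal gap between trickles on this host, and
    `slack` is an arbitrary safety margin (both are illustrative values).
    """
    return now - last_trickle > timedelta(hours=expected_hours * slack)
```

For a long model that normally trickles about every 13 hours, a gap of several weeks is flagged, while a gap of half a day is not. This only counts wall-clock time, so a PC that was switched off for a while would be flagged too.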
Send message Joined: 29 Nov 05 Posts: 5 Credit: 6,359,893 RAC: 0 |
Les Bayliss: Thanks for your helpful advice. I usually don't have enough time to delve into the details of the projects I'm participating in, so when I run into a serious problem (seldom enough, though) I depend entirely on help from the community. That worked perfectly again, and with your information I should be better able to track down possible causes of problems in the future without bothering the real experts. MikeMarsUK: I've aborted the strange model and will check over the next few days what happens to my overall RAC. Incidentally, your idea about a possible iceworld could be true, since the remaining time for the model actually went up 4 hours compared to yesterday. Conan: Thanks for your reply as well; it's always nice to see somebody take an interest in someone else's trouble. |