Message boards : Number crunching : CPU Upgrade at 50%
Author | Message |
---|---|
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
I upgraded both my CPUs from 1.2GHz to 2.0GHz Athlon MP. Unfortunately, one of the chips was bad. :( So, I'm waiting on a replacement from the vendor. Meanwhile, I've been running CPDN with only one 2.0GHz CPU. So far, I'm happy to report that BOINC is still working fine. I don't think it's trickled up any data since the upgrade, but from the earlier forum posts, I think such a simple upgrade (keeping the same Linux kernel) should avoid any problems. My sec/TS has dropped to 4.6 from 4.9. At least I'm above the minimum specs for the project now! AMD has been putting their superior technology in the MP processor line. Does anyone think I'll get more than a 67% increase in speed? I'm going from the Palomino core to the Barton core. Benchmark speeds were: 1.2GHz: 1033 MIPS floating, 1512 MIPS integer; 2.0GHz: 1712 MIPS floating, 2529 MIPS integer. Running BOINC version 5.8.15. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Starfox, I can't comment on the speed you'll now get, but a word of warning. What you're doing is like transferring your model(s) from a slower to a faster machine. In these circumstances, it's possible that a model can crash before it completes with the message 'maximum CPU time exceeded'. This is because boinc doesn't realise that the model spent some of its life on a slower machine. There's a fix for this which involves editing one of the files to increase the amount of CPU time allowed. It's really best to avoid this editing if possible. The easiest thing is simply to make regular backups. Then if your model DOES crash with this error, you can restore the backup and edit the file. There's a selection of backup methods available through the link in my sig. You need to exit from boinc first, then back up the entire contents of the boinc folder. If you get a crash with this message, you'll need to post again to ask about how to edit the file. Cpdn news |
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
That is quite strange. The first thing BOINC did was run benchmarks after it detected the CPU count change, so BOINC should be aware of the new speed. I just made a backup of the folder. So, I'll just let it run for now and only edit the file in the event of a crash. In the unfortunate event that it crashes, should I reply here or post the problem in the Q/A forum? |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi again. The new benchmarks are, as far as I can see, the cause of the problem. Boinc assumes that the model has run at its new faster speed ever since it started in 1820, and allocates a new, reduced number of floating point operations permitted. But while the model was running more slowly, it used proportionately too much of this allowance. So the model may hit the new limit before it completes. There has to be a limit so that a workunit that develops a loop can't keep running indefinitely. I expect there's an extra margin for reruns/looping/backups built in to the number allowed. I think there must be this extra margin, otherwise every restored backup would fail with this error before it completed. But the error is in fact quite rare. The more advanced a model is when it's transferred, and the greater the difference in speed, the greater the chance of this error occurring. The problem was discussed here: http://www.climateprediction.net/board/viewtopic.php?t=7001 Thyme Lawn's instructions are what's required. If you prefer, you could increase the number now to preempt the problem. If you do decide to do this, you'd better make a backup before you try the edit. I didn't dare try it and simply transferred the model back to the old slow computer (which has had to be slowed down by 25% to keep it working). Nobody else has posted to say they've tried. If you want to post again about this at any time, I'd do it in this same thread so that all your posts are together. Cpdn news |
Joined: 9 Jan 07 Posts: 497 Credit: 342,899 RAC: 0 |
Having transferred a BBC model from an old computer to a newer, faster one in February, I have the impression that the s/TS continues to be calculated as an average from the beginning of the model - i.e. the s/TS has kept dropping ever since the transfer - it's now 3.67 - but it's never going to show the real current s/TS. Visit the Scotland team |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi MM, you're right about the timesteps, so Starfox will need to measure them manually on the new computer to see how it's performing. Cpdn news |
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Well, I'm going to wait and see. Based on what you just said, my current model is fairly low risk. I'm only 4% done so far. I'll reply if the thing dies on me. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Yes, Mike is saying that you'd need to speed the model up much more than you anticipate, and the model would need to be more advanced at the time of the move, for this problem to occur. In which case I'm sorry to have troubled you about it. I'll soon be putting an advice item about this into the Running the model README; it'll be in the part dealing with moving to a new computer. My post will contain click-by-click instructions for editing the xml file, so if you do have the bad luck to run into this problem, that will be where to look. Before any hardware change, it's still a very good idea to back up the complete contents of the boinc folder, and then e.g. weekly thereafter. Cpdn news |
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"I'll soon be putting an advice item about this into the Running the model README; it'll be in the part dealing with moving to a new computer. My post will contain click-by-click instructions for editing the xml file, so if you do have the bad luck to run into this problem, that will be where to look." The readme sounds like a great idea. Perhaps even a FAQ. :-P BTW, I just found your post about it: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=5512 |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I hope you don't have to use it! I didn't realise you'd only done 4% - your model should be fine. I'd done about 85% when the model crashed with this message, so I absolutely didn't want to lose it. I started this model in April 2006 on a really slow computer that doesn't meet the minimum project CPU specs. The fix was much easier than I expected. Normally I expect that only fairly advanced users would dare edit an xml file, or know how to do it. The idea is to make this fix possible for almost every member. The post is the combined knowledge of 4 mods.... Cpdn news |
Joined: 11 Jun 05 Posts: 67 Credit: 1,222,916 RAC: 0 |
"......i.e. the s/TS has kept dropping ever since the transfer - it's now 3.67 - but it's never going to show the real current s/TS." You're right - it'll just show the average. However, if you look at "Your Results" and your Trickles, you'll see a sudden step-change in your s/TS measure. Neil. |
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"......i.e. the s/TS has kept dropping ever since the transfer - it's now 3.67 - but it's never going to show the real current s/TS." I didn't see the sudden change you mentioned, as the trickles page shows aggregate values. However, I calculated the "instantaneous" s/TS for each trickle. I did this by taking the delta of CPU time divided by the delta of timesteps between consecutive trickles. A few calcs gave me an average of 3.7 s/TS. It's certainly an improvement over 4.9 s/TS, although not as much as I had hoped for. |
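
For anyone wanting to repeat the per-trickle calculation described in the last post, here is a minimal Python sketch. It assumes you have copied the cumulative timestep count and cumulative CPU seconds for each trickle off your trickles page by hand; the function names and the sample numbers are made up for illustration and are not real trickle data or part of any BOINC/CPDN tool.

```python
from typing import List, Tuple

# Each trickle is (cumulative_timesteps, cumulative_cpu_seconds),
# typed in by hand from the trickles page (assumed layout).
Trickle = Tuple[int, float]


def instantaneous_s_per_ts(trickles: List[Trickle]) -> List[float]:
    """Seconds per timestep between each pair of consecutive trickles."""
    rates = []
    for (ts0, cpu0), (ts1, cpu1) in zip(trickles, trickles[1:]):
        d_ts = ts1 - ts0
        if d_ts <= 0:
            # Skip repeated or out-of-order rows rather than divide by zero.
            continue
        rates.append((cpu1 - cpu0) / d_ts)
    return rates


def cumulative_s_per_ts(trickles: List[Trickle]) -> float:
    """Average since the start of the model, as the aggregate figure behaves."""
    timesteps, cpu_seconds = trickles[-1]
    return cpu_seconds / timesteps


if __name__ == "__main__":
    # Hypothetical numbers: the first two rows at the old speed (4.9 s/TS),
    # the later rows at the new speed (3.7 s/TS).
    sample = [
        (10000, 49000.0),
        (20000, 98000.0),
        (30000, 135000.0),
        (40000, 172000.0),
    ]
    print("per-trickle s/TS:", instantaneous_s_per_ts(sample))
    print("cumulative s/TS:", cumulative_s_per_ts(sample))
```

With the made-up sample, the per-trickle figures drop to the new speed straight after the upgrade, while the cumulative figure only drifts down slowly, which matches what was described above for the aggregate s/TS.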
©2024 cpdn.org