Is HyperThreading BAD for Climate?
Author | Message |
---|---|
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
There seem to be a lot of cache misses and page faults when I run Climate BOINC on a dual physical Xeon configuration with HyperThreading on. The smaller projects, like SETI and Protein, seem to do just fine with HyperThreading, but has anyone confirmed that Climate might do better with HyperThreading off? |
Send message Joined: 28 Aug 04 Posts: 13 Credit: 767,708 RAC: 0 |
"There seem to be a lot of cache misses and page faults when I run Climate BOINC on a dual physical Xeon configuration with HyperThreading on. The smaller projects, like SETI and Protein, seem to do just fine with HyperThreading, but has anyone confirmed that Climate might do better with HyperThreading off?" Well, I know that running two climate WUs on my P4 HT system is quite bad. I think it might be because the two programs are trying to use the same 'parts' of the CPU and thus end up in a bottleneck. With enough other projects, it's quite easy to run climate together with some other work. Just a non-professional opinion. |
Send message Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
Most estimates I have seen come up with a 15 to 20% throughput improvement with HT on. (Each model runs slower, but not twice as slow.) |
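As a rough worked example of the figures in the post above: if two models share one hyperthreaded CPU and total throughput rises by some fraction, each model takes 2 / (1 + gain) times as long as it would running alone. The sketch below only does that arithmetic; the function name and the assumption of exactly two threads per core are illustrative, not measurements.

```python
def per_model_slowdown(throughput_gain, threads_per_core=2):
    """Factor by which each model's run time grows when `threads_per_core`
    models share one core and total throughput rises by `throughput_gain`
    (0.15 means 15%). Illustrative helper, not part of BOINC."""
    return threads_per_core / (1.0 + throughput_gain)


# The 15% and 20% figures come from the post; the rest is arithmetic.
for gain in (0.15, 0.20):
    factor = per_model_slowdown(gain)
    print(f"{gain:.0%} gain -> each model takes {factor:.2f}x as long")
```

With those inputs each model comes out roughly 1.7 times slower, which matches "slower but not twice as slow".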
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I agree with crandles on this. But I don't have a Xeon, just a P4. |
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
"I agree with crandles on this. But I don't have a Xeon, just a P4." Be aware that a computer configured with HyperThreading ON, but with BOINC limited to the number of physical processors, will waste half of its time looking for work for the logical processors and only contribute one-half of its capacity to BOINC-based projects. For example, on a two-physical-CPU machine with HyperThreading ON but BOINC limited to two (2) CPUs, you will have two BOINC-based processes each using 25% of CPU capacity, and Windows XP Pro will use 2% for the Task Manager and 48% running the SystemIdle loop, which actually consumes resources. My advice is to configure a two-physical-CPU machine with HyperThreading OFF (to see faster progress) or with HyperThreading ON and the logical CPU count matched in BOINC (to get slightly more total processing completed). Of course, on projects other than ClimatePrediction, you will see 30-35% more throughput with HT ON, so I'm running that way across the board now. The default configuration for BOINC seems to be for a single physical HT processor with HT on. That's why it is set to two. |
Send message Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
I have a P4 HT, running with HT, and have had no issues with 2 CPDN models running simultaneously. I have 896M of memory (1G - 128 shared for video). I've been running this through several WUs and have never had one fail, yet. So in my experience, no, there is no issue with an HT machine running 2 CPDN WUs simultaneously. |
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
"... and never had one fail, yet." In rereading the thread, I don't see failures mentioned, but when mixing the large and small models there appears to be some pretty inefficient processing (what with cache issues and the like). So much so that I would suspect that running without HT might be faster. But, in the long run, it probably isn't that big a deal. Thanks for your post. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"... and never had one fail, yet." Hyperthreading performance depends on lots of things. For example, on this machine I have two 3.06GHz hyperthreaded Xeon processors, the model with an L3 cache as well as the traditional L1 and L2 caches. The L3 cache is a megabyte, and the memory is the kind where you need a multiple of two memory modules because they run in parallel. So the memory is better able to keep the L3 cache filled, and the 512 KByte L2 cache gets its instructions and data from the L3, and the L1... So if I were running four instances of ClimatePrediction, all the same application program, perhaps a fair amount of the working set of instructions and a bit of data would be in the cache and running pretty fast. I never tried measuring this with ClimatePrediction, but I tried to get a qualitative handle on it for SetiAtHome. When I ran 1 process, it ran pretty fast, and when I ran 2, it was almost as fast, but when I ran 4, it slowed down about 20%. Note that the total throughput was monotonically increasing (I do not know what would happen with a 4-processor hyperthreaded machine: at some point it flattens out and, AFAIK, it might even get worse by adding more processes), so the individual processes were taking longer to complete. If I also run a database application, the presence of BOINC processes really hurts, and offhand you would not expect this, since I run on Linux and the BOINC stuff runs only when the machine cannot use the processors for other tasks. And the process scheduler is working correctly. The trouble is that the DBMS starts an IO operation and gets suspended. The lower-priority BOINC application is dispatched and dirties up the cache, and when the DBMS IO completes, it gets the processor, but with a dirty cache, so it is dealing with the 533MHz memory instead of the 3.06GHz cache, slowing things down.
Whether it is better to have hyperthreading on or off is very difficult to answer, because so many things enter into the calculation. |
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
AGREED. Once I noticed that the HT machine would SystemIdle the logical processors and, therefore, not fully utilize the CPUs, I went back to four processes on four logical processors. The room temp is up a bit, but the overall throughput is better. |
Send message Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
I recently bought a P4 2.8G non-HT, and I have a P4 2.8 HT. The non-HT unit is outperforming the HT one on CPDN by over double. Both are doing the same projects with the same percentages, and the non-HT will do one WU in 23.5 days, so 2 in 47 days. The other takes 53 days to do 1 (because I do other projects, it is not running 2 simultaneously, but when it was running 2, it still took 53 days). There is a difference: the HT is a laptop, and does run hotter than the desktop. All the other projects are close to 2-1. My thought is the L2 cache is the reason. I have not turned off HT on the laptop to test, but after seeing this for over a week, I might shut it off and see if I get better performance, and a little less heat. |
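The comparison in the post above can be worked through as throughput (work units per day) rather than days per work unit. This is only a sketch of the arithmetic: the 23.5-day and 53-day figures are from the post, while the function name and "reading B" (two work units finishing together in 53 days) are assumptions added for illustration, since the post does not say which reading applies.

```python
def wus_per_day(days_per_wu, concurrent=1):
    """Work units finished per day when `concurrent` units each take
    `days_per_wu` days of wall-clock time. Illustrative helper."""
    return concurrent / days_per_wu


desktop = wus_per_day(23.5)        # non-HT desktop: 1 WU in 23.5 days
laptop_one = wus_per_day(53.0)     # reading A: HT laptop finishes 1 WU per 53 days
laptop_two = wus_per_day(53.0, 2)  # reading B (assumption): 2 WUs finish together in 53 days

print(f"reading A: desktop is {desktop / laptop_one:.2f}x the laptop")
print(f"reading B: desktop is {desktop / laptop_two:.2f}x the laptop")
```

Under reading A the desktop is a little over twice as fast; under reading B the gap shrinks to roughly 13%, so which one applies matters a lot to the conclusion.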
Send message Joined: 31 Aug 04 Posts: 239 Credit: 2,933,299 RAC: 0 |
Trying to make comparisons of HT vs. non-HT is hard. For one thing, OTHER than the HT capability, the CPU, motherboard, memory, etc. ALL have to be the same for a valid comparison. Laptops, whether the CPU has the same clock speed or not, are nowhere near comparable to a standard desktop in their other components. Laptop components WILL be slower. Running with HT on will give you a 20-60% (nominally 40%) improvement in throughput: slower speed per task, but greater total performance. Exactly what you will get depends on exactly what you are doing at the time. HT mode works by running thread #2 when thread #1 hits a roadblock or when it is not using some of the internal components. Like multi-tasking on a computer, HT mode is a way to get the most out of the total system ... |
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
I agree that HT will typically provide better throughput, but not even Intel suggests the high end of that range. I think 15-30% is more likely. The most obvious improvement is on a system where HT is ON but BOINC is limited to the number of physical CPUs (which is the default for a 2-CPU system). You about double your throughput, because the SystemIdle process is doing real work of no benefit. I have identical dual-CPU HT-capable Xeon IA-32 Dell 650s. I've run tests with SETI, Protein and Climate (Folding failed on a floating-point problem). Both systems are identical down to XP patch level and software installed. Only Climate doesn't seem to benefit much from HT, and it isn't worth bouncing the machines to turn HT off just for Climate. |
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
"The non-HT unit is outperforming the HT one on CPDN by over double. Both are doing the same projects with the same percentages." Press Ctrl+Alt+Delete to bring up your system monitor and see what's using CPU on the HT machine. My bet is the BOINC projects might not be using the full CPU available, for some reason. On my HT boxes, the work units clock in at twice as long, but two are produced in that period. The clock isn't CPU usage, but system clock. So, if two work units start at 00:00 and end at 00:59, they both show 00:59, but each took half that. On my machines, setting to non-HT generates a LOT less heat. On a laptop, that probably means something for battery life. |
Send message Joined: 7 Aug 04 Posts: 2185 Credit: 64,822,615 RAC: 5,275 |
"There is a difference: the HT is a laptop, and does run hotter than the desktop. All the other projects are close to 2-1. My thought is the L2 cache is the reason. I have not turned off HT on the laptop to test, but after seeing this for over a week, I might shut it off and see if I get better performance, and a little less heat." My P4 laptop with hyperthreading throttles down when it gets hot. Instead of running at 3.06 GHz, it slows down to 2.1 GHz. Many laptops will throttle back either by lowering the CPU clock speed or by inserting idle commands. You can test it with throttlewatch if you want to see if it is doing it. |
Send message Joined: 8 Sep 04 Posts: 23 Credit: 121,446 RAC: 0 |
I have the exact same system as TheSleuth. I have hyperthreading on, but limit BOINC to 2 CPUs (the idea being that BOINC gets the real one (ideally) and I'm using the hyperthreading bit unless I'm doing some real work on it). I recently tried running 2 CPDN models as well as other projects. When I run only one model and another project, I get about 3 s/TS. With 2 models running together, I got about 4.0-4.5 s/TS, so quite an improvement personally. So, IMO, it's worth having HT on :) |
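To see why 4.0-4.5 s/TS with two models counts as an improvement over 3 s/TS with one, it helps to convert seconds per timestep into combined timesteps per second. The sketch below does that with the figures from the post above; the function name is made up for illustration, and it looks only at CPDN throughput, ignoring whatever the other project was doing in the single-model case.

```python
def timesteps_per_second(seconds_per_timestep, models=1):
    """Combined timesteps completed per second when `models` models each
    need `seconds_per_timestep` seconds per timestep. Illustrative helper."""
    return models / seconds_per_timestep


single = timesteps_per_second(3.0)  # one model at about 3 s/TS

gains = {}
for s_per_ts in (4.0, 4.5):         # two models at about 4.0-4.5 s/TS each
    paired = timesteps_per_second(s_per_ts, models=2)
    gains[s_per_ts] = paired / single - 1.0
    print(f"{s_per_ts} s/TS x 2 models -> {gains[s_per_ts]:.0%} more CPDN throughput")
```

On these numbers, running two models gives very roughly a third to a half more CPDN timesteps per second than running one, even though each individual model is slower.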
Send message Joined: 15 Sep 05 Posts: 8 Credit: 205,423 RAC: 0 |
For the most recent 30-day period, I have been running non-HT with a 2-CPU limit on one machine, and HT with a 4-CPU limit on the other. The non-HT machine with the two-CPU limit has produced slightly higher credits (+/-5%) over that period, and it is suspended for real work much more often. This might be because the sulphur tests get more credit? When you run with HT on but a two-CPU limit, you are letting SystemIdle take half of your capacity, at least in my tests. |
Send message Joined: 8 Sep 04 Posts: 23 Credit: 121,446 RAC: 0 |
"For the most recent 30-day period, I have been running non-HT with a 2-CPU limit on one machine, and HT with a 4-CPU limit on the other. The non-HT machine with the two-CPU limit has produced slightly higher credits (+/-5%) over that period, and it is suspended for real work much more often. This might be because the sulphur tests get more credit?" Ah, well, yes, and I suppose each could be running on a separate physical CPU as well (leaving the 2 virtual ones free). The reason I have 4 CPUs with a limit of 2 is that I find BOINC affects system performance when doing intensive tasks, especially games, so I limit it to 2 mostly, sometimes 3 if I know I won't be doing anything stressful for a while (1 spare for the system to have at its disposal). |
Send message Joined: 27 Aug 05 Posts: 156 Credit: 112,423 RAC: 0 |
Running a 840ee 3.2 with HT, 3 gig ram, running 6 projects: Einstein, Seti, Rosetta, LHC, CPDN, and Predictor. Presently have 10 Ts that range from 2.1544 to 2.1648; completion time was estimated at 470 hrs but has gone up to about 530 hrs. Just started 2 Sulphur runs on my other 840ee 3.2 with HT, 3 gig ram, estimated time 1470 hrs each... also is running Einstein and Seti. Both running great so far..... |