Relation between CPU L2 cache size and crunching speed
Joined: 23 Jul 07 Posts: 4 Credit: 387,306 RAC: 0
How much does the amount of L2 cache on a CPU affect crunching speed on climateprediction.net? Assume that all other factors are equal. How much faster will an Intel E4400 (2 GHz, 2 MB L2 cache) be than an Intel E2180 (2 GHz, 1 MB L2 cache)? Has anyone done any tests investigating the impact of L2 cache size on crunching speed on climateprediction.net?
Joined: 6 Aug 04 Posts: 264 Credit: 965,476 RAC: 0
I am running a hadam3h model on an Opteron 1210 with 2 cores, each with 1 MB of L2 cache. RAM is 2 GB, OS is Linux. The RAM usage of the model swings wildly between 5% and 20%. Of the 6 BOINC applications I run (Einstein, SETI, QMC, climateprediction.net, CPDN Beta, LHC), this is the only one in which I see RAM usage vary so much over such short periods, and I am not able to explain it. Tullio
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0
It's inherent in the design of the application. It's also the reason participants won't receive one unless they have at least 1.5 GB RAM. It's a higher-resolution model than the others (so far), and it grabs what it needs when it needs it, then releases what it doesn't need when the "surge" (sorry) is done. Until next time. (It is beautifully displayed in Linux memory-use graphics.) "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest.
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275
It's been a long time, but when running the original spinups that preceded the coupled models (hadcm3s), I had an Athlon64 3400+ and a 3700+ (both Socket 754). Both ran at 2.4 GHz with RAM at the same memory timings; the only difference was the L2 cache, 1 MB on the 3700+ vs. 512 KB on the 3400+. The 3700+ ran the spinup at about 1.74 s/TS while the 3400+ ran it at 1.83 s/TS. So, assuming no significant difference in speed due to model parameters, the 3700+ was about 5% faster.

On the original seasonal experiment using the hadam3 model, I ran a 3 GHz Pentium 4 with 512 KB L2 cache, and later a 3 GHz Pentium 4 with 2 MB L2 cache. The 512 KB cache processor ran it at about 22.5 s/TS while the 2 MB cache processor ran it at 19.3 s/TS. Of course there was a difference in RAM as well, as the CPU with the larger cache was also paired with DDR2 memory as opposed to DDR1.
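[Editor's note: a minimal Python sketch of the arithmetic behind these comparisons, using the s/TS figures quoted above. Since s/TS is time per unit of work, lower is faster and the speedup is the ratio of the slower time to the faster time.]

```python
def speedup_percent(slow_s_per_ts: float, fast_s_per_ts: float) -> float:
    """Percentage speedup of the faster machine over the slower one.

    s/TS is seconds per timestep, so the speedup is the ratio of the
    slower time to the faster time, minus 1.
    """
    return (slow_s_per_ts / fast_s_per_ts - 1.0) * 100.0

# Athlon64 spinup figures quoted above (512 KB vs. 1 MB L2 cache):
print(f"{speedup_percent(1.83, 1.74):.1f}%")   # ~5.2%

# Pentium 4 hadam3 figures (512 KB vs. 2 MB L2 cache, plus a RAM change):
print(f"{speedup_percent(22.5, 19.3):.1f}%")   # ~16.6%
```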
Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370
It's discussed in the final posts by MikeMars and Geophi in this thread. I also asked and got a reply about that here. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=5935&nowrap=true
Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0
Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system? Thanks
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0
In the BOINC Manager, click "Show graphics"; when the graphics window is open, press the "Z" key. If you like, press the "H" key for help. Regards, Masud.
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317
Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system? ... KSMasud has suggested one good place; another is on the web page for your model, after the model has sent a 'trickle' (after one model year). s/TS is simply 'seconds per timestep', i.e. the number of CPU seconds used by the model divided by the number of timesteps the model has completed. It gives an estimate of the speed of the model on your computer. Different computers have different values for s/TS, and different model types also have different values (HADSM3 smallest, HADCM3 larger, HADAM3 largest). If you run your model all the time, then s/TS multiplied by the number of timesteps in the completed model gives the minimum time the model would take to finish.
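[Editor's note: a sketch of that definition in Python. The CPU-time and timestep figures are taken from the worked example later in this thread; the 400,000-step run length is purely hypothetical, for illustration only.]

```python
def s_per_ts(cpu_seconds: float, timesteps_done: int) -> float:
    """Average seconds per timestep: CPU seconds / completed timesteps."""
    return cpu_seconds / timesteps_done

def min_finish_days(rate: float, total_timesteps: int) -> float:
    """Minimum wall-clock days to finish, if the model runs 24/7."""
    return rate * total_timesteps / 86_400.0

rate = s_per_ts(cpu_seconds=152_546, timesteps_done=118_822)
print(f"{rate:.4f} s/TS")                            # ~1.2838
print(f"{min_finish_days(rate, 400_000):.1f} days")  # ~5.9 days for a
                                                     # hypothetical 400,000-step run
```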
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0
The 'sec/TS' abbreviation means seconds per timestep. As well as seeing the value in each model's graphics display, it's in the last column of the web page for each model. For example, when your new model has produced a trickle, you'll see it there. The 3 types of model typically produce different sec/TS values. Edit - Iain answered first! Cpdn news
Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0
Thanks for the answers. So (if I have understood correctly :)) the lower the value, the faster the machine, right? Now it says 1.28 s/TS, but it's still going down.
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317
Yes, that's right. The figure shown on the trickle page (e.g. 7606940) is the average over the whole model run, so it changes quite slowly when things cause the PC to speed up or slow down. If you want to find out what's happening right now, take the difference between the CPU Time figures and divide by the number of steps in a trickle. Model 7606940 is a 'slab' model (HADSM3), which has 10,802 timesteps per trickle, so for the trickles submitted at 04 Sep 2008 14:48:32 and 04 Sep 2008 09:47:03:
- the average sec/TS is 152,546 / 118,822 = 1.2838 sec/TS
- the 'current' sec/TS is (152,546 - 140,186) / 10,802 = 1.1442 sec/TS
So the machine has sped up quite a bit, and the average speed is catching up with the real speed.
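[Editor's note: the same arithmetic as a short Python sketch, using the figures quoted above for model 7606940.]

```python
# Cumulative CPU seconds at two consecutive trickles (model 7606940):
cpu_prev, cpu_now = 140_186, 152_546
steps_total = 118_822        # timesteps completed over the whole run
steps_per_trickle = 10_802   # HADSM3 (slab) model

average = cpu_now / steps_total                     # whole-run average
current = (cpu_now - cpu_prev) / steps_per_trickle  # last trickle only

print(f"average: {average:.4f} sec/TS")   # 1.2838
print(f"current: {current:.4f} sec/TS")   # 1.1442
```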
Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0
Now I've pushed my CPU (an Athlon X2 4850e) up to 2.8 GHz, and s/TS has gone down to 1.00-1.01. Is that good? Do Core 2 CPUs get better figures at the same clock speed?
Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0
Hard to tell. On a Q9450 I had ~0.8580 (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8 GHz), but it has fast dual-channel RAM, which surely plays a role too. I don't think the AMD CPUs are really slower, as long as a program is not specifically optimized for Intel.
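[Editor's note: a rough way to make such cross-clock comparisons, as a sketch under the simplifying assumption that s/TS scales inversely with core clock (it doesn't quite, since cache and RAM speed don't scale with the core). The 3.2 GHz figure assumes the Q9450 kept its stock 8x multiplier at FSB 400, which the post doesn't state.]

```python
def scale_s_per_ts(rate: float, measured_ghz: float, target_ghz: float) -> float:
    """Scale an s/TS figure to another clock, assuming linear clock scaling."""
    return rate * measured_ghz / target_ghz

# Assumption: stock 8x multiplier, so FSB 400 MHz => 3.2 GHz core clock.
print(f"{scale_s_per_ts(0.858, 3.2, 2.8):.3f} s/TS at 2.8 GHz")  # ~0.981
```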
Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0
Hard to tell. On a Q9450 I had ~0.8580 (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8 GHz), but it has fast dual-channel RAM, which surely plays a role too. Currently I have one 2 GB stick of DDR2-800, so single-channel only... On Rosetta two months ago I saw no performance difference between single and dual channel (2x2 GB). What about CPDN? Aren't the Intels faster even with the other two types of WUs? P.S. Excuse my English, but it isn't my native language; I'm Italian ;)
Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0
From my experience I would say that there is a relation between memory throughput and crunching speed. If I crunch two CPDN models on a dual-CPU or dual-core computer, it crunches a bit slower than running only one CPDN model alongside a different project. This effect was quite strong on old P3 Tualatin boxes and even worse on Athlon MP machines, but I can still see it on current dual-core and quad-core 45 nm CPUs. So if you have two main projects, it is more efficient to run CPDN and the other project concurrently instead of alternating between the projects. POEM might be an exception, as POEM needs high memory throughput itself for good speed. Very good combinations are SIMAP+CPDN and Spinhenge+CPDN. As you can see the change within 2 trickles, it's fairly easy to figure out which "co-project" matches CPDN well. As I cannot think of any other limited resource (graphics usage aside) that two concurrent workunits would compete for, my conclusion is that it has to be RAM throughput. Of course, HD throughput is a shared resource as well, but CPDN does not torture the HD, so that cannot be the factor. P.S.: I'm not a native English speaker either :-)
Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0
For the moment I've chosen to give one core to CPDN and the other to Folding@Home, so there's no switching between the two projects. In October or November this system will become an HTPC, and I'm thinking of getting a Q6600 platform to improve the work/power-consumption ratio. But once the RAC is stable, I'll do some tests with the second 2 GB stick added.