climateprediction.net (CPDN) home page
Thread 'processors, memory, performance and heat.'

Message boards : Number crunching : processors, memory, performance and heat.

Thomas McFarland
Joined: 28 Feb 05
Posts: 20
Credit: 11,298,916
RAC: 15,441
Message 70572 - Posted: 29 Feb 2024, 13:46:14 UTC - in response to Message 70532.  
Last modified: 29 Feb 2024, 13:48:28 UTC

Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores.


"Use at most x% of CPU time" should be removed from the BOINC source code
ID: 70572
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 70573 - Posted: 29 Feb 2024, 14:47:18 UTC - in response to Message 70572.  

Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores.
I get that. I just think reducing the number of cores in use is a better option.
ID: 70573
SolarSyonyk

Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 70574 - Posted: 29 Feb 2024, 18:17:40 UTC - in response to Message 70573.  

I just think reducing the number of cores in use is a better option.


That also helps with improving cache-per-task, which can make a big difference in per-task performance.

Though on most systems, if you're having thermal problems, the right answer is to tweak the power limit settings in the BIOS or with some mainboard utilities. You can clamp that down and not worry about what's loading up the cores, and you usually get a pretty nice boost in compute-per-watt.
ID: 70574
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 70575 - Posted: 29 Feb 2024, 21:39:10 UTC - in response to Message 70572.  

Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores.


"Use at most x% of CPU time" should be removed from the BOINC source code


At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time.
ID: 70575
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,578,380
RAC: 15,009
Message 70576 - Posted: 1 Mar 2024, 8:59:08 UTC - in response to Message 70575.  

At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time.
That will have a bigger impact on the throughput (tasks completed per day). A 10% drop in CPU use on all cores is all I need on my older machines to get temps I'm happy with. Taking 1 core away from 4 is the same as a 25% CPU use reduction. I prefer having the finer control of %cpu available.
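
As a rough illustration of the comparison above, the sketch below computes nominal CPU capacity for both approaches. It is only arithmetic on the figures in this post: the function names are made up for the example, and it assumes throughput scales linearly with CPU time, which ignores cache, memory-bandwidth and boost-clock effects.

```python
# Nominal CPU capacity, in "core-equivalents", for the two approaches discussed.
# Assumption: throughput scales linearly with CPU time (no cache or boost effects).


def capacity_with_throttle(cores: int, cpu_time_percent: float) -> float:
    """All cores in use, each limited to cpu_time_percent of the time."""
    return cores * cpu_time_percent / 100.0


def capacity_with_fewer_cores(cores: int, cores_removed: int) -> float:
    """Fewer cores in use, each running 100% of the time."""
    return float(cores - cores_removed)


total_cores = 4
throttled = capacity_with_throttle(total_cores, 90.0)   # 10% drop on all cores
reduced = capacity_with_fewer_cores(total_cores, 1)      # 1 core taken away from 4
reduction_percent = 100 * (1 - reduced / total_cores)

print(f"90% CPU time on {total_cores} cores: {throttled:.1f} core-equivalents")
print(f"{total_cores - 1} of {total_cores} cores at 100%: {reduced:.1f} core-equivalents "
      f"({reduction_percent:.0f}% reduction)")
```

Under these assumptions, a 10% throttle on 4 cores leaves 3.6 core-equivalents, against 3.0 with one core removed.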
ID: 70576
kotenok2000

Joined: 22 Feb 11
Posts: 32
Credit: 226,546
RAC: 4,080
Message 70577 - Posted: 1 Mar 2024, 9:29:47 UTC
Last modified: 1 Mar 2024, 9:32:06 UTC

What about dropping CPU frequency by 10% with Ryzen Master, AMD OverDrive for older CPUs, or Intel Extreme Tuning Utility?
ID: 70577
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 70578 - Posted: 1 Mar 2024, 9:34:03 UTC - in response to Message 70576.  

At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time.
That will have a bigger impact on the throughput (tasks completed per day). A 10% drop in CPU use on all cores is all I need on my older machines to get temps I'm happy with. Taking 1 core away from 4 is the same as a 25% CPU use reduction. I prefer having the finer control of %cpu available.


If you’re happy with the reduced life of the CPU then fine, just be aware that you are making that choice.
ID: 70578
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,578,380
RAC: 15,009
Message 70579 - Posted: 1 Mar 2024, 10:54:12 UTC - in response to Message 70578.  

These are old Intel chips that are already 10 years old. I'm not worried. Intel's newer CPUs are designed to run hotter than AMD's too.
ID: 70579
Helmer Bryd

Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 70587 - Posted: 2 Mar 2024, 18:15:57 UTC
Last modified: 2 Mar 2024, 18:25:49 UTC

Yeah, a new thread on core speed, s/TS, versus watts used on different chips would be fun to see.
I don't think my little undervolted AMD 5600G 6-core with only 16 MB of L3 cache is so bad against the bigger ones.
With 5 cores using 130 W, it takes around 7 days for a wah2 8.29 task, or 4.37 kWh/task.
Running 2, 3 or 4 cores is faster per task, but not by much.
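
The kWh/task figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the 130 W draw is constant and is attributed entirely to the 5 concurrent tasks; the function names are illustrative only.

```python
# Energy per task and throughput from the figures in this post:
# 130 W, about 7 days per task, 5 tasks running at once.
# Assumption: power draw is constant and shared equally by the running tasks.


def energy_per_task_kwh(power_watts: float, days_per_task: float,
                        concurrent_tasks: int) -> float:
    """Total energy used over one task's runtime, split across concurrent tasks."""
    total_kwh = power_watts * days_per_task * 24 / 1000
    return total_kwh / concurrent_tasks


def tasks_per_day(days_per_task: float, concurrent_tasks: int) -> float:
    """Completed tasks per day when every task takes days_per_task."""
    return concurrent_tasks / days_per_task


kwh = energy_per_task_kwh(130, 7, 5)
rate = tasks_per_day(7, 5)
print(f"{kwh:.2f} kWh/task, {rate:.2f} tasks/day")
```

With these inputs the result is 4.37 kWh/task at about 0.71 tasks per day. Running fewer tasks shortens each task but can still lower tasks per day, so both numbers are worth tracking together.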
ID: 70587
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 70588 - Posted: 2 Mar 2024, 19:00:09 UTC - in response to Message 70587.  

With 5 cores using 130 W, it takes around 7 days for a wah2 8.29 task, or 4.37 kWh/task.
Running 2, 3 or 4 cores is faster per task, but not by much.


My main (Linux) machine is consuming 275 watts and running 13 BOINC processes (none of them ClimatePrediction).
The 275 watts includes the computer, the router, and the monitor.
ID: 1511241
Number of processors 	16
Memory 	 125.07 GB
Cache 	  16896 KB
Swap space 	15.62 GB
Total disk space 	488.04 GB
Free Disk Space 	480.47 GB
Measured floating point speed 	5.92 billion ops/sec
Measured integer speed 	       23.22 billion ops/sec
Average upload rate 	  194.32 KB/sec
Average download rate 	15613.09 KB/sec
Average turnaround time 	7.96 days

Every 11.0s: sensors  localhost.localdomain: Sat Mar  2 13:33:14 2024

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +75.0°C  (high = +88.0°C, crit = +98.0°C)
Core 8:        +68.0°C  (high = +88.0°C, crit = +98.0°C)
Core 2:        +66.0°C  (high = +88.0°C, crit = +98.0°C)
Core 3:        +71.0°C  (high = +88.0°C, crit = +98.0°C)
Core 5:        +70.0°C  (high = +88.0°C, crit = +98.0°C)
Core 1:        +75.0°C  (high = +88.0°C, crit = +98.0°C)
Core 9:        +74.0°C  (high = +88.0°C, crit = +98.0°C)
Core 11:       +67.0°C  (high = +88.0°C, crit = +98.0°C)
Core 12:       +65.0°C  (high = +88.0°C, crit = +98.0°C)
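
For anyone who wants to track headroom rather than read the listing by eye, here is a minimal sketch that parses output shaped like the lines above and reports the hottest core. The sample text is copied from this post; the regular expression assumes exactly this coretemp layout, and real `sensors` output differs between chips and drivers, so treat it as a starting point.

```python
import re

# Sample copied from the post above. Other hardware may print different
# labels or omit the high/crit limits, which this pattern does not handle.
SAMPLE = """\
Package id 0:  +75.0°C  (high = +88.0°C, crit = +98.0°C)
Core 8:        +68.0°C  (high = +88.0°C, crit = +98.0°C)
Core 2:        +66.0°C  (high = +88.0°C, crit = +98.0°C)
Core 3:        +71.0°C  (high = +88.0°C, crit = +98.0°C)
Core 5:        +70.0°C  (high = +88.0°C, crit = +98.0°C)
Core 1:        +75.0°C  (high = +88.0°C, crit = +98.0°C)
Core 9:        +74.0°C  (high = +88.0°C, crit = +98.0°C)
Core 11:       +67.0°C  (high = +88.0°C, crit = +98.0°C)
Core 12:       +65.0°C  (high = +88.0°C, crit = +98.0°C)
"""

NUMBER = r"-?\d+(?:\.\d+)?"
LINE_RE = re.compile(
    rf"^(?P<label>[^:]+):\s+\+?(?P<temp>{NUMBER})°C\s+"
    rf"\(high = \+?(?P<high>{NUMBER})°C, crit = \+?(?P<crit>{NUMBER})°C\)"
)


def parse_sensors(text: str) -> dict:
    """Map each label to a (temperature, high limit, critical limit) tuple."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[match["label"]] = (
                float(match["temp"]),
                float(match["high"]),
                float(match["crit"]),
            )
    return readings


readings = parse_sensors(SAMPLE)
cores = {label: r for label, r in readings.items() if label.startswith("Core")}
hottest = max(cores, key=lambda label: cores[label][0])
temp, high, _crit = cores[hottest]
headroom = high - temp
print(f"Hottest: {hottest} at {temp:.1f}°C, {headroom:.1f}°C below the 'high' limit")
```

In practice the text could come from running `sensors` through `subprocess.run(..., capture_output=True, text=True)` instead of the pasted sample.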

ID: 70588
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 70590 - Posted: 3 Mar 2024, 18:55:57 UTC

Please can new posts on this be put in here rather than in the New work thread.

Thank you.
ID: 70590
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,578,380
RAC: 15,009
Message 70612 - Posted: 5 Mar 2024, 22:54:34 UTC
Last modified: 5 Mar 2024, 22:57:54 UTC

Some performance numbers for the WaH batch 1006 in the database. For the top 10 fastest run tasks we have:

    -- The fastest task took a CPU time of 1.27 days on a 13th gen Intel i9-13900K. The user was in the United States.
    -- The next 2 took a CPU time of 1.47 days. As the user has the computer hidden, I won't post the details.
    -- The next 2 took a CPU time of 1.67 days on a 12th gen Intel i9-12900K. United States.
    -- The next 5 had CPU times ranging from 1.9 to 2.1 days on a 12th gen Intel i7-12700H. Canada.

We also record the 'completion time', the time from when the host computer received the task to when the result came back to CPDN. The fastest completion time was 5.2 days. Median completion time is currently 15 days. Median CPU time is ~12 days.

The CPU time taken by a task depends heavily on the workload on the machine. To get the fastest CPU time, it would need a single task running on as quiet a machine as possible. That's not best for throughput, but it will give the best CPU time. Interesting (for me at least) to see a laptop (the 12700H) in the top 10!

Enjoy :D


---
CPDN Visiting Scientist
ID: 70612
AndreyOR

Joined: 12 Apr 21
Posts: 317
Credit: 14,925,468
RAC: 12,903
Message 70613 - Posted: 6 Mar 2024, 8:41:53 UTC - in response to Message 70612.  

Looking around, it seems the i9-13900K is near the top for single-core performance. Multi-core too (outside of Threadrippers), but that doesn't make a difference for CPDN. Seems like a good chip.
ID: 70613
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 70614 - Posted: 6 Mar 2024, 11:08:56 UTC - in response to Message 70613.  

Might have to get a new box sometime. My Ryzen 7 3700X was fast when I got it.
ID: 70614
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,578,380
RAC: 15,009
Message 70615 - Posted: 6 Mar 2024, 14:40:19 UTC - in response to Message 70614.  

You're on the right forum if you want some encouragement!
ID: 70615
SolarSyonyk

Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 70617 - Posted: 6 Mar 2024, 16:17:15 UTC
Last modified: 6 Mar 2024, 16:18:34 UTC

I've got a pair of 3900Xs that do most of my computation (12C/24T), and I've found that I see almost no "net system throughput" improvements between 8 and 12 threads running with CPDN tasks - it may be marginally faster at 12, but not by much (mine are typically retiring 50-60G instructions per second when loaded). Going up past 12 actually reduces net system throughput. I think turbo might increase that slightly, but I generally keep it disabled to avoid the corner of "tons of extra power for a slight bit extra performance."

There doesn't seem to be any benefit to hyperthreading with CPDN tasks (making sense, they're floating point/vector engine heavy), and they seem to prefer "enough cache" - though I think there's still a ton spilling to main DRAM, based on counters on my Intel boxes.

I don't care a bit about single threaded speed for CPDN, just total system throughput. But Dave clearly needs some test chips! ;)
ID: 70617
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 70618 - Posted: 6 Mar 2024, 16:43:10 UTC - in response to Message 70617.  

I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS, even if on the main site they are rationed to avoid problems with machines that don't have enough for multiple tasks.
ID: 70618
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 70620 - Posted: 6 Mar 2024, 18:15:25 UTC - in response to Message 70618.  

I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS even if on main site they are rationed to avoid problems with machines that don't have enough for multiple tasks.


My machine has this memory at the moment.
CPU type Genuine Intel - Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors 	16
Operating System    Red Hat Enterprise Linux 8.9 (Ootpa) [4.18.0-513.18.1.el8_9.x86_64|libc 2.28]
BOINC version 	7.20.2
Memory 	125.07 GB [2933MHz DDR4]
Cache 	 16896 KB

It came with 32 GBytes but I doubled it a couple of times as prices for RAM came down.
I guess it is no longer state-of-the-art (if it ever was), but it is several years old now, so there must surely be faster machines out there now.
I cannot put faster RAM in there, but I could take it up to 512 GBytes if someone would send me the money to do it. I doubt there is much point in doing that, since my L3 cache is 16896 KBytes, which is pretty good for that kind of processor chip. I got all that RAM to run all those OIFS tasks that I have not received since last June, IIRC.
ID: 70620
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,578,380
RAC: 15,009
Message 70621 - Posted: 6 Mar 2024, 21:03:04 UTC - in response to Message 70618.  
Last modified: 6 Mar 2024, 21:03:40 UTC

I doubt it; CPU speed is what makes the biggest difference. I like the Intel chips currently: more cores than Ryzen for the same or less money, good single-core performance, better memory latency than Ryzen, plus DDR5. The mid-range i5 (or i7) is good value for money. Chip cache won't make much difference because the code is not optimized for specific cache sizes.

I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS even if on main site they are rationed to avoid problems with machines that don't have enough for multiple tasks.

---
CPDN Visiting Scientist
ID: 70621
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 70622 - Posted: 6 Mar 2024, 23:10:17 UTC

Do Linux users know about this interesting tool?

# perf stat -e cache-references,cache-misses,cycles,instructions,branches,faults
^C
 Performance counter stats for 'system wide':

     4,751,265,017      cache-references                                            
     1,957,008,106      cache-misses              #   41.189 % of all cache refs    
 1,416,865,456,289      cycles                                                      
 1,984,715,137,591      instructions              #    1.40  insn per cycle         
   273,726,331,297      branches                                                    
            50,751      faults                                                      

      25.357650625 seconds time elapsed

You start the perf program with the first line. When you think it has run long enough, you hit Ctrl-C. It then prints the results.
The machine was doing this at the time, i.e., mostly BOINC work (13 BOINC tasks):
top - 17:56:27 up 11 days,  4:21,  2 users,  load average: 13.58, 13.52, 13.51
Tasks: 483 total,  14 running, 469 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  0.1 sy, 80.6 ni, 18.6 id,  0.0 wa,  0.2 hi,  0.0 si,  0.0 st
MiB Mem : 128074.1 total,   2100.0 free,   6385.8 used, 119588.3 buff/cache
MiB Swap:  15992.0 total,  15947.2 free,     44.8 used. 118485.6 avail Mem 


My actual results are probably of no interest to readers here because none of the BOINC tasks were CPDN tasks. But if I ever get more, I will be able to see how they do.

With that workload on my machine, a little over half (about 59%) of the cache references were satisfied by the cache.
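
The two derived figures that perf prints after the `#` can be recomputed from the raw counters, which is handy when comparing runs. A minimal sketch using the counter values from the output above. Note that what `cache-references` counts depends on the CPU and kernel (it is commonly last-level cache accesses), so the hit rate here is for that counter only, not for every memory reference.

```python
# Raw counter values copied from the perf stat output in this post.
cache_references = 4_751_265_017
cache_misses = 1_957_008_106
cycles = 1_416_865_456_289
instructions = 1_984_715_137_591

miss_percent = 100 * cache_misses / cache_references
hit_percent = 100 - miss_percent
instructions_per_cycle = instructions / cycles

print(f"cache misses: {miss_percent:.3f}% of all cache refs")
print(f"cache hits:   {hit_percent:.3f}% of all cache refs")
print(f"insn per cycle: {instructions_per_cycle:.2f}")
```

This reproduces the 41.189% miss rate and 1.40 instructions per cycle shown by perf, and gives a hit rate of roughly 58.8%.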
ID: 70622