climateprediction.net (CPDN) home page
Thread 'New work discussion - 2'

ProfileConan
Avatar

Send message
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 66293 - Posted: 7 Nov 2022, 7:52:46 UTC

OpenIFS 43r3 Perturbed Surface,

has been added to the application list, what does it mainly cover?

Thanks
Conan
ID: 66293 · Report as offensive
Glenn Carver

Send message
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,052
RAC: 15,294
Message 66294 - Posted: 7 Nov 2022, 8:02:25 UTC - in response to Message 66293.  

OpenIFS 43r3 Perturbed Surface, has been added to the application list, what does it mainly cover? Thanks
Conan
It's a modified version of the default OpenIFS in which the surface parameters, instead of the atmospheric ones, can be modified. There are two large (~3000) ensembles planned for this month where each member of the ensemble has slightly different parameters. I'll ask the scientists to write something to the forum about it.
ID: 66294 · Report as offensive
ProfileConan
Avatar

Send message
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 66295 - Posted: 7 Nov 2022, 12:34:25 UTC - in response to Message 66294.  

OpenIFS 43r3 Perturbed Surface, has been added to the application list, what does it mainly cover? Thanks
Conan
it's a modified version of the default OpenIFS in which the surface parameters, instead of the atmospheric ones, can be modified. There are two large (~3000) ensembles planned for this month where each member of the ensemble has slightly different parameters. I'll ask the scientists to write something to the forum about it.


Thanks for that Glenn, appreciated.

Conan
ID: 66295 · Report as offensive
Jim1348

Send message
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 66301 - Posted: 8 Nov 2022, 22:22:29 UTC

Some info on memory requirements on upcoming OpenIFS forecasts. As mentioned previously, we're aiming to increase the model resolution to be more scientifically valuable. These resolutions come with higher memory requirements:

N80 grid, 125 km spacing. Peak RAM = 8 GB
O96 grid, 100 km spacing. Peak RAM = 10 GB

N128 grid, 78 km spacing. Peak RAM = 19 GB
O160 grid, 61 km spacing. Peak RAM = 24 GB

https://www.cpdn.org/forum_thread.php?id=9149&postid=66077#66077

I don't recall seeing any discussion on how often the peak is reached, or what happens when a machine's memory is exceeded.
For example, if you have 64 GB, will BOINC send you the right assortment of work units to make best use of that, or do you just wait until more memory is available?

And if the peak is not reached very often, can you run more than two of the 24 GB work units at a time?
ID: 66301 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 66302 - Posted: 8 Nov 2022, 22:58:19 UTC - in response to Message 66301.  

These models are still in testing.
Please stop peeking through the windows to see what's happening. :)
ID: 66302 · Report as offensive
Glenn Carver

Send message
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,052
RAC: 15,294
Message 66303 - Posted: 8 Nov 2022, 23:08:33 UTC - in response to Message 66301.  

The peak resident memory is reached every OpenIFS model timestep. Max virtual memory is typically about 15% higher.

The server will not send tasks to machines that cannot safely run them. And we'll probably specify some headroom on the task requirement.

If there's not enough RAM then it'll start swapping, assuming the machine has swap space, which will kill performance.

HTH
ID: 66303 · Report as offensive
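
The memory figures above lend themselves to a quick back-of-the-envelope check. The following is only a rough sketch, not anything the project or the BOINC scheduler actually runs: it takes the peak RAM figures quoted earlier in the thread and estimates how many tasks of one grid would fit in a machine's RAM without swapping. The `headroom` fraction (borrowed loosely from the ~15% virtual memory remark) and the `reserved_gb` set aside for the operating system are assumptions chosen for illustration.

```python
# Peak resident RAM per OpenIFS grid in GB, as quoted in this thread.
PEAK_RAM_GB = {"N80": 8, "O96": 10, "N128": 19, "O160": 24}


def max_concurrent_tasks(total_ram_gb, grid, headroom=0.15, reserved_gb=4.0):
    """Estimate how many tasks of one grid fit in RAM without swapping.

    headroom and reserved_gb are illustrative assumptions, not project
    settings: headroom pads each task's peak, reserved_gb is kept back
    for the operating system and other programs.
    """
    per_task_gb = PEAK_RAM_GB[grid] * (1.0 + headroom)
    usable_gb = total_ram_gb - reserved_gb
    if usable_gb <= 0:
        return 0
    return int(usable_gb // per_task_gb)


if __name__ == "__main__":
    # Example: the 64 GB machine asked about earlier in the thread.
    for grid in PEAK_RAM_GB:
        print(grid, max_concurrent_tasks(64, grid))
```

Because the peak is reached on every model timestep, sizing against the peak rather than an average seems the safer choice; with these assumed margins a 64 GB machine comes out at two O160 tasks.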
Jim1348

Send message
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 66306 - Posted: 9 Nov 2022, 0:51:05 UTC - in response to Message 66302.  

These models are still in testing.
Please stop peeking through the windows to see what's happening. :)

I am setting up several machines and swapping memory between them for an optimum allocation.
I think I will wait until next Spring when you have it all worked out.
ID: 66306 · Report as offensive
ProfileDave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 66310 - Posted: 9 Nov 2022, 9:47:15 UTC - in response to Message 66301.  

And if the peak is not reached very often, can you run more than two of the 24 GB work units at a time?
Shouldn't be a problem so long as you have a reasonable amount of swap, although using swap does slow things down a lot, so you may only just get away with it. I don't remember how often the peak occurred, but it didn't last all that long, so it may even be possible to run three or four so long as they aren't all peaking at once. Once the multi-core tasks appear in testing, I should be able to answer some of these questions with a bit more certainty.
ID: 66310 · Report as offensive
Glenn Carver

Send message
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,052
RAC: 15,294
Message 66311 - Posted: 9 Nov 2022, 9:53:11 UTC - in response to Message 66310.  

And if the peak is not reached very often, can you run more than two of the 24 GB work units at a time?
Shouldn't be a problem so long as you have a reasonable amount of swap. Using swap does slow things down a lot. You may get away with it. I don't remember how often but the peak didn't last for all that long so it may even be possible to get away with three or four so long as they aren't peaking all at once. Once the multi-core tasks appear in testing, I should be able to answer some of these questions with a bit more certainty.
See my reply to the original post (https://www.cpdn.org/forum_thread.php?id=9149&postid=66303). Don't rely on swapping to accommodate multiple copies of the model. It will greatly impact your PC and significantly impact the task performance. It's a bad idea.

Aim for maximum throughput not maximum number of tasks you can run at the same time!
ID: 66311 · Report as offensive
ProfileDave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 66313 - Posted: 9 Nov 2022, 11:51:30 UTC - in response to Message 66311.  

Aim for maximum throughput not maximum number of tasks you can run at the same time!


Agreed. I was talking about what was possible. On the laptop, from memory, maximum throughput was running two tasks at once where it was only rarely that both peaked in memory usage at once. With only 8 real cores, I am pretty certain that running two won't be any benefit over running one at a time. I play from time to time to try and learn more about how things work together but having done that nearly always aim for maximum throughput.
ID: 66313 · Report as offensive
Mr. P Hucker

Send message
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66314 - Posted: 9 Nov 2022, 12:57:50 UTC - in response to Message 66311.  

See my message reply to the original post here https://www.cpdn.org/forum_thread.php?id=9149&postid=66303 Don't rely on swapping to accommodate multiple copies of the model. It will greatly impact your PC and significantly impact the task performance. It's a bad idea.

Aim for maximum throughput not maximum number of tasks you can run at the same time!
And CPU cache? Running Rainfall tasks for WCG (possibly a similar program, as it's weather prediction?), if I use all the cores, I get half the throughput on machines without dual channel RAM, and 3/4 on those with dual channel. If I put some of the cores on other tasks, everything speeds up, as though the Rainfall tasks like to flood the cache.
ID: 66314 · Report as offensive
Jean-David Beyer

Send message
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 66315 - Posted: 9 Nov 2022, 14:07:26 UTC - in response to Message 66313.  

My machine has been up over 10 days since the last reboot for system updates. I use very little swap. It is usually running 11 Boinc jobs at the same time. Seven of these are WCG because those are the only ones that give me work. The other four are universe and milkyway that I run only to keep the cores busy.

$ free -hw
              total        used        free      shared     buffers       cache   available
Mem:           62Gi       5.6Gi       703Mi       141Mi       606Mi        55Gi        55Gi
Swap:          15Gi       119Mi        15Gi


But for some Boinc jobs, the critical factor is the processor cache. My machine has quite a bit of processor cache, but I infer that more would be better.

Computer 1511241

CPU type 	GenuineIntel
Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors 	16

Operating System 	Linux Red Hat Enterprise Linux
Red Hat Enterprise Linux 8.6 (Ootpa) [4.18.0-372.26.1.el8_6.x86_64|libc 2.28]
BOINC version 	7.20.2
Memory 	62.28 GB
Cache 	16896 KB   <---<<<
Swap space 	15.62 GB

ID: 66315 · Report as offensive
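
For anyone who wants the same numbers as the `free -hw` output in script form, here is a minimal sketch that reads them from `/proc/meminfo` on Linux. The helper names (`parse_meminfo`, `available_gib`, `swap_used_mib`) are made up for this example; the field names `MemAvailable`, `SwapTotal` and `SwapFree` are the ones the kernel reports, in kB.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_gib(fields):
    """Memory available to new work without swapping, in GiB."""
    return fields["MemAvailable"] / (1024 * 1024)


def swap_used_mib(fields):
    """Swap currently in use, in MiB."""
    return (fields["SwapTotal"] - fields["SwapFree"]) / 1024


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # Linux only
        info = parse_meminfo(meminfo.read_text())
        print(f"available: {available_gib(info):.1f} GiB")
        print(f"swap used: {swap_used_mib(info):.0f} MiB")
```

A swap-used figure that keeps growing while tasks run would be one sign that too many large tasks are sharing the machine.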
Mr. P Hucker

Send message
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66316 - Posted: 9 Nov 2022, 14:18:26 UTC - in response to Message 66315.  

Unfortunately you can't upgrade CPU cache, but you can upgrade the RAM it uses when the cache overflows. Dual channel (as in pairing the RAM chips) helps immensely with big Boinc tasks. And I guess the RAM speed as well (although that's usually limited by the RAM controller in the CPU).
ID: 66316 · Report as offensive
Jean-David Beyer

Send message
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 66317 - Posted: 9 Nov 2022, 15:02:17 UTC - in response to Message 66316.  

Unfortunately you can't upgrade CPU cache, but you can upgrade the RAM it uses when the cache overflows. Dual channel (as in pairing the RAM chips) helps immensely with big Boinc tasks.


My machine came with two of these, and I added two more:
I do not remember if there is room for another four or not. (I must power down the system to open the box.)

Dell Memory Upgrade - 16GB - 2RX8 DDR4 RDIMM 2933MHz

Data Integrity Check ECC
Speed 2933 MHz (PC4-23400)
Dual rank, registered
ID: 66317 · Report as offensive
Jean-David Beyer

Send message
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 66319 - Posted: 9 Nov 2022, 16:02:55 UTC - in response to Message 66318.  

All very well if your machine runs Windows. But my main machine runs only Linux and I do not wish to get a license for Windows and dual boot my machine to run both.

"CPU-Z is a freeware system profiling and monitoring application for Microsoft Windows and Android that detects the central processing unit, RAM, motherboard chip-set, and other hardware features of a modern personal computer or Android device. "

(This is not a complaint, just an observation.)
ID: 66319 · Report as offensive
Mr. P Hucker

Send message
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66320 - Posted: 9 Nov 2022, 16:25:58 UTC - in response to Message 66319.  

All very well if your machine runs Windows. But my main machine runs only Linux and I do not wish to get a license for Windows and dual boot my machine to run both.

"CPU-Z is a freeware system profiling and monitoring application for Microsoft Windows and Android that detects the central processing unit, RAM, motherboard chip-set, and other hardware features of a modern personal computer or Android device. "

(This is not a complaint, just an observation.)
I assumed there would be a linux version, since they even made an Android one which I use. And it isn't exactly freeware, it's adware. I guess there's a linux equivalent somewhere (one of the reasons I use Windows - so I don't have to hunt!) - in fact aren't there command lines in linux to tell you motherboard model, SPD info, etc?
ID: 66320 · Report as offensive
ProfileDave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 66321 - Posted: 9 Nov 2022, 16:50:18 UTC

sudo dmidecode --type memory
Should give all the information needed under Linux.
ID: 66321 · Report as offensive
Jean-David Beyer

Send message
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 66322 - Posted: 9 Nov 2022, 17:24:06 UTC - in response to Message 66320.  

I assumed there would be a linux version, since they even made an Android one which I use. And it isn't exactly freeware, it's adware. I guess there's a linux equivalent somewhere (one of the reasons I use Windows - so I don't have to hunt!) - in fact aren't there command lines in linux to tell you motherboard model, SPD info, etc?


Do you mean like this? (There are others that discuss the CPU and its processor cache, etc.)

# dmidecode -t memory 
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.

Handle 0x0009, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Single-bit ECC
        Maximum Capacity: 3 TB
        Error Information Handle: Not Provided
        Number Of Devices: 8

Handle 0x000A, DMI type 17, 84 bytes
Memory Device
        Array Handle: 0x0009
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM3
        Bank Locator: Not Specified
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2933 MT/s
        Manufacturer: Samsung
        Serial Number: 43E4FD43
        Asset Tag: 04333361
        Part Number: M393A2K43CB2-CVF    
        Rank: 2
        Configured Memory Speed: 2934 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: NA
        Module Manufacturer ID: Bank 1, Hex 0xCE
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 16 GB
        Cache Size: None
        Logical Size: None
:

There are seven more, but four say no module installed. So I seem to be using 4 memory modules and I could add up to four more (in pairs) if I were rich enough and needed more RAM.
ID: 66322 · Report as offensive
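
Counting populated and empty slots by eye gets tedious with eight "Memory Device" blocks, so here is a small sketch that does it from saved `dmidecode --type memory` output. It assumes the layout shown in the post above, with empty slots reported as `Size: No Module Installed`. The function names are invented for this example, and the `SAMPLE` text is a trimmed excerpt with a hypothetical empty slot (DIMM4) added for illustration.

```python
SAMPLE = """\
Handle 0x0009, DMI type 16, 23 bytes
Physical Memory Array
        Maximum Capacity: 3 TB
        Number Of Devices: 8

Handle 0x000A, DMI type 17, 84 bytes
Memory Device
        Size: 16 GB
        Locator: DIMM3
        Speed: 2933 MT/s

Handle 0x000B, DMI type 17, 84 bytes
Memory Device
        Size: No Module Installed
        Locator: DIMM4
"""


def parse_memory_devices(text):
    """Return one dict of key -> value per 'Memory Device' block."""
    devices = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Handle "):
            current = None  # a new DMI record starts here
        elif stripped == "Memory Device":
            current = {}
            devices.append(current)
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return devices


def size_gb(value):
    """Convert a Size field such as '16 GB' to GB, or None if empty."""
    parts = value.split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None  # e.g. 'No Module Installed'
    factor = {"MB": 1 / 1024, "GB": 1, "TB": 1024}.get(parts[1])
    return None if factor is None else int(parts[0]) * factor


def summarize(devices):
    """Return (populated slots, total slots, installed GB)."""
    sizes = [size_gb(d.get("Size", "")) for d in devices]
    installed = [s for s in sizes if s is not None]
    return len(installed), len(devices), float(sum(installed))


if __name__ == "__main__":
    populated, slots, total = summarize(parse_memory_devices(SAMPLE))
    print(f"{populated} of {slots} slots populated, {total:.0f} GB installed")
```

To use it on a real machine, one option is to save the output first with `sudo dmidecode --type memory > memory.txt` and pass the file's contents to `parse_memory_devices`.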
Glenn Carver

Send message
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,052
RAC: 15,294
Message 66336 - Posted: 10 Nov 2022, 11:58:05 UTC

FYI: final testing on the dev site is about to begin for an OpenIFS experiment of ~5000 tasks. Expect them in production before the end of Nov.
ID: 66336 · Report as offensive
Glenn Carver

Send message
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,052
RAC: 15,294
Message 66338 - Posted: 10 Nov 2022, 15:51:57 UTC - in response to Message 66337.  

I think we need a new thread dedicated to discussing hardware.. ;)
ID: 66338 · Report as offensive

©2024 cpdn.org