Message boards : Number crunching : UK Met Office HadAM4 at N144 resolution

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61743 - Posted: 20 Dec 2019, 0:39:55 UTC - in response to Message 61742.  

The L3 cache issue only applies to the N216 models, which are higher resolution.

You don't need an "app_config" to limit processor cores. Use the option on your Account page under Computing preferences, called "Use at most NN% of the CPUs".

The L3 cache seems to be fixed in the silicon, so that each quarter of the processors has its own portion of L3.
I did find a picture of this once, but I don't remember where.

As for the i7-4770: if you look up the specs, you'll find that Intel chips have very little L3 cache.

The best way to find out is to do what several people described doing in the "N216" thread: try different combinations and see what happens.
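
If you want to turn "leave a thread or two free" into that percentage, here's a rough Python sketch. It assumes the client simply rounds the CPU count down from the percentage - I haven't verified the exact rounding, so sanity-check the result:

import math

# Smallest whole percentage that still uses (total - free) threads,
# assuming BOINC rounds the resulting CPU count down (unverified).
def cpu_percent(total_threads: int, free_threads: int) -> int:
    used = total_threads - free_threads
    return math.ceil(used / total_threads * 100)

print(cpu_percent(16, 2))   # -> 88, i.e. "Use at most 88% of the CPUs"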
ID: 61743
alanb1951

Joined: 31 Aug 04
Posts: 37
Credit: 9,581,380
RAC: 3,853
Message 61744 - Posted: 20 Dec 2019, 7:00:06 UTC - in response to Message 61742.  
Last modified: 20 Dec 2019, 7:02:23 UTC

@Wolfman1360

Les is right about N144 (HadAM4) not being anywhere near as much of a "nuisance" as N216 (HadAM4h)! He's also right about the i7 and its cache limitations.

However, you mentioned running WCG work on the same system... You may or may not be aware of the issues regarding MIP1 and L3 cache (it likes about 4 or 5 MB too!). Some of the other projects also work on quite large memory grids but don't seem to put as much stress on the L3 cache (though the cumulative effect might mount up if you run several at once).

The best WCG projects to run alongside CPDN are MCM1 and anything VINA-based (e.g. FAHV if/when it returns, SCC1 when it returns, and the recently completed OET1 and ZIKA projects); those tend to require less L3 cache, and their performance doesn't seem to degrade as dramatically (or put excessive stress on other applications). The worst is definitely MIP1!

Good luck deciding on a workload mix that suits you on that Ryzen, and don't try to run more than one HadAM4h or MIP1 on that 4770! (I have an i7-7700K (8 MB L3 cache) and it shows severe performance degradation if I let it run more than one of either of those!)

Cheers - Al.

[Edited for sentence order...]
ID: 61744
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61745 - Posted: 20 Dec 2019, 7:21:03 UTC - in response to Message 61744.  

Thank you for the tips; this is exactly what I was looking for. I didn't want to limit cores on the project itself, since I do want 100% of the CPU used - just not all of it by CPDN.
Will I see much of a performance gain by allowing all cores to crunch away at N144 with hyperthreading enabled, or will this just be twice the time for little if any gain?
Right now I have N144 limited to 8 concurrent on the Ryzen and 4 on the 4770 and 2600, thus theoretically eliminating HT, and 4 N216 on the Ryzen with 2 allowed on the two Intels. I'll change this to 1 per your advice, thank you. Rosetta is currently using the remaining 8 threads of the Ryzen and 4 of the other two. I've been a strong supporter of WCG since 2016 and need a bit of a break from there, though I will try to get some of the ARP.
Once I get a new cooler for my FX-8350 it will also be added. Nice space heater for the cold Canadian winter, eh? :)

PS: Since I'm so far out of the loop, where is a good place to learn the differences between HadAM4 and HadAM4h? Do they have similar designations, like SAM and EU in weather@home?
Sorry, I'm not phrasing that question properly at all. I forget how to decode each task - I know there was a method of figuring out the specifics.
Regardless, it feels fantastic to contribute to a project I have a huge interest in. I've always been a weather enthusiast.
ID: 61745
alanb1951

Joined: 31 Aug 04
Posts: 37
Credit: 9,581,380
RAC: 3,853
Message 61757 - Posted: 21 Dec 2019, 4:25:55 UTC - in response to Message 61745.  

@Wolfman1360
Will I see much of a performance gain by allowing all cores to crunch away at N144 with hyperthreading enabled, or will this just be twice the time for little if any gain?
Right now I have N144 limited to 8 concurrent on the Ryzen and 4 on the 4770 and 2600, thus theoretically eliminating HT, and 4 N216 on the Ryzen with 2 allowed on the two Intels. I'll change this to 1 per your advice, thank you. Rosetta is currently using the remaining 8 threads of the Ryzen and 4 of the other two. I've been a strong supporter of WCG since 2016 and need a bit of a break from there, though I will try to get some of the ARP.

Regarding cutting down the numbers - my comment about cache use for CPDN was specific to N216 tasks (which run HadAM4h - I presume the h is for high(er) resolution!). You would probably still be OK with 2 N144 tasks on an 8 MB-cache machine most of the time, cutting down to one if more N216 work turns up!

As for multiple tasks and hyperthreading - that one is a lot more complex, because unless you can guarantee which CPU thread runs a particular application, you can't be sure whether you might get two heavy floating-point applications on the same core (at which point you may well see a fairly substantial throughput drop per individual process, even if the applications aren't particularly hard on the L3 cache...). However, if you get lucky and one of the applications is mostly shuffling stuff around, doing pattern-matching or integer arithmetic, a core may be able to keep both threads fairly busy!

I reckon on about a 15% drop in individual processes, but even then, if I'm running 6 tasks on my 7700K or 14 tasks on my 3700X (instead of 4 or 8 respectively) I get more work done. If the drop were 40+% I probably wouldn't use the hyperthreads...
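
To put numbers on that: on the 7700K, 6 tasks each at 85% of full speed is 6 × 0.85 = 5.1 task-equivalents of throughput, against 4.0 with the hyperthreads idle - roughly 27% more work per hour. At a 40% drop it would be 6 × 0.60 = 3.6, i.e. less than the 4.0 the physical cores give on their own.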

Whatever your workload mix, there will probably be a point at which using more cores/threads actually results in less science done per hour - there was a thread over in the SETI forums about the best CPU workload on a Threadripper, and if I remember rightly the conclusion was that if more than 75% of the threads were used, the performance drop became unacceptable. And I have noticed that if I don't leave two threads clear for operating-system stuff on my 3700X, performance takes a noticeable hit because all the I/O services and suchlike cause extremely high numbers of cpu-migrations (expensive) as well as the expected context switches (reasonably efficient)...

I have lots of numbers about throughput for various workloads, but there really is no simple (or brief) answer to "what will happen if..." - you'll have to try things out. And if you are interested enough, and you have a Linux system on which you can get root access, you can get hold of the performance tools and check throughput, context switches, cache accesses, et cetera for yourself!
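
For example, something like this attaches the counters to one running science app for 30 seconds (a minimal sketch: it assumes the perf tool from your distribution's linux-tools package, needs root, and the event names should be checked against "perf list" on your own kernel):

import subprocess, sys

# Attach perf counters to one running science app for 30 seconds.
pid = sys.argv[1]   # e.g. the PID reported by "pgrep hadam4"
result = subprocess.run(
    ["perf", "stat",
     "-e", "context-switches,cpu-migrations,cache-references,cache-misses",
     "-p", pid, "sleep", "30"],
    capture_output=True, text=True,
)
print(result.stderr)   # perf writes its counter summary to stderr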

Good luck arriving at a workload mix that meets your objectives!

Cheers - Al.

P.S. My best machine for running individual processes is an i5-7600 (4 cores, no hyperthreading), which runs tasks about 10% faster than either my 7700K or 3700X despite running at about a 12% lower CPU clock speed. It only does WCG work, but I only allow 1 MIP1, don't run ARP1, and use only 3 cores for BOINC.
ID: 61757
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61758 - Posted: 21 Dec 2019, 7:26:51 UTC - in response to Message 61757.  

This is very interesting stuff. Thank you. I'll experiment and see what works best.
All three of these machines (i7-2600 and 4770, Ryzen 1700X) have bare-minimum Linux installs and are used exclusively for crunching, if that makes any difference.
Another question, from doing more reading around the forum: I'm seeing people give various numbers like 13 sec/TS or 20 sec/TS. How do I go about finding this, and will it give a good indication of what works better and what doesn't?
Still trying to figure out how to decode this. I know there's a way of getting the various information; I just seem to forget what each number means.
hadam4_a1r8_209810_6_856_011963991

For instance, a1r8. I'm fairly certain 856 is the current batch, but apart from that I'm not sure.
PS: According to BOINC, on this Ryzen the progress is right around 0.720% per hour.

And this one, on the 4770, is climbing at right around 1.08% per hour.
hadam4_a2ij_210010_6_853_011943294
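
(If my arithmetic is right, that's about 100 / 0.72 ≈ 139 hours - just under six days - per task on the Ryzen, versus about 100 / 1.08 ≈ 93 hours, a bit under four days, on the 4770.)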

Should I cut back on the number of WUs being run at once on the Ryzen, or is this par for the course? I'm assuming there are variations within each batch of tasks too, so getting units from the same batch on different machines would be the only fair way to judge?

Sorry if I'm delving too deeply into this or overthinking things.
ID: 61758
lazlo_vii

Joined: 11 Dec 19
Posts: 108
Credit: 3,012,142
RAC: 0
Message 61761 - Posted: 21 Dec 2019, 18:52:05 UTC - in response to Message 61758.  
Last modified: 21 Dec 2019, 18:59:00 UTC

Wolfman, I am new here too so I can't offer a lot of advice yet. What I can say is:

The N144 models will send a "trickle update" to the servers every 16.66% of their total progress. Once you get a few trickles you can really start to judge your performance; that is where you can find your sec/TS stats for a model. I think sec/TS stands for "seconds per time step", but I could be wrong.

On my desktop system I run two jobs for CPDN and two jobs for LHC@home, with each project running in its own container. On my server I run six jobs for CPDN, divided between two containers that each handle three jobs. If you look at my forum profile and then click on View, you can see my containers shown as regular computers. In that list, climate1 and climate2 are the ones on my server and climate3 is on my desktop. You can then click on Tasks for any of my computers and see the work in progress as well as the work that has finished. You can do the same with any forum member who doesn't check "Hide my computers" in their profile setup.

When looking at the stats of a model on hardware similar to yours, bear in mind that each model is a unique simulation and some will run faster than others. You will want to look at quite a few to get an idea of what the real averages are.

Another thing I would suggest is that you do not run jobs on 100% of your CPU. BOINC gives these tasks the lowest possible priority in Linux, and running at 100% means that jobs will run slower because your CPU is preempting them so the OS can do its tasks too. So try to leave one or two threads open for the OS to begin with, and everything will run more smoothly.
ID: 61761
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61764 - Posted: 21 Dec 2019, 22:38:34 UTC - in response to Message 61761.  

Fantastic - I can see that on your profile now, though the computer names don't show up, just the IDs. Those Ryzens seem to be completing N144s in around 3 days or so on average. How do the containers work, and is there any benefit to using them over conventional BOINC under Ubuntu?
I'll set my Ryzen to 93% CPU usage so the OS has a thread left for itself and see how that goes. Unfortunately, in its odd scheduling wisdom, BOINC has decided right now to let Rosetta take over crunching on just about everything. I've always wanted to give LHC a try; I'm just not sure which projects over there would be a good second to CPDN. I'm always leery of mixing multithreaded applications with single-threaded ones - it never ends well, and the BOINC scheduler gets even more confused by such drastically different runtimes.
Are the N216s different as to when they send a trickle? And I'm assuming WAH under Windows is likewise different, so I will now keep a close eye on that too.
ID: 61764
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 61766 - Posted: 21 Dec 2019, 23:51:13 UTC - in response to Message 61764.  

Are the N216s different as to when they send a trickle? And I'm assuming WAH under Windows is likewise different, so I will now keep a close eye on that too.

The N216s also trickle each model month, just like the N144s. However, it takes well over a day of computing time (or more, depending on computer speed and the number of models running on a PC) to finish a month and produce a trickle.
ID: 61766
lazlo_vii

Joined: 11 Dec 19
Posts: 108
Credit: 3,012,142
RAC: 0
Message 61767 - Posted: 22 Dec 2019, 0:11:32 UTC - in response to Message 61764.  
Last modified: 22 Dec 2019, 0:13:48 UTC

The reason I started using containers was so I could avoid using the BOINC scheduler at all. There are some other big advantages over and above that too. However, it adds a layer or two of complexity to monitoring and managing what your systems are doing. If you are new to Linux in general, I would advise against trying to use them until you have set up a test environment and learned a little more about system administration. I am not saying that containers are hard (I find them easier than trying to get BOINC to do what I want), but using them without frustration requires some prerequisite knowledge of Linux network configuration and the command line. In the coming weeks I hope to post a small guide about using containers and BOINC on these forums; for now it is still in the planning phase.

You are correct in guessing that N144 and N216 jobs have different trickle points. You can work out when trickles will occur for any job by looking at its name:

N144 job name: hadam4_a0ul_209410_6_853_011941136_1

N216 job name: hadam4h_a1bz_201111_4_843_011909333_1

In the N144 job name, the 6 in the fourth field signifies a job that models six months of weather, while the N216 job has a 4 in the same position, signifying a four-month simulation. All models send a trickle up after reaching a monthly milestone in the simulation. That means N216 models trickle at every 25% (100% divided by 4) of total simulation progress, whereas the N144s trickle at every 16.66% (100% divided by 6). I do not remember what thread I read that information in, or I would link to it for you.

I hope you find that info useful.
ID: 61767
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61775 - Posted: 22 Dec 2019, 19:24:01 UTC

Yes, that info is very useful - thank you both.
I'm definitely new to running, let alone managing, Linux, so I'll just see where this gets me for now. I know just enough to get myself into or out of trouble - and sometimes not even that.
ID: 61775
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 61799 - Posted: 24 Dec 2019, 12:58:12 UTC - in response to Message 61758.  
Last modified: 24 Dec 2019, 12:59:26 UTC



hadam4_a1r8_209810_6_856_011963991

For instance, a1r8. I'm fairly certain 856 is the current batch, but apart from that I'm not sure.

As far as I can tell, the a1r8 is the address of the cell being computed; next is the yyyymm of the start of the period, then the number of months being worked. The last three would be the batch number, the item number within the batch, and the number of previous fails for this item.

I’m not sure any of this will help working out a computation mix, I think that would have to be done at a higher level, say how many hadam4 against how many hadam4h with how many mip / mcm / rosetta4 / Rosetta mini / etc.
ID: 61799
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61801 - Posted: 24 Dec 2019, 19:27:51 UTC - in response to Message 61799.  

Thanks. That does answer my question.
I'll just experiment and see what works best for my specific setup, I suppose.
ID: 61801
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 61802 - Posted: 24 Dec 2019, 19:46:51 UTC

OK, now for a silly question born of ignorance: would the total number of calls to the L3 cache be a reliable indicator of the amount of science being done across a mix of work units?

I would guess it would be within a given flavour of WU, but I'm not convinced it will be when the mix changes.

My thinking is that I can get an instant(ish) reading of LLC (last-level cache, i.e. L3) hits and misses, and if I map the sum against the accumulation time (usually I give it around 30 seconds), together with a record of what was running, it might indicate good and bad mixes.

Otherwise I’m reduced to watching daily totals and with maybe 100 work units a day going through the mill I’d have to change my preferences to force one mix and then another to compare them, not a good solution.
ID: 61802
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61803 - Posted: 24 Dec 2019, 19:53:02 UTC

The L3 cache issue only applies to the higher resolution N216 models, which is why we didn't discuss it in this thread.
ID: 61803
wolfman1360

Joined: 18 Feb 17
Posts: 81
Credit: 14,024,464
RAC: 5,225
Message 61804 - Posted: 24 Dec 2019, 21:11:53 UTC

Christmas Eve is very quiet here, so I dug through my parts collection and, assuming all goes well, will be adding an old 130 W Xeon W3520 to the project. Since L3 cache isn't an issue with the N144s, I figure it can't hurt to add yet another machine - the Windows tasks seem to go like flies, but Linux, on the other hand, seems to be lacking in machines.

Just an open motherboard, power supply, RAM and an old (I think 2 TB) disk on the table to save on space... I think I have an i7 from the same era, a 920 or 950, but I'm lacking a cooler and extra RAM for it.
ID: 61804
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 61805 - Posted: 24 Dec 2019, 23:08:02 UTC - in response to Message 61803.  

The L3 cache issue only applies to the higher resolution N216 models, which is why we didn't discuss it in this thread.


I don't understand - surely any program that calls for data not held within the code will first look in the L3 cache? Are you suggesting that only the N216 work units hold data on disk?
ID: 61805
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61807 - Posted: 25 Dec 2019, 0:30:42 UTC - in response to Message 61805.  

Back when I started the "other thread", I asked the project people if intense use was also made of the L3 cache by the N144 models, and was told no.
That's why the more recent AMD Ryzen processors do better with the N216 models: they have a larger L3 cache, while the Intel processors only have small ones.

The disk cache has nothing to do with it.
ID: 61807
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 61814 - Posted: 25 Dec 2019, 15:04:58 UTC - in response to Message 61807.  

Back when I started the "other thread", I asked the project people if intense use was also made of the L3 cache by the N144 models, and was told no.
That's why the more recent AMD Ryzen processors do better with the N216 models: they have a larger L3 cache, while the Intel processors only have small ones.

(1) That is why allowing users to select the projects would be helpful. We can run more N144 than N216 on most machines, but at present we have to limit everything to what the N216 case allows.
(2) What about OpenIFS? I would like to run as many as my main memory permits, but if we are limited by cache, that may not be feasible.
ID: 61814
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 61839 - Posted: 26 Dec 2019, 18:09:13 UTC

I was interested in what effect memory speed/bandwidth might have on the N144 models. While this is not a complete or ideal test, I ran two N144 models on my 4790K and changed the memory speed to measure the difference in sec/TS. The PC ran at 4.4 GHz throughout and had DDR3-2400-capable memory in it.

1600 MHz 8.23 sec/TS
1866 MHz 7.80 sec/TS (5.2% faster than 1600 MHz)
2133 MHz 7.58 sec/TS (7.9% faster than 1600 MHz)
2400 MHz 7.45 sec/TS (9.5% faster than 1600 MHz)

How this would change running 4 at a time, or running 2 or more N216 models, is unknown at this time. I was just curious.
ID: 61839
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 61840 - Posted: 26 Dec 2019, 23:12:46 UTC
Last modified: 26 Dec 2019, 23:13:30 UTC

FWIW, since I can't choose the projects via the CPDN preferences, I can at least choose how many to run at a time.
Here is an "app_config.xml" file that should work as a starter. It limits N144 to six at a time, and N216 to four.

<app_config>
   <app>
      <name>hadam4</name>
      <user_friendly_name>UK Met Office HadAM4 at N144 resolution</user_friendly_name>
      <max_concurrent>6</max_concurrent>
   </app>
   <app>
      <name>hadam4h</name>
      <user_friendly_name>UK Met Office HadAM4 at N216 resolution</user_friendly_name>
      <max_concurrent>4</max_concurrent>
   </app>
</app_config>

As most (but not all) know, you create it in a text editor, save it as "app_config.xml", and place it in the CPDN project folder.
Then you activate it by restarting BOINC. I am trying it out now and don't see any mistakes thus far, but let me know.

It will do for a start, depending on the CPU. When OpenIFS comes along, its app name will be found in the "client_state.xml" file and can be added accordingly.
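
If you're not sure what the app names are, here's a minimal Python sketch that lists them by scanning client_state.xml (the path below is the Debian/Ubuntu default BOINC data directory - adjust it for your install):

import xml.etree.ElementTree as ET

# List app names and friendly names known to the client.
tree = ET.parse("/var/lib/boinc-client/client_state.xml")
for app in tree.getroot().iter("app"):
    print(app.findtext("name"), "-", app.findtext("user_friendly_name"))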
ID: 61840