Thread 'UK Met Office HadAM4 at N216 resolution'

Author	Message
Dave Jackson Volunteer moderator Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944	Message 61311 - Posted: 21 Oct 2019, 9:44:59 UTC - in response to Message 61306. It looks like the cache is the culprit. This will slow down those 64 and 128 core machines. Unless they're just crashing them because of the missing lib. Heaven forbid that lack of cache memory should slow down their crashing! ID: 61311 · Reply Quote

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61312 - Posted: 21 Oct 2019, 11:23:57 UTC - in response to Message 61306. I wonder why they put so much cache in my relatively slow Processor. I am glad they did. My kernel is not all that old: 2019 Sep 17 09:53 vmlinuz-2.6.32-754.23.1.el6.x86_64 CPU type GenuineIntel Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz [Family 6 Model 45 Stepping 7] Number of processors 4 Operating System Linux 2.6.32-754.23.1.el6.x86_64 BOINC version 7.2.33 Memory 15.5 GB Cache 10240 KB Swap space 3.91 GB Total disk space 117.21 GB Free Disk Space 103.25 GB Measured floating point speed 1.27 billion ops/sec Measured integer speed 3.52 billion ops/sec Average upload rate 2174.4 KB/sec Average download rate 9265.96 KB/sec ID: 61312 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61313 - Posted: 21 Oct 2019, 12:12:11 UTC Last modified: 21 Oct 2019, 12:41:37 UTC I am trying a single N216 on my Ryzen 3700x, and after 3 1/2 hours the estimated completion time is 7.5 days. That is pretty good, though how many I will be able to run is to be determined. The good news is that people won't need so much memory, especially for the Open IFS. They will be limited by the cache. (I have to clear out some WCG work before I get back to it in a couple of days. That includes a few MIP1, and I don't want them to interfere.) ID: 61313 · Reply Quote

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61323 - Posted: 22 Oct 2019, 2:14:16 UTC Do N216 work units make trickles? If so, when? My present work unit is 18.96% complete, having run for 101 hours with 257 hours to complete (if you believe these numbers). Last checkpoint was at about 92 hours. It has not attempted to send any trickles yet. name hadam4h_a0pg_200811_4_842_011905372 Task 21760249 Workunit 11905372 ID: 61323 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61325 - Posted: 22 Oct 2019, 2:18:26 UTC After a bit of experimenting, I find that setting "use at most 50% of the processors" works best for me on both the i7-8700 and the i7-9700. That means six virtual cores on the i7-8700 and four full cores on the i7-9700 (they both have 12 MB L3 cache). The fast way to find how much work is being done is just to check the writes to disk; the more the writes, the more the work done. I use "iostat -m 7200" to measure the writes (in MB) over a two-hour (7200 second) period. If you have not used iostat before, you first run "sudo apt install sysstat" to install it. This implies that with its 32 MB of L3 cache, the Ryzen 3700x should run best on 8 cores, but I would check it to be sure. ID: 61325 · Reply Quote

geophi Volunteer moderator Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275	Message 61326 - Posted: 22 Oct 2019, 2:26:11 UTC - in response to Message 61323. Do N216 work units make trickles? If so, when? Yes. The 4 in the task name means a 4 month model so it will trickle after every month, at 25% Progress. Yep...a long time between checkpoints and a long time between trickles. This was brought up during testing but they kept this as with the other models, one checkpoint per model day and one trickle per model month. ID: 61326 · Reply Quote
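The trickle schedule described in this reply can be sketched in a few lines of Python. This is only an illustration: it assumes the model length is the fourth underscore-separated field of the task name (inferred from the single example in this thread) and that progress advances linearly with run time, which may be only roughly true. The function names are made up for this sketch.

```python
def model_months(task_name):
    """Return the model length in months from a task name.

    Assumption: the fourth underscore-separated field holds the month count,
    as in 'hadam4h_a0pg_200811_4_842_011905372' from this thread.
    """
    return int(task_name.split("_")[3])


def trickle_points(months):
    """Progress percentages at which a trickle is expected (one per model month)."""
    return [100.0 * m / months for m in range(1, months + 1)]


def hours_until_next_trickle(elapsed_hours, progress_pct, months):
    """Estimate hours until the next trickle, assuming progress is linear in run time."""
    hours_per_pct = elapsed_hours / progress_pct
    # Fall back to 100% if every trickle point has already been passed.
    next_point = next((p for p in trickle_points(months) if p > progress_pct), 100.0)
    return (next_point - progress_pct) * hours_per_pct


if __name__ == "__main__":
    months = model_months("hadam4h_a0pg_200811_4_842_011905372")
    print(trickle_points(months))
    # Figures from Message 61323: 18.96% complete after 101 hours of run time.
    print(round(hours_until_next_trickle(101.0, 18.96, months), 1))
```

With the figures from Message 61323, the linear estimate puts the first trickle roughly 32 hours further on, which is consistent with none having been sent yet.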

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61333 - Posted: 22 Oct 2019, 12:36:55 UTC - in response to Message 61326. Do N216 work units make trickles? If so, when? Yes. The 4 in the task name means a 4 month model so it will trickle after every month, at 25% Progress. Thank you. I was beginning to wonder if something was wrong. P.s.: I am not requesting any change. ID: 61333 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61337 - Posted: 22 Oct 2019, 14:32:38 UTC My i7-4790 running on four cores (50% of processors) is taking 12 days per work unit. But that reminds me that a simple way of estimating the proper number of cores to use is to look at the CPU % (as in BOINC tasks). When you are operating properly in the cache, it will be high, up around 99% or so. But if you are running too many work units, then it will take a hit down to 70% or so. There are more accurate ways of determining what is going on, but that is a quick estimate. ID: 61337 · Reply Quote
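The CPU % rule of thumb above can be written down as a small check. This is a minimal sketch: it takes the CPU time and elapsed time that BOINC reports for a task, and the 90% cut-off is an arbitrary assumption picked between the "around 99%" and "around 70%" figures quoted in the post, not a value from the project.

```python
def cpu_percent(cpu_seconds, elapsed_seconds):
    """CPU time as a percentage of wall-clock run time for one task."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 100.0 * cpu_seconds / elapsed_seconds


def looks_oversubscribed(cpu_pct, threshold=90.0):
    """Return True when the CPU % has dropped below the (assumed) threshold.

    A low value suggests too many work units are competing for the cache.
    """
    return cpu_pct < threshold


if __name__ == "__main__":
    # Hypothetical task: 6.9 hours of CPU time over 10 hours of run time.
    pct = cpu_percent(6.9 * 3600, 10 * 3600)
    print(round(pct, 1), looks_oversubscribed(pct))
```

As the post says, this is a quick estimate only; a low CPU % can have other causes, such as other programs competing for the processor.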

Les Bayliss Volunteer moderator Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 61365 - Posted: 24 Oct 2019, 5:31:37 UTC My 8 have just reached halfway, after just under 7 days. ID: 61365 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61369 - Posted: 24 Oct 2019, 12:37:36 UTC Last modified: 24 Oct 2019, 12:38:18 UTC I am having a hard time getting consistent readings using the write-to-disk method. Even with a two-hour monitoring period, I get large variations from 300 MB_wrtn to 4000 MB_wrtn on my Ryzen 3700X when running on 9 cores (and see about the same variation on 7 cores). So I will just go with 8 cores, with estimated completion times of about 15 days. Overall, based on estimated completion times, running on my i7-9700 with 4 cores (50%) does the best, with completion times about 7 days. I think the 32 GB of memory in it will be enough for the OpenIFS, which is convenient. ID: 61369 · Reply Quote
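One way to compare the machines mentioned so far is throughput, i.e. work units finished per day across all cores in use. The sketch below assumes one work unit per core in use and takes the approximate completion times quoted in this thread at face value; the function name is made up for the example.

```python
def tasks_per_day(concurrent_tasks, days_per_task):
    """Work units completed per day when `concurrent_tasks` run side by side."""
    return concurrent_tasks / days_per_task


if __name__ == "__main__":
    # (concurrent tasks, approximate days per task) as reported in the thread.
    hosts = {
        "i7-9700, 4 tasks": (4, 7.0),
        "Ryzen 3700X, 8 tasks": (8, 15.0),
        "i7-4790, 4 tasks": (4, 12.0),
    }
    ranked = sorted(hosts.items(), key=lambda kv: tasks_per_day(*kv[1]), reverse=True)
    for label, (count, days) in ranked:
        print(f"{label}: {tasks_per_day(count, days):.2f} tasks/day")
```

On these rough figures the i7-9700 on four cores comes out slightly ahead of the Ryzen on eight, which matches the conclusion in the post.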

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61373 - Posted: 24 Oct 2019, 16:25:08 UTC - in response to Message 61369. Do you use iostat? The first time it runs it gives the usage since the system was booted. Then the amounts in subsequent intervals. My machine runs boinc all the time it is up, but I booted it almost 5 days ago, so the totals are relatively small. All boincdata (including programs), and only boinc, are on /dev/sdd1. Boinc client has been running two hadcm3s, one hadam4, and one hadam4h the whole time. There are two other partitions on that drive, one of which can be busy if I choose to watch videos. Normally, one would use a greater interval than 60 seconds for this, and more than two outputs.
$ iostat -p sdd -k 60 2
Linux 2.6.32-754.23.1.el6.x86_64 (DellT7600.localdomain) 10/24/2019 _x86_64_ (4 CPU)
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            3.65   95.38    0.86    0.06    0.00    0.05
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       13.21      22.59     617.35   9491521  259414857
sdd1       0.01       1.34       0.00    564356        560
sdd2       0.00       0.01       0.00      2149         45
sdd3      13.20      21.24     617.34   8924764  259414252
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            3.52   93.55    1.51    1.37    0.00    0.05
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       93.32       0.33   10521.40        20     631284
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      93.32       0.33   10521.40        20     631284
ID: 61373 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61374 - Posted: 24 Oct 2019, 17:04:09 UTC - in response to Message 61373. Last modified: 24 Oct 2019, 17:05:26 UTC Do you use iostat? The first time it runs it gives the usage since the system was booted. Then the amounts in subsequent intervals. Yes, I was running "iostat -m 7200" (two hours), and disregarded the first one. The next two gave around 300 MB, and only the third gave a reasonable value of around 4000 MB_wrtn. So I have to conclude that the work units vary a lot in what they write over time. I would probably have to do a 24-hour interval to get a reliable measurement, and in that amount of time I can just calculate it from the run time and % completed. I don't need perfect accuracy, just enough to decide which machine to use, and how many cores. I think I have that, for this set of work anyway. ID: 61374 · Reply Quote

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61380 - Posted: 25 Oct 2019, 13:38:34 UTC - in response to Message 61374. I had to reboot my machine last night (problems with SELinux OS), and I then started running iostat hourly for 24 hours (not yet completed). Here are the results so far. /dev/sdd3 is the partition with all boinc, and only boinc, in it. (I said something else the other day, but that was a mistake). "I have to conclude that the work units vary a lot in what they write over time. I would probably have to do a 24-hour interval to get a reliable measurement, and in that amount of time I can just calculate it from the run time and % completed." I do not notice this. My machine is a 1.8 GHz 64-bit Xeon with 10240 KBytes of Cache. It is running four ClimatePrediction work units, two hadcm3s, one hadam4, and one hadam4h, and no other boinc work units lately.
$ iostat -p sdd -tk 3600 24
Linux 2.6.32-754.23.1.el6.x86_64 (DellT7600.localdomain) 10/25/2019 _x86_64_ (4 CPU)
10/25/2019 12:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            2.61   27.42    1.76    4.21    0.00   64.00
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       48.84    4791.36     431.42   2732513     246041
sdd1       1.28       4.71       0.01      2688          8
sdd2       1.05       3.73       0.02      2129          9
sdd3      46.41    4782.49     431.39   2727452     246024
10/25/2019 01:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            4.65   94.08    1.16    0.05    0.00    0.06
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       12.24       4.36     527.88     15684    1900384
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      12.24       4.36     527.88     15684    1900384
10/25/2019 02:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            1.39   98.16    0.39    0.04    0.00    0.02
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       13.39       0.17     616.22       608    2218392
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      13.39       0.17     616.22       608    2218392
10/25/2019 03:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            0.05   98.18    1.70    0.05    0.00    0.01
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       13.23      13.68     602.85     49260    2170276
sdd1       0.01       0.07       0.00       256          0
sdd2       0.00       0.00       0.00         0          0
sdd3      13.22      13.61     602.85     49004    2170276
10/25/2019 04:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            3.54   94.81    1.57    0.04    0.00    0.04
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       12.90       0.97     593.14      3488    2135312
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      12.90       0.97     593.14      3488    2135312
10/25/2019 05:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            3.99   94.81    1.10    0.06    0.00    0.04
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       13.83       4.71     706.16     16972    2542168
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      13.83       4.71     706.16     16972    2542168
10/25/2019 06:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            3.78   95.10    1.03    0.04    0.00    0.05
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       13.04       1.69     591.13      6080    2128076
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      13.04       1.69     591.13      6080    2128076
10/25/2019 07:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            2.97   96.05    0.88    0.06    0.00    0.03
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       12.90       0.13     595.69       476    2144480
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      12.90       0.13     595.69       476    2144480
10/25/2019 08:43:51 AM
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            2.85   96.25    0.82    0.04    0.00    0.04
Device:     tps  kB_read/s  kB_wrtn/s   kB_read    kB_wrtn
sdd       12.68       1.38     582.83      4984    2098184
sdd1       0.00       0.00       0.00         0          0
sdd2       0.00       0.00       0.00         0          0
sdd3      12.68       1.38     582.83      4984    2098184
ID: 61380 · Reply Quote
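To judge how much the hourly write totals really vary, the kB_wrtn column can be pulled out of saved iostat output and summarised. The following is a rough sketch, assuming the `iostat -k` column layout shown in this thread (other sysstat versions may print different columns); `interval_writes` and `SAMPLE` are names invented for the example, and the sample is an abridged copy of the sdd3 rows above.

```python
from statistics import mean

# Abridged from the hourly output posted above (sdd3 rows only).
SAMPLE = """\
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sdd3 46.41 4782.49 431.39 2727452 246024
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sdd3 12.24 4.36 527.88 15684 1900384
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sdd3 13.39 0.17 616.22 608 2218392
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sdd3 13.22 13.61 602.85 49004 2170276
"""


def interval_writes(iostat_text, device):
    """Return kB_wrtn for `device`, one value per report, skipping the first.

    The first report covers the time since boot, so it is dropped.
    The column position is taken from the 'Device' header line.
    """
    column = None
    values = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device") and "kB_wrtn" in fields:
            column = fields.index("kB_wrtn")
        elif fields[0] == device and column is not None:
            values.append(int(fields[column]))
    return values[1:]


if __name__ == "__main__":
    writes = interval_writes(SAMPLE, "sdd3")
    spread = (max(writes) - min(writes)) / mean(writes)
    print(writes, f"relative spread {spread:.2f}")
```

Applied to the full listing, the same function would give one figure per hour, which makes it easier to see whether a machine shows the steady writes reported here or the large swings seen on the Ryzen.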

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61381 - Posted: 25 Oct 2019, 15:09:34 UTC - in response to Message 61380. Last modified: 25 Oct 2019, 15:11:58 UTC I do not notice this. My machine is a 1.8 GHz 64-bit Xeon with 10240 KBytes of Cache. It is running four ClimatePrediction work units, two hadcm3s, one hadam4, and one hadam4h, and no other boinc work units lately. I am running only the hadam4h now (N216), and did not notice the variation on the other ones either, which are considerably smaller. You have a lot of cache per core also, which may reduce any variations. I think we need to check on each machine to see what works. ID: 61381 · Reply Quote

Dave Jackson Volunteer moderator Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944	Message 61382 - Posted: 25 Oct 2019, 16:15:04 UTC What the long time between checkpoints does mean for these tasks is that, on computers that get switched off several times a day, they will never finish, because if they have not reached the first checkpoint they will restart from the beginning. If your computer is one of these, please use suspend either to RAM or to disk instead of just switching off. ID: 61382 · Reply Quote

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1121 Credit: 17,202,915 RAC: 2,154	Message 61383 - Posted: 25 Oct 2019, 19:31:28 UTC - in response to Message 61382. Last modified: 25 Oct 2019, 19:34:21 UTC What the long time between checkpoints does mean for these tasks is that, on computers that get switched off several times a day, they will never finish, because if they have not reached the first checkpoint they will restart from the beginning. If your computer is one of these, please use suspend either to RAM or to disk instead of just switching off. I try to run my machine 24/7 for about a month at a time. Basically, I reboot whenever Red Hat sends me a new OS kernel. But once in a while I need to do it more often. I had not thought about the problem of restarting a work unit with such a long interval between checkpoints. My last checkpoint was 175:33:54 ago. I did not think about the effect of restarting a long time after a checkpoint, since my default interval is 600 seconds (not applicable here). But I did set no new tasks for any project, and suspended all the ClimatePrediction work units. I then shut down the boinc client before rebooting my machine. I did that just because of the problems with the UK Met Office HadAM4 at N144 resolution v8.08 i686-pc-linux-gnu work units. ID: 61383 · Reply Quote

Dave Jackson Volunteer moderator Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944	Message 61390 - Posted: 26 Oct 2019, 7:58:43 UTC Just checked: suspending computation and shutting BOINC down does not stop this. You do need to suspend the computer (to RAM or to disk) rather than do a hard reboot. ID: 61390 · Reply Quote

alanb1951 Send message Joined: 31 Aug 04 Posts: 38 Credit: 9,581,380 RAC: 3,853	Message 61392 - Posted: 26 Oct 2019, 9:44:51 UTC - in response to Message 61308. In reply to Jim1348 in re Ryzen 3700X I'm about to take delivery of a Ryzen 3700X (32MB L3 cache, though I gather access is constrained to 8MB per 2 cores (4 threads)); I'll be interested to see how that behaves as and when it gets some CPDN work to do (and will probably do some bulk tests with WCG MIP1 to get an idea if there's no CPDN work available!) Cheers - Al. [1] Someone over at WCG seemed to think 5MB cache was what a MIP1 job would like. The user offered no justification for that number but 4MB probably isn't enough for near-optimum performance. Thanks a lot for the cache info. I was beginning to think that the issues were deeper than I had found. I just happen to have a Ryzen 3700x, and was wondering what its large L3 cache would do here. But I would need to add more memory. So let us know, and I could do it. I have finally got the beast up and running on Ubuntu 18.04-3 (kernel 5.0.0-32-generic). It has 32GB of 3200MHz RAM, boots from an NVMe SSD, and I've put /var on HDD RAID 1 so that logs and checkpoint files aren't hammering the SSD. (/home is on RAID as well - all my non-laptop builds are done like that...) I haven't done any tuning apart from making sure that the memory clock and fabric clock are fixed at 3200 and 1600 respectively. It has taken until now to get a decent work-load built up; I'm currently running 12 WCG tasks (with a check to stop MIP1 from running more than two at a time) and 2 CPDN HadAM4h tasks at a time, along with one GPU task from SETI@Home, Einstein@Home or MilkyWay@Home, so the system is getting a fair work-out. It seems to have all clocks at about 3.95GHz, and the machine is drawing about 140W not counting the GPU. 
As regards checkpointing and completion times, after the first checkpoint (which seems to cover a few more time steps) it seems to checkpoint about every 60 minutes. I haven't had one generate a trickle yet but at the current rate of progress I expect a trickle at about 33 hours 20 minutes, and the tasks to finish in about 5 days 13 hours. I'm going to let the machine run with that sort of work load for a while to make sure it's behaving consistently, and I plan on doing some experiments with more HadAM4h tasks running at once when the current two have finished. It might be interesting to find out how many I can run at once without serious degradation if the only other work on the machine is WCG MCM1 (which is very cache-friendly!) I'll try to do some task-level performance stats at some point, but on AMD CPUs there's no direct way of getting a count of L3 cache misses (I think it counts them at the cache level rather than the CPU level...) so one key stat isn't available. Ah, well... Hope this was of interest - Al. ID: 61392 · Reply Quote

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61393 - Posted: 26 Oct 2019, 11:29:10 UTC - in response to Message 61392. Last modified: 26 Oct 2019, 11:48:51 UTC I'm currently running 12 WCG tasks (with a check to stop MIP1 from running more than two at a time) and 2 CPDN HadAM4h tasks at a time, along with one GPU task from SETI@Home, Einstein@Home or MilkyWay@Home, so the system is getting a fair work-out. I am finding that my Ryzen 3700x begins to fall off a cliff of sorts beyond three HadAM4h (N216). Above that, the write-rate gets erratic, and begins to fall off. So I will use an app_config.xml to limit it to three, and run WCG on the other cores. That is a bit surprising, with the 32 MB L3 cache, so there is some other limiting factor. EDIT: Of course, with the WCG also running, I may need to limit the HadAM4h even more, down to two. I will be monitoring it for a while. ID: 61393 · Reply Quote
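For anyone who has not written an app_config.xml before, the limit described above could be generated as follows. This is a sketch under assumptions: the application name "hadam4h" is a guess based on the task names in this thread and should be checked against what the client actually reports, and `build_app_config` is a helper invented for the example. The `app_config`, `app`, `name` and `max_concurrent` elements are the standard BOINC ones.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Return app_config.xml text limiting one application to `max_concurrent` tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "hadam4h" is an assumed application name, not confirmed in the thread.
    print(build_app_config("hadam4h", 3))
```

The resulting text would normally be saved as app_config.xml in the project's folder under the BOINC data directory and picked up when the client re-reads its config files; dropping the limit to two, as the EDIT suggests, only means changing the second argument.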

Jim1348 Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653	Message 61408 - Posted: 27 Oct 2019, 16:36:03 UTC Last modified: 27 Oct 2019, 17:17:34 UTC I just completed my first four on my i7-9700. They all went swimmingly, completing in a little over 7 days. I ended up on four cores, but they initially ran on eight. The next group of four will be the same, but the one after that may be a little faster. https://www.cpdn.org/results.php?hostid=1493890 But there will be more of a learning curve on this one for what works for each machine. I hope it holds true for OpenIFS too, or we have to start all over. ID: 61408 · Reply Quote