UK Met Office HadAM4 at N216 resolution
Author | Message |
---|---|
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
On my i7-4790 I run only 4 N216s. They checkpoint every 38-40 minutes at 30-31 sec/TS (12.5 days to complete). Some WUs reached 39 sec/TS, but at that time I was running 6 or 8 cores with WCG alongside. So for the moment I do not go over 4 real cores. From what I read in the other thread, even on a Ryzen 3600 (6C, 12T) going beyond 4-5 WUs decreases performance a lot. Completion time seems faster though. On my other machine, with an i7-3520M, I run one N216 and one WCG task. The N216 ran at 24 sec/TS and completed in 10 days. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Yay, I got one from batch 843. The task had timed out after a year with no response. I wonder if it is of any use except for upping my points. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There's been quite a few fails, and several hundred still running, (possibly not for the first time), so if you put your foot down and go for it, you're in with a chance. :) |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
"There's been quite a few fails, and several hundred still running, (possibly not for the first time), so if you put your foot down and go for it, you're in with a chance. :)" Good then, I will let it run. Thanks. |
Joined: 15 May 09 Posts: 4540 Credit: 19,026,382 RAC: 20,431 |
And I have just picked up one from #843 as well. (On its fifth and final attempt.) |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
"And I have just picked up one from #843 as well. (On its fifth and final attempt.)" I also got another one, but from #842. It is on its second attempt after a whole year with no response. I still think deadlines should be shortened. |
Joined: 22 Feb 06 Posts: 491 Credit: 31,000,748 RAC: 14,638 |
I got 1 from 843 as well after it had been dormant for a year, but it failed almost immediately with a REPLANCA error :(( |
Joined: 15 May 09 Posts: 4540 Credit: 19,026,382 RAC: 20,431 |
I have now got 8 retreads from 843 and 842. Have set to no new tasks now as running more than the number of real cores slows things down too much. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
I also got 5 of the new ones. So with 6 N216s my /var climbed to ~16 GB. With 4 WCG ARP tasks in the queue I almost ran out of space on /var (~20 GB) and BOINC Manager crashed. I needed to clean out some journals. Luckily no CPDN models crashed because of the low disk space. Reducing work to the number of real cores and clearing out the ARPs should get things back to normal. |
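
For anyone wanting to check disk headroom before fetching more N216 work, here is a rough Python sketch. It assumes an average footprint of about 16 GB / 6 tasks, which is only inferred from the figures in the post above and is not a number published by the project. The function names, the 2 GB reserve and the choice of `/var` are illustrative assumptions too; point it at wherever your BOINC data directory actually lives.

```python
import os
import shutil

# Rough per-task footprint inferred from the thread: ~16 GB for 6 N216 tasks.
# Illustrative average only; GB and GiB are treated loosely here.
GIB_PER_N216_TASK = 16 / 6


def free_gib(path: str) -> float:
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def max_tasks(free_space_gib: float,
              per_task_gib: float = GIB_PER_N216_TASK,
              reserve_gib: float = 2.0) -> int:
    """Estimate how many more tasks fit while keeping `reserve_gib` free."""
    usable = max(free_space_gib - reserve_gib, 0.0)
    return int(usable // per_task_gib)


if __name__ == "__main__":
    # Fall back to the current directory if /var does not exist on this host.
    path = "/var" if os.path.isdir("/var") else "."
    free = free_gib(path)
    print(f"{path}: {free:.1f} GiB free, room for about {max_tasks(free)} more N216 tasks")
```

The reserve is there because other things (system journals, WCG ARP tasks) share the same filesystem, which is what caused the near miss described above.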
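
The sec/TS figures quoted earlier in the thread can be turned into run-time estimates with simple arithmetic. The Python sketch below back-calculates the number of timesteps implied by the two reported machines (30-31 sec/TS for 12.5 days, and 24 sec/TS for 10 days) and then estimates the run time at 39 sec/TS. The total of roughly 36,000 timesteps is inferred from those posts and is an assumption, not an official figure, and the estimate assumes the task runs around the clock. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def implied_timesteps(sec_per_ts: float, days: float) -> float:
    """Timesteps implied by a reported speed and a reported completion time."""
    return days * SECONDS_PER_DAY / sec_per_ts


def days_to_complete(sec_per_ts: float, timesteps: float) -> float:
    """Wall-clock days needed for `timesteps` at `sec_per_ts`, running 24/7."""
    return sec_per_ts * timesteps / SECONDS_PER_DAY


if __name__ == "__main__":
    # (sec/TS, days) as reported in the thread; 30.5 is the midpoint of 30-31.
    reports = {"i7-4790": (30.5, 12.5), "i7-3520M": (24.0, 10.0)}
    for host, (sec_per_ts, days) in reports.items():
        print(f"{host}: about {implied_timesteps(sec_per_ts, days):,.0f} timesteps")

    assumed_timesteps = 36_000  # assumption inferred from the reports above
    for speed in (24.0, 30.5, 39.0):
        print(f"{speed} sec/TS: {days_to_complete(speed, assumed_timesteps):.1f} days")
```

Both reports land close to the same timestep count, which suggests the quoted speeds and completion times are consistent with each other.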