Message boards : Number crunching : New work discussion - 2
Author | Message |
---|---|
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,935 RAC: 21,606 |
"p.s. Still running after 4.5 hours, but fingers and toes still crossed for the next 9 days of run time, which is more like a couple of weeks of clock time."
Looks like my estimates, at any rate, are a bit pessimistic. Probably from running the EAS tasks, which deal with more data because they cover a larger and more complex area. |
Send message Joined: 2 Oct 06 Posts: 54 Credit: 27,309,613 RAC: 28,128 |
Deadlines for this new batch are still a year out. Not good. |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,734,370 RAC: 4,337 |
Well it survived last night's shutdown and this morning's restart. Onwards and upwards - ~6% done in ~8 hours. |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,935 RAC: 21,606 |
"Well it survived last night's shutdown and this morning's restart. Onwards and upwards - ~6% done in ~8 hours."
Mine have all survived, as have my five from the East Asia batch. The latter twice now. |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,935 RAC: 21,606 |
"Deadlines for this new batch are still a year out. Not good."
This has been raised numerous times with the project. I doubt moaning about it will make any difference. |
Send message Joined: 29 Oct 17 Posts: 1048 Credit: 16,431,665 RAC: 17,512 |
"Deadlines for this new batch are still a year out. Not good."
"This has been raised numerous times with the project. I doubt if moaning about it will make any difference."
They have more important things right now. It's also something of a non-issue, because once enough results are in, CPDN usually close the batch, stopping any more resends. But I'll look in the repository and see if I can change it. --- CPDN Visiting Scientist |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"They have more important things right now. It's also something of a non-issue because once enough results are in CPDN usually close the batch stopping any more resends. But I'll look in the repository and see if I can change it."
Thanks. Pity you can't send out cancels for already-running tasks when they have enough data, but that's probably impossible, because of some insane BOINC policy about not upsetting folk who'd somehow prefer to crunch something pointless just because they started it. Are these new ones supposed to be so small? I'm running them about 5 times faster. |
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
What will RAM consumption be for one multicore OpenIFS task compared to multiple work units of single-core OpenIFS tasks? Also, how effective is hyperthreading? |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,935 RAC: 21,606 |
"Also how effective is hyperthreading?"
There is a thread somewhere where Glenn has answered that one. I will see if I can find it later. |
Send message Joined: 29 Oct 17 Posts: 1048 Credit: 16,431,665 RAC: 17,512 |
"Also how effective is hyperthreading?"
"There is a thread somewhere where Glenn has answered that one. I will see if I can find it later."
Hyperthreading does not work well for heavily numerical codes like weather models. You get contention on the chip where threads try to access the floating-point units, of which there is only one set per core. There are also diminishing returns on access to memory (large caches like the AMD X3D parts have little impact). If you want the best throughput (# tasks completed per day), keep the task count the same as the number of physical cores. If you want the fastest runtime, run only 1 task and keep the machine as quiet as possible. The post that Dave refers to includes a graph where I tested running OpenIFS on different numbers of cores. You can find the thread here: https://www.cpdn.org/forum_thread.php?id=9184#68081. The results apply to the WAH and other Met Office models too. It's raw single-core speed and memory bandwidth which get the best out of CPDN models. Edit: Re: RAM consumption. Memory only marginally increases with multicore (say ~5-10%), but runtime decreases in line with increasing cores (i.e. half the runtime with 2 cores) (and we don't have any multicore models in production yet). --- CPDN Visiting Scientist |
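One way to act on that advice, as a sketch: cap the BOINC client at the physical core count so hyperthread siblings stay idle. This uses the standard cc_config.xml <ncpus> option; the value 8 is an example and should be replaced with your own physical core count:

```xml
<!-- cc_config.xml, placed in the BOINC data directory.
     <ncpus> overrides the detected CPU count; setting it to the number
     of physical cores (8 here, as an example) stops two tasks from
     sharing one core via hyperthreading. -->
<cc_config>
    <options>
        <ncpus>8</ncpus>
    </options>
</cc_config>
```

The same effect is available without a config file via the Manager: Options → Computing preferences → "Use at most 50% of the CPUs" on a 2-way hyperthreaded machine.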
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,935 RAC: 21,606 |
"Are these new ones supposed to be so small? I'm running them about 5 times faster."
I am not seeing anything like a five-times increase in speed, but I remember you saying mine were running faster than yours when they should have been going at a comparable speed. However, as they cover a smaller and less complex area, I would expect them to run faster. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"I am not seeing anything like a five times increase in speed but I remember you saying mine were running faster than yours when they should have been going at a comparable speed. However as they are covering a smaller and less complex area I would expect them to run faster."
I'm seeing 16% done in 18 hours on a Ryzen 9 3900X (my good one, with fast dual-channel RAM) - predicted time for one task is just under 5 days. It's only doing 2 (alongside other projects which fill all 24 threads), and I've told BOINC to allocate 2 threads to each of the CPDN tasks. I guess the faster RAM accounts for x3, and doing 2 instead of 12 will help as well. OK, I've no idea how much of the speed increase is down to the tasks! How much smaller is the area? And is the resolution the same? |
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
What if you use
<app_config>
  <app>
    <name>wah2</name>
    <max_concurrent>2</max_concurrent>
    <fraction_done_exact/>
  </app>
</app_config> |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"What if you use …"
I prefer to run one per core for the most throughput, not the fastest per task, so I've told it to use two threads per task, thus (in app_config.xml):
<app_config>
  <app_version>
    <app_name>wah2</app_name>
    <plan_class></plan_class>
    <cmdline></cmdline>
    <avg_ncpus>2.000000</avg_ncpus>
    <ngpus>0.000000</ngpus>
  </app_version>
</app_config> |
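For reference, the two suggestions in this exchange can live in a single file. A sketch of a combined app_config.xml, assuming the application name really is wah2 on your client (check client_state.xml for the exact name):

```xml
<!-- app_config.xml, placed in the CPDN project directory.
     The app name "wah2" is an assumption - verify it against
     the <app_name> entries in client_state.xml. -->
<app_config>
    <app>
        <name>wah2</name>
        <!-- run at most 2 CPDN tasks at once -->
        <max_concurrent>2</max_concurrent>
        <!-- trust the app's own reported progress for time estimates -->
        <fraction_done_exact/>
    </app>
    <app_version>
        <app_name>wah2</app_name>
        <plan_class></plan_class>
        <!-- budget 2 CPU threads per task in the client scheduler -->
        <avg_ncpus>2.0</avg_ncpus>
    </app_version>
</app_config>
```

Note that avg_ncpus only changes how many threads the BOINC scheduler reserves for the task; a single-threaded app still runs on one thread and simply leaves the other reserved thread idle. Reload with Options → Read config files, or restart the client.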
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
To be sure that other projects won't take cores for themselves? |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
Just come out of a 7-hour power cut - a sharp cliff-edge drop, no flickering or brownouts. Both climate models I had running - on different machines - restarted unscathed. One was approaching a trickle+upload, and those went through cleanly as well. Mind you, I had plenty of time to prepare for an orderly restart - I turned the router and computers off, and started them one at a time in a sensible sequence, so each had access to whatever services it needed, notably DHCP, as it went live. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"To be sure that other projects won't take cores for themselves?"
Seems to work without doing that. Not sure exactly how Windows allocates things, but the other projects are happy with HT, so I was assuming they took a thread each and CPDN took a core each. I guess even if Windows isn't too bright (a fair assumption!), with a small number of CPDN tasks the chances of them landing on the same core are minimal. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Just come out of a 7-hour powercut - sharp cliff-edge drop, no flickering or brownouts."
No UPS? Dear me!
"Both climate models I had running - on different machines - restarted unscathed. One was approaching a trickle+upload: and those went through cleanly as well. Mind you, I had plenty of time to prepare for an orderly restart - turned router and computers off, and started them one at a time in a sensible sequence, so each had access to whatever services they needed, notably DHCP, as they went live."
Why would you need to do that? The computers should remember their last IP address, and if they need the internet but it's not ready yet, they'll retry. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"Just come out of a 7-hour powercut - sharp cliff-edge drop, no flickering or brownouts."
I have a UPS that is good for about 13 minutes right now with 12 BOINC tasks running and the monitor on. But I have a natural-gas backup generator that comes on within about 10 or 12 seconds and will run for as long as the gas company does its duty. The longest power interruption I have experienced here was about 6 1/2 days, related to tropical storm Sandy. Most of my interruptions are much shorter - like one second. I did get about a 2 1/2 hour interruption around Christmas that the power company had to fix, but it did not mess up my computer. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"I have a UPS that is good for about 13 minutes right now with 12 Boinc tasks running and the monitor on. But I have a natural gas operated backup generator that comes on within about 10 or 12 seconds and will run as long as the gas company does its duty. The longest power interruption I have experienced here was about 6 1/2 days related to tropical storm Sandy."
I think I've had a single 1-hour cut in 23 years here, but I used to get a lot of 1-second dips which could crash a computer. When I got a corrupted C drive which would no longer boot, I got a UPS. I was then surprised to find it was constantly adjusting the voltage, as the supply kept getting too high. It was outside the legal spec, and I reported it, but they said that if they lowered my voltage (I'm next to the transformer), the far end of the street would be too low. It seems to have stopped now they've replaced the transformer. Not sure if that was because of complaints, new housing being built, it getting old (it didn't look it), damage from a short, or the solar panels going up everywhere. |
©2024 cpdn.org