Message boards : Number crunching : New Work Announcements 2024
Author | Message |
---|---|
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
Quote: "The aim is to see how much variation we get in running multiple identical forecasts across all the Linux machines attached to CPDN, and whether we get the same result from the exact same forecast on each host (which is not a given)." Quote: "Interesting. There... shouldn't be any variation in results for the same code on the same host with the same initial conditions. If so, look for uninitialized memory reads somewhere, I guess? I know floating point is messy, but it should at least be consistently messy." There shouldn't be, but it does happen. I once saw a bug in the Intel maths library which caused differences. I forget the details now as it was some time ago, but I vaguely remember it was related to the way it handled memory: if vector lengths didn't fit entirely into cache, numbers were summed in different orders. Anyway, it's worth checking. --- CPDN Visiting Scientist |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Do we need to set up some parameters on Linux boxes, to avoid downloading and running too many OIFS at the same time? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,016,442 RAC: 21,024 |
Quote: "Do we need to set up some parameters on Linux boxes, to avoid downloading and running too many OIFS at the same time?" Certainly not on any machine with a reasonable amount of memory, as there will be a limit of either one or two from the server. In theory, a machine with just 16GB could have problems if it gets two at once while also running other tasks (that is, if the project limit is 2 rather than 1), but the majority of machines should be fine. The major problems as always will come not from those who read the noticeboards but from the set and forget brigade. |
Send message Joined: 14 Sep 08 Posts: 127 Credit: 41,744,071 RAC: 63,130 |
Quote: "The major problems as always will come not from those who read the noticeboards but from the set and forget brigade." Will there be a preference setting that one can override for people that actively monitor the output and have bigger machines? |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
Quote: "The major problems as always will come not from those who read the noticeboards but from the set and forget brigade." What kind of preference setting were you thinking of? --- CPDN Visiting Scientist |
Send message Joined: 5 Aug 04 Posts: 178 Credit: 18,746,186 RAC: 44,617 |
Quote: "There shouldn't be, but it does happen. I once saw a bug in the Intel maths library which caused differences. I forget the details now as it was some time ago, but I vaguely remember it was related to the way it handled memory: if vector lengths didn't fit entirely into cache, numbers were summed in different orders. Anyway, it's worth checking." Yes. If you want more information about this, I remember it was a major issue for the people at LHC@Home with the Sixtrack application, especially Ben Segal. Perhaps you could discuss this topic with them. I also guess this is the reason why they run all their other projects only as Linux-native applications or in Linux VMs. Supporting BOINC, a great concept ! |
Send message Joined: 14 Sep 08 Posts: 127 Credit: 41,744,071 RAC: 63,130 |
Quote: "What kind of preference setting were you thinking of?" I read "there will be a limit of either one or two from the server" as meaning that even if someone's CPDN project preference sets max# of jobs to "no limit", they will still be limited to 1 or 2 per host for OpenIFS tasks. So I wonder how one can get more tasks for a host that has a lot of memory, without resorting to multiple clients or VMs. If I have remembered wrongly, and the default max# of jobs in the preferences is 1 or 2 and you will continue to honor that setting when it's set to "no limit", then the setting I was asking for already exists. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Quote: "What kind of preference setting were you thinking of?" Quote: "I read 'there will be a limit of either one or two from the server' as meaning that even if someone's CPDN project preference sets max# of jobs to 'no limit', they will still be limited to 1 or 2 per host for OpenIFS tasks. So I wonder how one can get more tasks for a host that has a lot of memory, without resorting to multiple clients or VMs." I second this. I have machines with up to 128GB RAM. Limiting those to the same number of tasks as 16GB machines is illogical. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Quote: "I second this. I have machines with up to 128GB RAM. Limiting those to the same number of tasks as 16GB machines is illogical." Me too. My Windows 10 machine has about 16 GBytes of RAM (total), but my Linux machine has 128 GBytes of RAM. The Linux box has a 16-core processor and I am letting up to 13 BOINC tasks run at a time. In warm weather I first cut it down to 12 BOINC tasks, and when it is really too hot I cut it down to 8. I run CPDN, WCG, DENIS, Rosetta, Einstein, Universe in order of decreasing priority. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Quote: "The Linux box has a 16-core processor and I am letting up to 13 BOINC tasks run at a time. In warm weather I first cut it down to 12 BOINC tasks, and when it is really too hot I cut it down to 8. I run CPDN, WCG, DENIS, Rosetta, Einstein, Universe in order of decreasing priority." Apologies for going off track here, but there is never a reason for a CPU to be too hot. Improve the cooling system. 17 W/mK heatsink paste, bigger cooler, faster fan, etc. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Quote: "Apologies for going off track here, but there is never a reason for a CPU to be too hot. Improve the cooling system. 17 W/mK heatsink paste, bigger cooler, faster fan, etc." My fans increase in speed as the temperature of the box, processor heat sink, etc., increases. But they do not speed up fast enough, so I have diddled the BIOS to run the fans faster. As I have them set up now, they make so much noise that I can't stand to run them any faster. There is no room in the box for a bigger processor heat sink. This is how my system is running at the moment. Ambient air temperature is 74F. $ sensors coretemp-isa-0000 Adapter: ISA adapter Package id 0: +76.0°C (high = +88.0°C, crit = +98.0°C) Core 8: +69.0°C (high = +88.0°C, crit = +98.0°C) Core 2: +67.0°C (high = +88.0°C, crit = +98.0°C) Core 3: +71.0°C (high = +88.0°C, crit = +98.0°C) Core 5: +65.0°C (high = +88.0°C, crit = +98.0°C) Core 1: +67.0°C (high = +88.0°C, crit = +98.0°C) Core 9: +70.0°C (high = +88.0°C, crit = +98.0°C) Core 11: +76.0°C (high = +88.0°C, crit = +98.0°C) Core 12: +65.0°C (high = +88.0°C, crit = +98.0°C) amdgpu-pci-6500 Adapter: PCI adapter vddgfx: +0.96 V fan1: 2086 RPM (min = 1800 RPM, max = 6000 RPM) edge: +45.0°C (crit = +97.0°C, hyst = -273.1°C) PPT: 10.04 W (cap = 25.00 W) dell_smm-virtual-0 Adapter: Virtual device fan1: 4325 RPM fan2: 1373 RPM fan3: 3496 RPM |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,016,442 RAC: 21,024 |
Quote: "I second this. I have machines with up to 128GB RAM. Limiting those to the same number of tasks as 16GB machines is illogical." Though it is not illogical for the forthcoming batch, which is looking at variance between machines. For this batch I assume the aim is to get as many different machines involved as possible. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Quote: "This is how my system is running at the moment." Most CPUs are fine up to 95C, and will auto-throttle at that temperature. I have an old Xeon server where one of the CPUs hangs around 95C. Some days it will throttle a little, but I don't care. The CPU stops itself getting damaged. You don't need to manually adjust BOINC. Crank the fans as high as you can tolerate for noise, then let it work as hard as it can - for example you could set the fans in the BIOS never to exceed 70% speed. The heatsink paste can make a serious difference - I've made graphics cards 20C cooler, and that takes up no space. You could also get a larger case; I always use full towers. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
Quote: "What kind of preference setting were you thinking of?" We need to change the default on that project setting of 'default max # jobs' from 'no limit' to 1 or 2. Otherwise we're back where we started. Thanks for mentioning that. If I remember right, the server is (at the moment) still set to only deliver 1-2 WUs per host for OIFS. I will check. We had lots of problems with OIFS on lower memory machines, with complaints and volunteers dropping out of the project, which we don't want. There are nearly 1000 Linux volunteer hosts, most of whose owners don't read the forums, so we have to come up with a configuration that works for the majority, learn, then start catering for the people with the bigger machines. I think this is the right approach, particularly as we have not yet rolled out the multicore, higher-resolution configurations, which will take upwards of 20 GB of RAM. I want to see how the community responds to these tasks first. --- CPDN Visiting Scientist |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Has the default always been no limit, or is mine on no limit only because I chose that? So this would make every single person go down to 1-2 cores. Fair enough, but there will be a lot of people with powerful machines who don't read here. Perhaps a notification which appears in BOINC to say "put this up again if you have x GB RAM"? Even better would be the server intelligently limiting tasks per GB. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,016,442 RAC: 21,024 |
Quote: "Has the default always been no limit, or is mine on no limit only because I chose that? So this would make every single person go down to 1-2 cores. Fair enough, but there will be a lot of people with powerful machines who don't read here. Perhaps a notification which appears in BOINC to say 'put this up again if you have x GB RAM'?" The default has been no limit except, I think, for one recent batch of high memory demand OIFS tasks. I agree that setting the limit according to the memory available would be a good idea, but someone would have to request that feature over at GitHub. (I haven't checked to see if such a request has been made and, if so, what the response was.) |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Quote: "The default has been no limit except, I think, for one recent batch of high memory demand OIFS tasks." As long as it's user-changeable. At the moment there is only one setting, and mine is on no limit. There would need to be a setting for each type, as there's no point in limiting other types of task like WAH2. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,016,442 RAC: 21,024 |
Quote: "As long as it's user-changeable. At the moment there is only one setting, and mine is on no limit. There would need to be a setting for each type, as there's no point in limiting other types of task like WAH2." It isn't user-changeable via the website in the way that WCG's is. This is purely server side, and the limitation will only apply to the tasks that demand a lot of memory. And presumably there will be a limit of one for the batch that checks whether all hosts provide the same results. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
This needs to be a new thread if we are going to discuss user/project preferences at length. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
I think Glenn ought to have moderator privileges to move things around. |
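
The point made earlier in the thread, that summing the same numbers in a different order can change a floating-point result, can be illustrated with a minimal Python sketch. This is not OpenIFS code and not the Intel library behaviour described above; the function name `sum_in_order` and the sample values are made up purely for illustration. It only shows that IEEE 754 double-precision addition is not associative, so a library that changes its summation order (for example depending on vector length or cache size) may legitimately produce different bits on different hosts.

```python
import math


def sum_in_order(values):
    """Add the values strictly left to right, as a simple loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two different groupings.
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c   # rounds after a + b, then again after adding c
right_to_left = a + (b + c)   # rounds after b + c, then again after adding a

# Same three numbers, two different orders, with a large cancellation.
big_first = sum_in_order([1e16, 1.0, -1e16])    # the 1.0 is lost when added to 1e16
cancel_first = sum_in_order([1e16, -1e16, 1.0]) # the large terms cancel before 1.0 is added

# math.fsum tracks the rounding error, so it is independent of the order.
exact_a = math.fsum([1e16, 1.0, -1e16])
exact_b = math.fsum([1e16, -1e16, 1.0])

if __name__ == "__main__":
    print("grouping changes result:", left_to_right != right_to_left)
    print("big_first =", big_first, " cancel_first =", cancel_first)
    print("fsum results:", exact_a, exact_b)
```

Differences of this kind start in the last bits, but in a chaotic system such as a weather model they can grow over a long forecast, which is one plausible reason identical forecasts might not give identical results on every host.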