Message boards : Number crunching : Feedback on running OpenIFS large memory (16-25 Gb+) configurations requested
Author | Message |
---|---|
Send message Joined: 29 Oct 17 Posts: 1067 Credit: 17,020,946 RAC: 5,160 |
Thank you to all who contributed to the testing. I reported the feedback from everyone on this thread to the CPDN Technical meeting yesterday. It was well received, and our thanks go to everyone. --- CPDN Visiting Scientist |
Send message Joined: 29 Oct 17 Posts: 1067 Credit: 17,020,946 RAC: 5,160 |
Regarding the GLIBC version issue highlighted in this thread, we've decided to build the next versions of the Linux applications (OpenIFS, HadAM4, HadSM4) on Ubuntu 18.04 LTS, as it remains in long-term support for years yet. Ubuntu 18.04 uses GLIBC version 2.27. This only affects new Linux applications, not the current ones. --- CPDN Visiting Scientist |
Send message Joined: 1 Jan 07 Posts: 1066 Credit: 36,887,369 RAC: 1,533 |
"Regarding the GLIBC version issue highlighted in this thread, we've decided to build the next versions of the Linux applications (OpenIFS, HadAM4, HadSM4) on Ubuntu 18.04 LTS as it's in long term support for years yet. Ubuntu 18.04 uses GLIBC version 2.27. This only affects new Linux applications and not the current ones."
The exact error messages from my Linux Mint 20 machine were:
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ./oifs_43r3_omp_model.exe)
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./oifs_43r3_omp_model.exe)
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by ./oifs_43r3_omp_model.exe)
That sounds as if 18.04 LTS should be good enough - but I could run a revised test if you want confirmation. |
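The error lines quoted above name the GLIBC symbol versions the binary asks for. As a rough sketch (not a CPDN or BOINC tool), the snippet below pulls those versions out of the messages and applies a simple rule of thumb: a binary built against GLIBC X should load on a host whose GLIBC is X or newer. The helper names are made up for illustration, and the host version 2.31 is an assumed example value, not something reported in the thread.

```python
import platform
import re

# Error text as posted in the thread (Linux Mint 20 machine).
ERRORS = """\
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ./oifs_43r3_omp_model.exe)
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./oifs_43r3_omp_model.exe)
./oifs_43r3_omp_model.exe: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by ./oifs_43r3_omp_model.exe)
"""


def parse_version(text):
    """Turn '2.27' into (2, 27) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def missing_glibc_versions(log):
    """Return the sorted, de-duplicated GLIBC versions named in the log."""
    found = re.findall(r"GLIBC_(\d+(?:\.\d+)+)", log)
    return sorted({parse_version(v) for v in found})


def host_can_run(built_against, host):
    """Rule of thumb: host GLIBC must be at least the build GLIBC."""
    return parse_version(host) >= parse_version(built_against)


missing = missing_glibc_versions(ERRORS)
print("missing symbol versions:", missing)
print("newest version the current binary needs:", max(missing))

# Assumed example host version; replace with the real one.
print("2.27 build on a 2.31 host:", host_can_run("2.27", "2.31"))

# On a Linux host with Python 3, this reports the local C library.
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print("this host:", version, "-> 2.27 build OK:", host_can_run("2.27", version))
```

The tuple comparison matters: comparing the strings "2.9" and "2.27" directly would order them wrongly.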
Send message Joined: 29 Nov 17 Posts: 83 Credit: 17,184,625 RAC: 13,161 |
In reply to Glenn Carver's message of 15 Oct 2024: "I'm curious if you have an estimate of how many hosts would be eligible." "Yes, we checked the database. There are ~600 Linux hosts with 32+ GB RAM. Enough to make it workable."
You are doubling up to 64+ GB now? |
Send message Joined: 27 Jan 07 Posts: 301 Credit: 3,288,263 RAC: 26,370 |
I'd be willing to try these models, under the following conditions:
* CPDN server will respect BOINC compute limits (preferences) per computer and not give work to machines with insufficient resources.
* Models will respect BOINC compute limits (max memory, CPU count, disk) and will not start if insufficient resources are available (i.e., in the event these limits change since initial download).
* Checkpoint files are compressed (best effort).
* OpenIFS models are 'opt-in' via project preferences.
* Sticky forum thread (can be read-only) for OpenIFS system requirements and warnings.
FYI... I only have 32GB RAM and 6 cores (12 threads)... so would machines like mine be able to run them well enough? |
Send message Joined: 29 Oct 17 Posts: 1067 Credit: 17,020,946 RAC: 5,160 |
In reply to DJStarfox's message of 23 Jan 2025:
"I'd be willing to try these models, under the following conditions:"
This happens normally for all tasks. It's part of BOINC.
"* Models will respect BOINC compute limits (max memory, CPU count, disk) and will not start if insufficient resource available. (i.e., in the event these limits change since initial download)."
We found and reported a bug in the BOINC client code in the way it treated a task's requested memory. It's now been fixed by David Anderson, but we do need everyone to update their client version when this fixed version is released. In the meantime, we'll only configure 1 task in progress per user.
"* Checkpoint files are compressed (best effort)."
Checkpoint files don't compress very well. They also take a long time to compress. I am still weighing up the pros & cons of adding compression.
"* OpenIFS models are 'opt-in' via project preferences"
They will be.
"* Sticky forum thread (can be read-only) for OpenIFS system requirements and warnings."
Good idea. The project preferences page will also contain more info on the tasks.
"FYI... I only have 32GB RAM and 6 cores (12 threads)... so would machines like mine be able to run them well enough?"
32 GB would be enough if you are not using the machine for anything else that requires a lot of RAM. --- CPDN Visiting Scientist |
Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944 |
"but we do need everyone to update their client version when this fixed version is released. In the meantime, we'll only configure 1 task in progress per user."
Presumably it is fixed in 8.1.0, which can be installed following the instructions on the BOINC download page. |
Send message Joined: 1 Jan 07 Posts: 1066 Credit: 36,887,369 RAC: 1,533 |
"Presumably it is fixed in 8.1.0 which can be installed following the instruction on the BOINC download page."
v8.1.0 (odd version number) is very much 'work in progress', and is constantly changing. It can't be downloaded in a 'ready to run' form: it has to be compiled by the user from source code. Some users may be equipped to handle that process, but I don't think it can be recommended for the vast majority of our users.
Instead, there's a version 8.0.4 available on the 'all versions' download page (https://boinc.berkeley.edu/download_all.php), though unfortunately not for Linux, and the instructions for building your own copy have gone AWOL from https://boinc.berkeley.edu/wiki/BuildSystem
I think we probably need to engage with BOINC about getting a usable version of BOINC, and the related documentation, available for the general Linux user. But just at the moment, the key people seem to be tying themselves in knots over an incompatibility between BOINC and VirtualBox on Apple machines. |
Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944 |
"It can't be downloaded in a 'ready to run' form: it has to be compiled by the user from source code"
On the page with the download instructions there is now an option to choose the 8.1.0 nightly build rather than faff about installing dependencies to compile the code yourself. Instructions for 8.0.4 are also there via the dropdown menu. |
Send message Joined: 1 Jan 07 Posts: 1066 Credit: 36,887,369 RAC: 1,533 |
Yes - I was posting from a Windows machine, and checked it from Linux later. I sometimes do that when doing housekeeping - work on the Linux machine, while referring to the instructions on a Windows machine and separate screen beside it. And I haven't found a way of making that work in the current state of the BOINC documentation. Linux needs to be listed on the 'download all' page, which is otherwise cross-platform. |
Send message Joined: 29 Oct 17 Posts: 1067 Credit: 17,020,946 RAC: 5,160 |
CPDN can't recommend users download and compile the latest nightly build though, not as a general 'all-users' statement. We'll have to wait. --- CPDN Visiting Scientist |
Send message Joined: 14 Sep 08 Posts: 130 Credit: 44,254,664 RAC: 9,487 |
IMO, if we are going to ask users to do anything in addition to the opt-in, we should just provide an `app_config.xml` template and teach them how to calculate max concurrent tasks on their host. That's how many of us crunched OpenIFS tasks before and we know it's manageable.
No project (or even BOINC client team) could reasonably be expected to sort out how to help users update all the dozens of common distros with ad hoc packages. Installing a self-compiled application or third-party packages while an older version exists in the distro repository can have subtle implications for dependencies and future upgrades. We should treat user systems as production systems, and it's totally fair for them to only use packages from distro repos. This means that even if 8.1.0 were released today, distros like RHEL or its derivatives would probably not see it for 2 years. We can't bank on the version upgrade any time soon.
We don't have to make the most optimal choices, since it's more about whether we can enable research that is otherwise impossible or too slow. I feel a lot of the discussion here can simply wait until the workload arrives and we collect actual data on success rate and throughput. We can start with the most conservative approach and expand eligible hosts if the error rate is reasonable but throughput is not enough.
1. Release to 48 or 64GB+ hosts, one task per client. Observe the error rate over a few days. This is likely safe.
2. Release to 32GB hosts, one task per client. Observe the error rate for a few days. This might be a bit risky.
3. Relax the one task per host constraint to better utilize bigger hosts. Ideally a separate option in project preferences in addition to the opt-in. This assumes users either have app_config.xml properly configured on all hosts, or are running a new enough BOINC version. This is considerably riskier, but the throughput increase from large memory hosts can be worth the risk.
If the CPDN server side can configure a plan class to relax the one task per host constraint only for newer clients, it could remove the risk at the cost of more complexity. |
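For the `app_config.xml` suggestion, here is a rough sketch of how a user might work out a max concurrent value from the host's memory and write the file. The 25 GB per-task peak, the 4 GB kept back for the rest of the system, and the app name `oifs_43r3_omp` are all assumptions for illustration; the real app name must match the one in the client's `client_state.xml`, and the real memory figure should come from the project.

```python
import math


def max_concurrent_tasks(total_ram_gb, per_task_gb, reserve_gb=4.0):
    """How many tasks fit in RAM after keeping reserve_gb for everything else."""
    usable = total_ram_gb - reserve_gb
    if usable <= 0:
        return 0
    return math.floor(usable / per_task_gb)


def app_config_xml(app_name, limit):
    """Build an app_config.xml body limiting one app to `limit` running tasks."""
    if limit < 1:
        # Not even one task fits: do not generate a config for this host.
        raise ValueError("host lacks the memory to run a single task")
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        f"    <max_concurrent>{limit}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# Assumed figures: 25 GB peak per task, 4 GB reserved for the OS and other work.
for ram in (32, 64, 128):
    print(ram, "GB host ->", max_concurrent_tasks(ram, 25), "task(s)")

# Hypothetical app name, shown for a 64 GB host.
print(app_config_xml("oifs_43r3_omp", max_concurrent_tasks(64, 25)))
```

The generated file goes in the project's directory inside the BOINC data directory. Rounding down and holding back a reserve are deliberately conservative, in line with the staged rollout above.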
Send message Joined: 15 May 09 Posts: 4559 Credit: 19,039,635 RAC: 18,944 |
In reply to Glenn Carver's message of 26 Jan 2025: "CPDN can't recommend users download and compile the latest nightly build though, not as a general 'all-users' statement. We'll have to wait."
Agreed. We can't even expect them to download the testing version of BOINC using the instructions to get it via a package manager. I would guess that it may be possible to at least test the bugfix on the development site ahead of it being rolled out to normal releases. |
©2025 cpdn.org