*** Running 32bit CPDN from 64bit Linux - Discussion ***
Author | Message |
---|---|
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I believe there are at least a few other WSL2 users here too. Maybe it is your hacking. :-) WSL2 works great for me with CPDN (Ubuntu 20.04), except that after a couple of weeks it just stops and I have to reboot. I think others have noted the same problem. But I run my machines 24/7. If you reboot anyway, you probably will not notice that problem. However, you then have the problem that CPDN will not always survive a reboot. You really should pause it before rebooting, and it should work. |
Send message Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
Hopefully soon there will be no need for this thread. That would be fantastic! Hopefully your effort will be successful. |
Send message Joined: 12 Apr 21 Posts: 314 Credit: 14,560,701 RAC: 18,057 |
Jim1348, I haven't noticed the problem you describe. I've also run WSL2 24/7; the longest stretch I can remember is about 4 weeks with no issues. Have you noticed it with different computers? I don't often run it for extended periods, as I do Windows updates and reboot between batches of CPDN tasks and, if there are no CPDN tasks, reboot whenever the updates come up. So it's possible I haven't run into it. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
"Have you noticed it with different computers?" I have only one Win10 machine, but it has been through several updates (also updates on the Linux side), and the problem has persisted for over a year. https://www.cpdn.org/forum_thread.php?id=9025&postid=63462#63462 The machine is stable otherwise. |
Send message Joined: 12 Apr 21 Posts: 314 Credit: 14,560,701 RAC: 18,057 |
Glenn, here's where I got the single OS per model type: https://www.climateprediction.net/getting-started/support/technical-faq/#why_only_available_on_one_operating_system. What you describe fits with that even more: a single OS (Linux) for all models. You're looking to make the VM approach easier to use, like Rosetta and LHC, which is good. Right now we have to manually set up some kind of virtualization (WSL2, Hyper-V, VirtualBox) to run the models that aren't for our default OSs. Thank you for clarifying OpenIFS vs. Hadley; I suspected that they're not the same and that Hadley will likely still be around. I don't think the failure rate needs to be that high even right now. The project could alert new users with a message at setup, with a link to the message board post that has the instructions on how to obtain the proper libraries. The project could also restrict computers that have failed a small number of tasks in a row to only 1 task a day until that computer starts showing that it can produce successful results. We have computers around that have failed hundreds and even thousands of tasks. For example: https://www.cpdn.org/results.php?hostid=1517479. Surely there are ways to restrict computers that can't produce successful results. |
Send message Joined: 15 May 09 Posts: 4530 Credit: 18,669,877 RAC: 14,865 |
"Surely there are ways to restrict computers that can't produce successful results." There was a time when Andy would manually set the maximum number of work units per day to -1 (0 being no restriction) on the serial killers that crashed absolutely everything. I don't know if there is an easy way to automate this, but I would like to see it done, particularly for the computers with missing libraries. I suspect that this might be a bigger problem now than it used to be, with Science United, where users have virtually no control over what projects their boxes run and, unless they also sign up to projects via the web sites, they won't have access to the forums to tell them how to sort things out. |
Send message Joined: 31 May 18 Posts: 53 Credit: 4,725,987 RAC: 9,174 |
Perhaps there could be a test file users could download that depends on the same libraries as the actual work files. Users could check their systems against it to verify everything is installed properly before wasting project time and bombing actual work units. Personally, I'd like to be able to grab, say, an application file and run ldd on it to verify I got everything installed correctly, rather than waiting until I start seeing errors and ending up aborting hundreds of units until I get it fixed, which is what has happened previously. |
Send message Joined: 31 May 18 Posts: 53 Credit: 4,725,987 RAC: 9,174 |
Just scored a work unit! W00T! At least now I can confirm I have all the required libs installed. |
Send message Joined: 12 Apr 21 Posts: 314 Credit: 14,560,701 RAC: 18,057 |
I've never used Arch Linux, but from looking around I believe these are the 2 packages you'll need: lib32-gcc-libs and lib32-glibc. They seem to have all of the required shared libraries to run all of the currently available Linux models. You might only need the first one, but I can't be sure without looking at it/testing. To install them I believe you'd run the command "sudo pacman -S lib32-gcc-libs lib32-glibc". Unfortunately there isn't any work available to test it out. The best you can do is install the 2 packages, be always connected, and wait. |
Send message Joined: 15 May 09 Posts: 4530 Credit: 18,669,877 RAC: 14,865 |
Might be worth looking at the BOINC message boards. WCG used to have 32bit work that required the libraries. Not sure if it still does. Searching the BOINC boards may throw up which other projects do. I believe there are a couple but can't remember which. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,194,632 RAC: 2,780 |
"With a bit of luck, my Arch installation might now be CPDN-ready. I'll wait for some work, and see what happens." You may have quite a wait. My last (but one) work unit was at the end of this July. It completed successfully. Since then, I have received only one work unit (yesterday), and it failed like this: <core_client_version>7.20.2</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234)</message> <stderr_txt> Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy Sorry, too many model crashes! :-( 07:28:41 (795039): called boinc_finish(22) </stderr_txt> ]]> I was the last of five users to fail this way. |
Send message Joined: 15 May 09 Posts: 4530 Credit: 18,669,877 RAC: 14,865 |
I tried to get some of the four OpenIFS tasks from the latest testing batch, but got a server (feeder not running) error. I informed Andy, but the tasks were all gone when I got up this morning and saw the message that the problem had been fixed. I am hoping that most researchers will be able to swap to OpenIFS from the Met Office models, as that would get rid of the issue, and with the size of recent models, any computer old enough to still be 32bit really isn't up to the job of running recent tasks. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,194,632 RAC: 2,780 |
"I am hoping that most researchers will be able to swap to the OpenIFS from the Met Office models as that would get rid of the issue and with the size of recent models, any computer old enough to still be 32bit really isn't up to the job of running recent tasks." My machine is certainly 64-bit. But I also have the necessary 32-bit compatibility libraries to run MetOffice work units. Computer 1511241 Total credit 6,152,503 Average credit 0.83 CPU type GenuineIntel Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7] Number of processors 16 Operating System Linux Red Hat Enterprise Linux Red Hat Enterprise Linux 8.6 (Ootpa) [4.18.0-372.26.1.el8_6.x86_64|libc 2.28] BOINC version 7.20.2 Memory 62.28 GB Cache 16896 KB Swap space 15.62 GB Total disk space 488.04 GB [dedicated partition for Boinc] Free Disk Space 479.28 GB [dedicated partition for Boinc] Measured floating point speed 6.13 billion ops/sec Measured integer speed 26.09 billion ops/sec Average upload rate 153.25 KB/sec Average download rate 8479.32 KB/sec Must I do anything, as a regular user (not in the test group), to run these, or will they just appear and start running when they become available? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
When they do show up, it will be "business as usual". :) |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,194,632 RAC: 2,780 |
OK, so if my 64 Gig of RAM is big enough, I should be able to run more than one at a time? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Apparently the server will decide. |
Send message Joined: 15 May 09 Posts: 4530 Credit: 18,669,877 RAC: 14,865 |
"OK, so if my 64 Gig of RAM is big enough, I should be able to run more than one at a time?" There should be no problem with that. With one of the testing batches, which were just single core and used up to about 5GB max, I was able to run 4 at once on my now dead laptop, which only had 8GB of RAM. However, because of the swap use, even with an SSD, it did slow them down a lot. |
Send message Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0 |
There is something wrong with the page at BOINC on installing BOINC on Linux. They have changed the contents of the page recently. The command "apt" has been changed to "aptitude". Linux Mint recognises this command but cannot find the packages boinc-manager and boinc-client. However, this command is not recognised by Linux Zorin. So I tried the old command "apt-get", which is still recognised, but it is still unable to locate the packages boinc-manager and boinc-client. Both Mint and Zorin are the latest versions. Some help is required. |
Send message Joined: 22 Feb 06 Posts: 490 Credit: 30,771,914 RAC: 10,979 |
Try running "apt update" before the command to install BOINC. |
Send message Joined: 12 Apr 21 Posts: 314 Credit: 14,560,701 RAC: 18,057 |
Unless I'm missing something, it doesn't look like BOINC is available from a repository for Mint or Zorin. That'd explain why those packages aren't found. You might have to try another approach like building it from source. |
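
Earlier in the thread it was suggested that one could run ldd on an application file to confirm the 32-bit libraries are in place before any work arrives. The following is only a minimal sketch of that idea, not a project-supplied tool: the function names `missing_libraries` and `check_binary` are made up for illustration, and the path to a CPDN application file is not given in the thread, so it is left as a placeholder. It assumes ldd marks unresolved dependencies with "=> not found".

```python
import subprocess


def missing_libraries(ldd_output: str) -> list[str]:
    """Return the libraries that an ldd listing reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Assumption: unresolved entries look like "libfoo.so.N => not found".
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> list[str]:
    """Run ldd on an executable and return its unresolved libraries.

    Hypothetical helper: 'path' would point at a downloaded application
    file, whose location is not specified in the thread.
    """
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)
```

Calling `check_binary("/path/to/application_file")` would return an empty list if ldd resolved everything it listed. If ldd cannot inspect the file at all, the list will also be empty, so it may be worth looking at the raw ldd output as well before trusting the result.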
©2024 cpdn.org