Info | Message |
---|---|
1) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71166 Posted 2 Aug 2024 by Conan |
I didn't have libnsl.so.1 on my computer, so I have now installed it in case I need it later. Conan |
2) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71149 Posted 1 Aug 2024 by Conan |
Mine didn't. After the 20th trickle, about when it was finishing, it failed on File Transfer with: Unable to load library wah2_se_8.27_i686-pc-linux-gnu.so dlopen error: libnsl.so.1: cannot open shared object file: No such file or directory. I must not have had any 32-bit libraries installed, so it could not find it. I have now installed the file it is complaining about, even though I probably won't need it as I wanted to just have 64-bit applications running. Conan |
3) Message boards : Number crunching : Batch 1017 Errors
Message 70977 Posted 13 Jun 2024 by Conan |
Sorry, the last 7 work units failed, but not due to faulty work units. I ran out of memory when another programme started up using 1 GB per work unit and launched 22 of them. Normally that is not a problem, but with 2 Climate Prediction WUs running using 3 to 5 GB each I had nothing left. It took a while to get control of the computer back, and then I aborted the other project and set it to No New Work, which should stop it from happening again. Conan |
4) Message boards : Number crunching : Batch 1017 Errors
Message 70976 Posted 13 Jun 2024 by Conan |
The resent tasks are now running correctly and I completed one successfully with a few more running. Thanks Conan |
5) Message boards : Number crunching : Batch 1017 Errors
Message 70951 Posted 8 Jun 2024 by Conan |
The next 2 failed the same way. My hosts are visible so you can see the error messages. I am running Linux Fedora 37 on a Ryzen 9 7900X and a 5900X. The 5900X has not returned a result yet. Conan |
6) Message boards : Number crunching : Batch 1017 Errors
Message 70949 Posted 8 Jun 2024 by Conan |
Great to get some work after a very long time. However, two completed work units show an error after the 2nd trickle has been uploaded. I think this is after the 14th zip file: </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>oifs_43r3_bl_a05v_2016092300_20_1017_12282038_0_r1427327128_15.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>oifs_43r3_bl_a05v_2016092300_20_1017_12282038_0_r1427327128_16.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>oifs_43r3_bl_a05v_2016092300_20_1017_12282038_0_r1427327128_17.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>oifs_43r3_bl_a05v_2016092300_20_1017_12282038_0_r1427327128_18.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>oifs_43r3_bl_a05v_2016092300_20_1017_12282038_0_r1427327128_19.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> Ran for almost 6 and a half hours before failing. Conan |
7) Message boards : Number crunching : New work discussion - 2
Message 69557 Posted 2 Sep 2023 by Conan |
Until we get more experience with volunteers running these high memory apps I think it makes sense to restrict it to a single task for now. We can change it later in light of experience. LHC's ATLAS tasks at 10 GB are the biggest I know of. But that's 8 threads, so you don't get people trying to run huge numbers of them. Are yours going to be single threads? YOYO@home ECM/P2 tasks take at least 11 GB per task, single thread, which is why I stopped running them on my 32 GB machine and limit them to just 3 at a time on my 64 GB machine. They are real memory hogs. Conan |
8) Message boards : Number crunching : New work discussion - 2
Message 69537 Posted 28 Aug 2023 by Conan |
Any new work for 64 bit coming along? I noticed a couple of new entries on the server status page: OpenIFS 43r3; OpenIFS 43r3 Baroclinic Lifecycle; OpenIFS 43r3 Perturbed Surface; OpenIFS 43r3 Cubic Octahedral grid tco95 l91; OpenIFS 43r3 Linear grid tl255 l91. Thanks Conan |
9) Message boards : Number crunching : New work discussion - 2
Message 68914 Posted 18 Jun 2023 by Conan |
This is not related to new work, but following on from the last couple of posts: CMDock uses a wrapper and it shows under Linux. I believe that YAFU also uses a wrapper, and possibly YOYO, SRBase, TNGrid? and a few others. In some cases it is needed due to the type of programme being used or the code it has been written in. A few other projects also use a "Trickle up" method to keep the Server updated with progress (PrimeGrid is one) and some of these projects need a wrapper for this purpose. Conan |
10) Message boards : Number crunching : Server Status page questions
Message 68604 Posted 19 Mar 2023 by Conan |
I have also wondered about the server page. UK Met Office Coupled Model Full Resolution Ocean has had 927 tasks "in progress" for many months but I have seen no indication that any have been returned and the number never changes. Weather At Home 2 (wah2) (region independent) has 4,731 tasks in progress again for many months and again I have not seen any activity with this either (maybe 1 came back 4 months ago but can't be sure). What is happening with these work units? Conan |
11) Message boards : Number crunching : Upload server is out of disk space
Message 67724 Posted 14 Jan 2023 by Conan |
Hi Kali, Actually Dave, Hobart is in Tasmania, Australia. Not NZ (New Zealand). Conan |
12) Message boards : Number crunching : The uploads are stuck
Message 67538 Posted 11 Jan 2023 by Conan |
Yes I am still seeing "connect(): failed" messages on all upload tries. It has changed to "transient HTTP error" now so still not working here yet (Australia). Server Status has not changed yet, still showing nothing. Conan PS: Some files are now moving, so possibly due to the load, some fail then must retry later, others are going through, some as low as 17 kB/s to as high as 1,700 kB/s. |
13) Message boards : Number crunching : The uploads are stuck
Message 67525 Posted 10 Jan 2023 by Conan |
Yes I am still seeing "connect(): failed" messages on all upload tries. But I still have 4 work units running and I am nowhere near filling up any disks, so no problem here. Conan |
14) Message boards : Number crunching : Tasks failing on Ubuntu 22
Message 67347 Posted 5 Jan 2023 by Conan |
If you changed the option to "leave tasks in memory" but did not read the file to update BOINC with the change it may not work until it is read. Restarting BOINC would also read the file. Conan |
15) Message boards : Number crunching : Hardware for new models.
Message 67296 Posted 4 Jan 2023 by Conan |
I saw some test results with the AMD RYZEN 5950X, RYZEN 7950X, INTEL 12900 and INTEL 13900 (I think they were the model names). When all were under full load for whatever test they were doing: RYZEN 9 5950X used 130 Watts; RYZEN 9 7950X used 270 Watts (or thereabouts); INTEL 12900 used 285-290 Watts (or thereabouts); INTEL 13900 used 315 Watts (or thereabouts). Can't point you to the tests but they were on Youtube along with others showing similar results. So the RYZEN 5950X may not be as powerful as the new models, but for energy efficiency it is hard to beat. That's of course if you can find them, they are getting harder to find. I run a RYZEN 9 5900X which has 12 cores and 24 threads, which should use even less power as it has fewer cores than the 5950X. It has 64 GB of RAM and, along with a full complement of other BOINC projects, easily runs 9 CPDN work units at a time. Only gets to about 42 GB max depending on what I am running at the time (everything, not just CPDN); it may get higher than 42 GB but I have the headroom to cover that. BOINC has not downloaded more than 9 work units at any one time, probably because I am running a lot of other projects at the same time. Conan |
16) Message boards : Number crunching : OpenIFS Discussion
Message 66999 Posted 22 Dec 2022 by Conan |
All 9 work units that I had running overnight have completed successfully. Running on an AMD Ryzen 9 5900x, 64GB RAM, all 24 threads used to run BOINC programmes at the same time as the ClimatePrediction models. All took around 17 hours 10 minutes run time. Conan |
17) Message boards : Number crunching : Late Validation pending
Message 66991 Posted 21 Dec 2022 by Conan |
Well it seems that these files have finally been validated and I have been awarded credit for them, I think. I have noticed a clean up/out has taken place and a lot of the old work units that I have done over the years have been removed, those 2 pending jobs among them. I was awarded some small amount of credit this week when I have not done any work, and now it seems that the database has had a bit of a clean out and fix up. Good to see. Conan |
18) Message boards : Number crunching : OpenIFS Discussion
Message 66990 Posted 21 Dec 2022 by Conan |
G'Day Glenn, You may have misread what I wrote, I think. The 11.3 GB was not a file size but the amount of disk writes made in that first 2 hours (now after 5 hours well over 30 GB). The 2.7 to 4.6 GB were RAM amounts that each work unit was using. This was all taken from System Monitor. I did what you asked: % cd slots/26, then % du -hs . (note the '.') reported 1.2G. This is the same as your example. % cd projects/climateprediction.net, then % du -hs . reported 1.2G. This is similar to your example. du -hs srf* reported 768 MB for srf00370000.0001. So all running fine, so maybe just a bit of a misunderstanding I think with data amounts and RAM usage. Thanks Conan |
19) Message boards : Number crunching : OpenIFS Discussion
Message 66983 Posted 21 Dec 2022 by Conan |
These Oifs _ps tasks really test your system out. Running 9 at once, each using from 2.7 to 4.2 GB of RAM, after 2 hours run time they have each written 11.3 GB of data to disk (101.7 GB in total), which is huge. Hitting 50 GB of RAM in use out of 64 GB, but I am also running LODA tasks which each use 1 GB of RAM. All 24 threads are running. 12% in and running fine so far. Conan |
20) Message boards : Number crunching : OpenIFS Discussion
Message 66795 Posted 6 Dec 2022 by Conan |
My resent task 22249228 has been sent out twice before. I completed Task 22249324 successfully in just under 17 1/2 hours. |
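
The RAM and disk figures quoted in messages 66983 and 70977 above can be checked with some simple arithmetic. The following is only a rough sketch: the helper names and the 4 GB reserve are hypothetical, and the 32 GB host size used for the message 70977 case is an assumption, since that post does not say which machine was involved.

```python
# Rough RAM budget check for mixed BOINC workloads.
# Helper names are hypothetical; task counts and sizes come from the posts above.

def worst_case_ram_gb(task_groups):
    """Return the summed peak RAM in GB for (task_count, peak_gb_per_task) pairs."""
    return sum(count * peak_gb for count, peak_gb in task_groups)


def fits_in_ram(task_groups, total_ram_gb, reserve_gb=4.0):
    """True if the worst-case total plus a reserve for the OS fits in total_ram_gb.

    reserve_gb is an assumed safety margin, not a BOINC setting.
    """
    return worst_case_ram_gb(task_groups) + reserve_gb <= total_ram_gb


# Message 66983: 9 OpenIFS tasks peaking at about 4.2 GB each on a 64 GB host.
oifs_only = [(9, 4.2)]
print(round(worst_case_ram_gb(oifs_only), 1), fits_in_ram(oifs_only, 64))

# Message 70977: 22 tasks at 1 GB plus 2 climate tasks at up to 5 GB each.
# The host's RAM size is not stated in that post; 32 GB is an assumption.
mixed = [(22, 1.0), (2, 5.0)]
print(worst_case_ram_gb(mixed), fits_in_ram(mixed, 32))

# Disk writes in message 66983: 9 tasks at 11.3 GB each after 2 hours.
print(round(9 * 11.3, 1))
```

Budgeting on the peak figure rather than the average is deliberate here, because message 70977 describes what happens when several large tasks reach their peak together.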
©2024 cpdn.org