Message boards : Number crunching : New work Discussion
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0

> This new model type is still at the Alpha testing stage.

We have our Africa Rainfall Project, and MCM (Mapping Cancer Markers). I think they do not clean up the RAM after finishing. Please check; I am suspicious about them.
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

> We have our Africa Rainfall Project, and MCM (Mapping Cancer Markers). I think they do not clean up the RAM after finishing. Please check; I am suspicious about them.

I run WCG ARP1 and MCM1 and they do clean up after finishing. While I allow Beta Testing work units, I have not received any in a long time.

Beta Testing  Intermittent  0:060:21:02:33  147  83,416
Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653

> I run WCG ARP1 and MCM1 and they do clean up after finishing.

That is my experience too. I have been running them from the beginning with no problems. Currently, I have one machine on each, but sometimes mix them with other projects.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0

> We have our Africa Rainfall Project, and MCM (Mapping Cancer Markers). I think they do not clean up the RAM after finishing. Please check; I am suspicious about them.

You might be on Windows, while mine is on Linux. What happens is this: if I run ARP after a reboot, the WU runs happily. If I run ARP after I have done a few MCM WUs because of the queue, after a while it says "waiting for memory". Where has the memory gone? Anyway, I was running it on Linux because of CPDN. I will revert to Windows.
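A side note for anyone hitting the same "waiting for memory" message: on recent BOINC clients that message usually means the client's own RAM limit has been reached, not that the machine is out of memory. A minimal sketch of where to look, assuming a packaged Linux client with its data directory in /var/lib/boinc (some distros use /var/lib/boinc-client instead), and assuming the "use at most X% of memory" preferences are stored as the ram_max_used_*_pct elements as on recent 7.x clients:

$ grep -i ram_max /var/lib/boinc/global_prefs.xml           # preferences supplied by the project/account manager
$ grep -i ram_max /var/lib/boinc/global_prefs_override.xml  # local overrides, if the file exists
$ boinccmd --read_global_prefs_override                     # make the running client re-read the override file

If the percentage looks low for the machine, raising it in the Manager's computing preferences (or in the override file) is the usual fix.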
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

> You might be on Windows, while mine is on Linux. What happens is this: if I run ARP after a reboot, the WU runs happily. If I run ARP after I have done a few MCM WUs because of the queue, after a while it says "waiting for memory". Where has the memory gone? Anyway, I was running it on Linux because of CPDN. I will revert to Windows.

Actually, I do most of my BOINC work on my Linux machine.

Computer 1511241
Created: 14 Nov 2020, 15:37:02 UTC
Total credit: 5,443,321
Average credit: 8,696.18
CPU type: GenuineIntel Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors: 16
Operating System: Red Hat Enterprise Linux 8.5 (Ootpa) [4.18.0-348.12.2.el8_5.x86_64|libc 2.28 (GNU libc)]
BOINC version: 7.16.11
Memory: 62.4 GB
Cache: 16896 KB

Something must be strange in how you run your system. I have never run out of memory running Linux, and I have been running it since 1998 in various versions and on a variety of machines. All versions have been Red Hat. If you wish to run Windows, go ahead, but that does not solve the problem.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0

> You might be on Windows, while mine is on Linux. What happens is this: if I run ARP after a reboot, the WU runs happily. If I run ARP after I have done a few MCM WUs because of the queue, after a while it says "waiting for memory". Where has the memory gone? Anyway, I was running it on Linux because of CPDN. I will revert to Windows.

It is a VM with 10 GB of RAM, running Mint. My Windows machine is also running BOINC. I will just shut down the VM until there is more CPDN Linux work; then the Windows BOINC can have the full 16 GB.
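Before shutting the VM down, it may be worth a quick look at which science apps are actually holding the memory inside it. A generic sketch, assuming the client runs under the usual boinc user as on most packaged Linux installs:

$ free -h                                          # overall RAM and swap inside the VM
$ ps -u boinc -o pid,rss,etime,comm --sort=-rss    # BOINC processes by resident memory (RSS, in KiB)

If one ARP task plus a few resident MCM tasks already approach the VM's 10 GB, the "waiting for memory" message is simply the client refusing to overcommit.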
Joined: 15 May 09 Posts: 4540 Credit: 19,025,554 RAC: 20,468

Rather than reboot, if you get this memory problem in Linux run:

sync; echo 1 | sudo tee /proc/sys/vm/drop_caches

This will free up memory not released when tasks end. (I use it mostly when Firefox is being tardy about releasing memory after I have had a lot of tabs open.)
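For reference, the kernel accepts three values in that file; this is standard documented drop_caches behaviour rather than anything CPDN-specific, and the sync beforehand just flushes dirty pages so that more of the cache is actually droppable:

$ sync                                          # write out dirty pages first
$ echo 1 | sudo tee /proc/sys/vm/drop_caches    # drop the page cache only
$ echo 2 | sudo tee /proc/sys/vm/drop_caches    # drop reclaimable slab objects (dentries and inodes)
$ echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop both

None of these touch dirty pages or memory a process still has mapped, so they are non-destructive; the cost is only that the next reader has to go back to disk.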
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

> Rather than reboot, if you get this memory problem in Linux run: sync; echo 1 | sudo tee /proc/sys/vm/drop_caches

I suppose that would work under some circumstances. Since I have never run out of RAM, even in the days when I had only 64 megabytes of it, or even less, I am certainly not an authority on dealing with this problem. I have run Netscape, Mozilla, and Firefox mainly as browsers. I can run Chromium, but I do not like it. So if I were to go too far and run too many memory-hog processes at once, performance would go to hell because of memory thrashing to disk, but nothing breaks.

But consider what happens as you need more memory to start a process: the first thing the kernel will do is reclaim the disk read cache. If that is used up, it will flush and reclaim the write buffers. If yet more RAM is needed, it will start paging out some of the RAM still in use (on an LRU basis) to swap. Only if the swap space on disk is exhausted will the kernel invoke the OOM killer so that vital processes can run. As I said earlier, this has never happened to me in the 24 years or so that I have been running Linux.
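For anyone who wants to check whether that chain ever actually reached the OOM killer on their own box, a generic sketch (assuming a systemd-based distro so journalctl is available; dmesg works everywhere):

$ free -hw                                        # current RAM, buffers, cache and swap
$ grep -E 'MemAvailable|SwapFree' /proc/meminfo   # what the kernel thinks is still usable
$ sudo dmesg -T | grep -iE 'out of memory|oom'    # any OOM kills since boot
$ journalctl -k | grep -i oom                     # the same, from the journal

If those greps come back empty, nothing has been killed for memory reasons, and any "waiting for memory" messages are coming from BOINC's own limits rather than from the kernel.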
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

> Rather than reboot, if you get this memory problem in Linux run: sync; echo 1 | sudo tee /proc/sys/vm/drop_caches

It sure does work. Could the O.P. simply not have enough RAM? My machine before and after running the command (no CPDN work at the moment):

# free -hw
              total   used   free   shared  buffers  cache  available
Mem:           62Gi  5.3Gi  1.8Gi    130Mi    327Mi   54Gi       56Gi
Swap:          15Gi   70Mi   15Gi

# sync; echo 1 | sudo tee /proc/sys/vm/drop_caches
1

# free -hw
              total   used   free   shared  buffers  cache  available
Mem:           62Gi  5.3Gi   55Gi    118Mi    0.0Ki  1.3Gi       56Gi
Swap:          15Gi   70Mi   15Gi

# ps -fu boinc
UID     PID     C   TIME      CMD
boinc   2072    0   00:11:52  /usr/bin/boinc [boinc-client]
boinc   519234  93  06:35:21  ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -beta -
boinc   520042  91  06:08:27  ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -beta -
boinc   522179  98  05:51:09  ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -beta -
boinc   528257  98  03:48:01  ../../projects/www.worldcommunitygrid.org/wcgrid_arp1_wrf_7.32_x86_64-pc-linux-gnu
boinc   536973  98  01:26:41  ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_map_7.61_x86_64-pc-linux-gnu
boinc   539412  98  00:50:44  ../../projects/www.worldcommunitygrid.org/wcgrid_opn1_autodock_7.21_x86_64-pc-linu
boinc   540515  99  00:32:54  ../../projects/universeathome.pl_universe/BHspin2_19_x86_64-pc-linux-gnu
boinc   540526  99  00:32:36  ../../projects/www.worldcommunitygrid.org/wcgrid_opn1_autodock_7.21_x86_64-pc-linu
Joined: 16 Jan 10 Posts: 1084 Credit: 7,815,352 RAC: 5,242
Three batch #926 models have now completed successfully on my very slow Mac, so anecdotally it looks like the switch by the project of that batch to Mac-only was a good decision. The batch #927 Mac-only models are running fine so far (presumably they are the same set as batch #926), though a twice-failed model has just downloaded, so there is clearly the usual background rate of failures. No pattern that I can see yet. |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

I just upgraded yesterday to:

boinc-manager-7.16.11-9.el8.x86_64
boinc-client-7.16.11-9.el8.x86_64

For several years before that they were:

boinc-manager-7.16.11-8.el8.x86_64
boinc-client-7.16.11-8.el8.x86_64

This is the latest version for my Red Hat Enterprise Linux 8.5 (Ootpa) release.
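For anyone else on an EL8-family system wanting to check or repeat the same upgrade, a generic sketch using the stock packaging tools (package and service names as on the EPEL packages above; your repository layout may differ):

$ rpm -q boinc-client boinc-manager             # show the currently installed versions
$ sudo dnf check-update 'boinc*'                # see whether newer packages are available
$ sudo dnf upgrade boinc-client boinc-manager   # install them
$ sudo systemctl restart boinc-client           # pick up the new client binary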
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0

Don, there hasn't been any work for Windows machines for a long time now. It's all Linux, and that has run out too, leaving only a few Mac tasks that were originally part of a Linux/Mac batch.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0

> Rather than reboot, if you get this memory problem in Linux run: sync; echo 1 | sudo tee /proc/sys/vm/drop_caches

Thank you, Dave. As it is, I am out of Linux WUs. Let us see if we get a new lot.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0

> Rather than reboot, if you get this memory problem in Linux run: sync; echo 1 | sudo tee /proc/sys/vm/drop_caches

The next machine of mine will be dedicated to Linux.
Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463

Don't drop caches: all you're doing there is telling Linux to free the memory holding pages it has already read off disk, so it has to touch the disk again. Linux "using all the memory" is a feature. It will use any surplus memory to keep disk pages in RAM so that it can speed up disk access, and it will drop those pages when more memory is needed, before evicting anything else out to swap. About the only time dropping caches is useful is when you are doing benchmarking runs that involve a lot of disk I/O, want to test with the full normal system caching behaviour, but want to clear the cache between runs.
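A minimal sketch of that benchmarking case, the one situation where dropping caches is genuinely useful; the benchmark name here is a hypothetical placeholder for whatever I/O-heavy workload is being timed:

for run in 1 2 3; do
    sync                                        # flush dirty pages so the cache can actually be dropped
    echo 3 | sudo tee /proc/sys/vm/drop_caches  # start every run with a cold page cache
    time ./my_io_benchmark                      # hypothetical disk-heavy workload under test
done

For ordinary crunching, the advice above stands: leave the cache alone and let the kernel manage it.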
Joined: 15 May 09 Posts: 4540 Credit: 19,025,554 RAC: 20,468

> Don't drop caches: all you're doing there is telling Linux to free the memory holding pages it has already read off disk, so it has to touch the disk again.

My experience is that flushing memory with this command makes suspend, whether to disk or to RAM, much more reliable.
Joined: 15 May 09 Posts: 4540 Credit: 19,025,554 RAC: 20,468

Following the sorting out of some problems with some of the input files over on the testing site, there should be some Windows tasks for the NZ region in the next week or so. There are also some N144 Linux tasks in the pipeline which had some issues stopping them getting onto the main site, but I am not sure if those are resolved yet.
Joined: 15 May 09 Posts: 4540 Credit: 19,025,554 RAC: 20,468

> There are also some N144 Linux tasks in the pipeline which had some issues stopping them getting onto the main site, but I am not sure if those are resolved yet.

Unless something unexpected is broken, this submission has started, but some files being generated by the submission script are taking a while, so seven hours after it started nothing has appeared yet. My guess is that they will appear by tomorrow morning at the latest if nothing unexpected pops up. (Speaking ex cathedra from my belly button.)
Joined: 15 May 09 Posts: 4540 Credit: 19,025,554 RAC: 20,468

A second batch is also being prepared, but there is no sign of the first appearing on the server yet. Waiting to see if this is a problem or just the file perturbations taking a long time.

Tasks have started appearing on the server. Linux only for this lot, unless you use WSL or a VM of some description. In three minutes' time, when my box updates after the backoff from its last attempt to get work, I will be able to confirm that they are downloading OK.

Edit: four tasks downloaded and running.
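For anyone who would rather not wait out the client's backoff, the command-line client can be nudged directly. A generic sketch; the URL shown is illustrative, and whatever you pass to --project must match the project URL that boinccmd --get_project_status reports for CPDN on your install:

$ boinccmd --get_project_status                    # confirm the exact project URL the client is using
$ boinccmd --project https://www.cpdn.org/ update  # contact the scheduler now instead of waiting
$ boinccmd --get_tasks                             # list the tasks that were downloaded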
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154

> Tasks have started appearing on the server. Linux only for this lot, unless you use WSL or a VM of some description. In three minutes' time, when my box updates after the backoff from its last attempt to get work, I will be able to confirm that they are downloading OK.

I got a task recently and it has over an hour of execution time on it; i.e., way more than the 30 seconds that the fast-crashers of recent memory used to manage.

Task 22198935
Name: hadam4_a11l_200010_13_928_012134123_0
Workunit: 12134123