Message boards : Number crunching : New work Discussion
Author | Message |
---|---|
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Thanks, will try suspending everything else to see if it speeds up. A few hours should show whether there is going to be any significant speed-up. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It is thought that processing the vegetation data as well as the usual climate data may be why the models fail just as they try to start the regional model. This adds a LOT to the hardware requirements, mostly in the memory area, which covers caches, the FPU, and the data channels between everything. So trying to cram as many model tasks onto a computer as possible may well be what is exacerbating the failures for some people. As Clint Eastwood's character, Dirty Harry, once said: "A man's got to know his limitations". |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
So trying to cram as many model tasks onto a computer as possible may well be what is exacerbating the failures for some people. I have certainly noticed that some tasks on my laptop (N3540 @ 2.16GHz) slow down if all four cores are crunching. When I notice this, I cut my computing down to two or three cores till the affected tasks have cleared. I would certainly say that the minimum memory should be 2GB/core these days. If things go the way of all tasks being so demanding, I will probably end up setting it to only use 75% of available CPUs. I can however understand that needing to do this might frustrate those for whom credit is more important than it is for myself. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
I would certainly say that the minimum memory should be 2GB/core these days. Well, I have four cores and 8 GBytes of RAM. Another 8 GBytes of RAM are on order and should arrive soon. That is four 2 GByte modules installed and four 2 GByte modules on order. My machine could hold 512 GBytes of RAM if someone else would buy me the modules, but that would be silly for the way I use my machine these days. I currently have climateprediction set to "Won't get new tasks" because, although I run Linux most of the time, I am rebooting into Windows to run my income tax program. When that is done, I will be back to running Linux 24/7, and will start accepting climateprediction tasks again. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
This one has now failed with seg violation at about 9% after 2 trickles and zips. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
This one has now failed with seg violation at about 9% after 2 trickles and zips. Shucks, I had thought my two 797s were safe having both uploaded their first zip. I will carry on crunching with at least one core free to see what happens. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
Three new batches for South America: batch #802 = 500 x SAM50/13; batch #803 = 800 x SAM50/13; batch #804 = 2200 x SAM50/24. (See batch list.) |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I am not sure if it is a CPU difference or not, but all seven of the 797's have failed on my two Ryzen 2600's, but three are still going fine (after 4, 5 and 6 zips) on my i7-4771. https://www.cpdn.org/cpdnboinc/result.php?resultid=21555978 https://www.cpdn.org/cpdnboinc/result.php?resultid=21541267 https://www.cpdn.org/cpdnboinc/result.php?resultid=21555753 For that matter, it could be an OS difference, since the Ryzen 2600's are on Win10 (1809), while the i7-4771 is on Win7. None of them are rebooted much, especially the Ryzens, which are dedicated machines, and they all run 24/7. No other CPU jobs are running either, so the CPDN work is never suspended. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
Three new batches for South America: And another one! batch #805 = 2100 x SAM50/13 |
Send message Joined: 17 Jul 05 Posts: 7 Credit: 6,509,173 RAC: 854 |
I am happy to see that the new sam50 models do not give the same errors after 3-4 minutes that the sam25s usually do on my Win10 computer. Does anybody know what has been changed within these models? |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
I'm still waiting to see how grotesque the upload files are before all downloaded tasks are allowed to start ... "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Only that the resolution of the high-res regional area is half that of the previous 25K models, and it's thought that the amount of memory suddenly needed may have something to do with the previous failures. It hasn't been discussed yet, and it's too early to guess. There have been 6 failures so far: 5 in batch 802 and 1 in batch 803. |
Send message Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
I thought I would give it another go, and have had a safr50 (batch 791) and a sam50 (batch 804) running for a whole 24 hrs without either having a fit (segment violation). They are at about 13%. I don't have that warm feeling of confidence. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There are a few failures, but well below "worrying". The project coordinator has said that the sam50s have all been run before, so should be OK. The project people have been rather busy lately, so they haven't done much research on what was wrong with the sam25s. And they won't be run again until this is known. Testing WILL be done soon to try and find out. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
And three of the 797 batch have completed successfully now. My two are past their fourth and second zips respectively, so both are well past where they got to on their first attempt. Hoping that keeping at least one core free will let them finish without segfaulting. Of the three that have finished, two are under Win7 and one under Win Server 2012. However, of the first four listed as having completed for #798, three are Win10 and one is Win7, so my initial thoughts about it being a problem with Win10 have, I think, gone out of the proverbial window. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I am still holding on to that theory. All seven of my 797's have failed in under four hours on Win10 (on two Ryzen 2600's), but all three of the 797's that I have run on my Win7 machine (i7-4771) are still going after at least seven days. I think it is the OS rather than the CPU difference, from what I have seen on other machines. |
Send message Joined: 27 Feb 08 Posts: 41 Credit: 1,402,356 RAC: 0 |
My Win10 machines have generally been fine. Regards, Bob P. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I don't see that you have even run any 797's on them. (My Win10 machines have been running fine for the most part otherwise too. But 797, 798 and 799 are problematic; maybe others too.) |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
I am still holding on to that theory. I will have another look when there is a bit more data to go on. I will also try and see if I can work out a way to identify machines like mine running Linux which pretend to be Windows 10 ;) Edit: Six completed now; the new ones since yesterday are two XP and one Win7, so still no 10s. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
Edit: Six completed now; the new ones since yesterday are two XP and one Win7, so still no 10s. My Win7 64-bit machine is still going strong on three 797's and two 798's after 4 to 9 days, with no failures on either. I don't think that is a coincidence. https://www.cpdn.org/cpdnboinc/results.php?hostid=1466534 |
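
The two rules of thumb mentioned in this thread (roughly 2 GB of RAM per running task, and using only 75% of the available CPUs) can be combined into a quick calculation of how many tasks to run at once. The following is only a rough sketch in Python: the function name and parameters are made up for illustration, the defaults are simply the figures quoted above, and it assumes memory and core count are the only limits that matter, which may well not hold for these models.

```python
import math


def max_concurrent_tasks(cores, ram_gb, gb_per_task=2.0, cpu_fraction=0.75):
    """Suggest how many model tasks to run at once.

    Hypothetical helper: the defaults (2 GB per task, 75% of CPUs) are the
    rules of thumb quoted in the thread, not project requirements.
    """
    if cores < 1 or ram_gb <= 0 or gb_per_task <= 0:
        raise ValueError("cores, ram_gb and gb_per_task must be positive")
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in the range (0, 1]")

    by_cpu = math.floor(cores * cpu_fraction)   # cores left after the CPU cap
    by_ram = math.floor(ram_gb / gb_per_task)   # tasks that fit in memory

    # Take the tighter of the two limits, but always allow one task.
    return max(1, min(by_cpu, by_ram))


if __name__ == "__main__":
    # Four cores with 8 GB of RAM, as in one of the posts above.
    print(max_concurrent_tasks(cores=4, ram_gb=8))
    # Four cores with only 4 GB: memory becomes the limit.
    print(max_concurrent_tasks(cores=4, ram_gb=4))
```

The result is only a starting point. If tasks still fail or slow down, dropping one more core, as several posters here have done, is the obvious next step.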