Message boards : Number crunching : New work discussion - 2
Author | Message |
---|---|
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Makes sense. Should we let working tasks run to completion or abort? I have seven that have all made it to at least the fourth or fifth model month." You seem to be running them faster than me and I'd like to know how. What % have you got to, in what time, on what CPU? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Between 18 and 25% on an AMD Ryzen 7 3700X 8-Core Processor [Family 23 Model 113 Stepping 0], though most have been paused to allow five tasks from the testing branch to run. They are running fine on my box but failed on another machine first. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Between 18 and 25% on an AMD Ryzen 7 3700X 8-Core Processor [Family 23 Model 113 Stepping 0], though most have been paused to allow five tasks from the testing branch to run. They are running fine on my box but failed on another machine first." Your CPU should be the same speed per core as my Ryzen 9 3900XT according to Cpubenchmark. How long have they run for on the Boinc timer? I'm getting only 2% per day, but that's running 24 tasks on 24 threads. Boinc claims half of them are getting a full thread each and half are getting half a thread each (other projects don't do this; I assume CPDN overloads the cache or something). I don't turn off HT, because I find that overall (with most projects) I get 50% more throughput with it on, although each task is done slightly slower. I also don't have dual channel memory, since I've found it very hard to get any sticks which the motherboard likes, so they're a mismatch and only running at 2100 rather than 3200 speed. I've tried running everything from 1 to 24 CPDN tasks at a time, and the temperature of the CPU hardly changes, which is weird. No matter what I do, CPDN runs it a lot cooler than other projects, which suggests it isn't thinking as hard as MSI Afterburner and the task manager suggest (they both say 100% usage). I'm going to guess that if I had decent dual channel memory and turned HT off I'd get similar times to you. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
One at 23.9% has been running for 2 days 23 hours. I have my box running just 7 tasks at a time. I have found there is no increase in throughput from using hyperthreading with CPDN tasks; indeed, there is a slight decrease in throughput once I go over 8 real cores in use. The one task I have running in a VM, which downloaded without my noticing, is running about 4% slower than the rest. That one is still going, along with one of the six using WINE but not in the VM. |
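As a rough sanity check on the figures quoted in this exchange (23.9% after 2 days 23 hours on one machine, about 2% per day on the other), the progress rates can be compared directly. This is only a sketch: it assumes progress is linear over the whole run, which may not hold for CPDN models, and the function names are made up for illustration.

```python
def percent_per_day(percent_done: float, elapsed_hours: float) -> float:
    """Average progress rate in percent per day, assuming linear progress."""
    return percent_done / elapsed_hours * 24.0


def estimated_total_days(percent_done: float, elapsed_hours: float) -> float:
    """Projected total run time in days at the average rate so far."""
    return 100.0 / percent_per_day(percent_done, elapsed_hours)


# Figures from the posts: 23.9% after 2 days 23 hours (71 hours),
# compared with roughly 2% per day on the slower machine.
elapsed_hours = 2 * 24 + 23
fast_rate = percent_per_day(23.9, elapsed_hours)
slow_rate = 2.0

print(f"fast machine: {fast_rate:.1f}% per day")
print(f"projected total: {estimated_total_days(23.9, elapsed_hours):.1f} days")
print(f"speed ratio: {fast_rate / slow_rate:.1f}x")
```

On these numbers the faster machine manages roughly 8% per day, about four times the 2% per day reported for the 24-thread box.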
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"One at 23.9% has been running for 2 days 23 hours. I have my box running just 7 tasks at a time. I have found there is no increase in throughput from using hyperthreading with CPDN tasks; indeed, there is a slight decrease in throughput once I go over 8 real cores in use. The one task I have running in a VM, which downloaded without my noticing, is running about 4% slower than the rest. That one is still going, along with one of the six using WINE but not in the VM." Thanks for that. Since tasks are rare, testing speeds is difficult, especially when Boinc reports 100% usage when it isn't. I'll assume I could be losing up to x1.5 speed from the slower RAM setting (to avoid the MB/RAM incompatibility causing crashes), up to x2 speed from not having dual channel RAM, and each task running x2 slower due to HT (assuming overall throughput is about the same, but running twice the tasks), which could make mine up to 6x slower than yours (1.5 x 2 x 2). From your measurements I'm 4x slower. I'll set my app config to say CPDN tasks require 2 threads, which will effectively turn HT off while CPDN is running. Future tasks will run 12 at a time. The current 24 will run one half then the other, which I suppose isn't a bad thing, since half will be completed earlier. <app_config> <app_version> <app_name>wah2</app_name> <plan_class></plan_class> <avg_ncpus>2</avg_ncpus> </app_version> </app_config> Do you know if the same applies to other processors? Is it always best to run half the total threads? What about CPUs with no HT? |
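The app_config quoted in the post is flattened onto one line by the board. A minimal sketch of generating the same structure with Python's standard library is shown here, so the nesting is easier to see. The helper name `build_app_config` is hypothetical, and where the file should be saved (normally the project's folder inside the BOINC data directory, as far as I know) is an assumption to check against your own installation.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, avg_ncpus: int) -> str:
    """Return app_config.xml text that budgets avg_ncpus threads per task."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    # Empty plan class, as in the post (serialised as a self-closing tag).
    ET.SubElement(version, "plan_class").text = ""
    ET.SubElement(version, "avg_ncpus").text = str(avg_ncpus)
    ET.indent(root, space="  ")  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Same values as the post: wah2 tasks counted as 2 threads each.
config_text = build_app_config("wah2", 2)
print(config_text)
```

The printed text has the same elements and values as the snippet in the post; only the layout differs.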
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
What I don't know yet is how useful the results will be. Glenn is probably about now comparing some files from a testing task I am running with some from the same task that crashed on his machine. Apparently, WINE emulation comes with memory guards to prevent references to memory outside the space of the program, to quote from what Glenn told me. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"What I don't know yet is how useful the results will be. Glenn is probably about now comparing some files from a testing task I am running with some from the same task that crashed on his machine. Apparently, WINE ..." Mine are all real Windows machines, so Windows must do the same. Can you recommend how many threads I should use on my other machines? |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,476,460 RAC: 15,681 |
"What I don't know yet is how useful the results will be. Glenn is probably about now comparing some files from a testing task I am running with some from the same task that crashed on his machine. Apparently, WINE ..." Working with Sarah at CPDN today and running tests, we've found there's a data problem when the regional model starts up after the global model has run the first day (the global model has to compute the boundary values for the region). So we've isolated the cause of the crash but don't yet have the solution. To answer Dave's earlier question, I think results from this batch are suspect. Cheers, Glenn |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Running CPDN tasks, I would always go for N/2 - 1, assuming sufficient memory, on a machine I was using for other purposes. If only for crunching, I would use half the threads. "I think results from this batch are suspect." I won't abort them yet. All bar two are suspended while I run the testing branch ones, which is reducing the build-up of zips to transfer once the server is working again. Edit: Uploads are now working. Just seen message from Andy and can confirm mine are going. |
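The N/2 - 1 rule of thumb from the post can be written out as a small helper. This is a sketch under stated assumptions: N is read as the number of logical threads, the function name is made up, and the "sufficient memory" condition from the post is not modelled at all.

```python
def recommended_cpdn_tasks(logical_threads: int, dedicated: bool = False) -> int:
    """Concurrent CPDN tasks under the N/2 - 1 rule of thumb.

    N is taken to be the number of logical threads (an assumption).
    A crunch-only machine uses half the threads; a shared one uses one fewer.
    """
    half = logical_threads // 2
    tasks = half if dedicated else half - 1
    return max(tasks, 1)  # always allow at least one task


for threads in (16, 24):
    shared = recommended_cpdn_tasks(threads)
    crunch_only = recommended_cpdn_tasks(threads, dedicated=True)
    print(f"{threads} threads: {shared} shared, {crunch_only} dedicated")
```

For a 16-thread CPU this gives 7 tasks on a shared machine, which matches the 7 tasks at a time mentioned earlier in the thread, and 12 tasks for a crunch-only 24-thread machine.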
Send message Joined: 2 Oct 06 Posts: 54 Credit: 27,309,613 RAC: 28,128 |
"Edit: Uploads are now working. Just seen message from Andy and can confirm mine are going." Uploads are still not working for me. "Transient HTTP error". |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Edit: Uploads are now working. Just seen message from Andy and can confirm mine are going." "Uploads are still not working for me. 'Transient HTTP error'." Just retried mine and they're going, consuming my full line speed of 6 or 7 Mbit across 7 uploads. I can't guarantee they'll complete, but they all started without a fuss. Me and Dave are both in the UK and you're in America; not sure if that makes a difference. I know they download from the UK, but I don't know where these upload to. It was New Zealand last time. EDIT: Now completed. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Running CPDN tasks, I would always go for N/2 - 1, assuming sufficient memory, on a machine I was using for other purposes. If only for crunching, I would use half the threads." OK, I'll do the same. I've set all machines to count CPDN as 2 threads per task. Can I assume the Linux tasks should be treated the same way? |
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
Consumes 10 mbit of 100 with 8 uploads |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"Consumes 10 mbit of 100 with 8 uploads" Grrrr, I'm still in the 3rd world over here, due to my telecoms provider having my street connected to the next town! |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
Over the last couple of days I've had half a dozen tasks (the first for some time). Sadly, all have ended very quickly, with errors within a couple of minutes of starting. The majority of the errors are "too many results". Could this be a bad batch of tasks? Edit to add: Tasks were Weather At Home 2 (wah2) v8.24. Example link: https://www.cpdn.org/result.php?resultid=22330473 Machine: Windows 10, Ryzen with 32Mb RAM |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Yes, if you'd read up the page a bit.... |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"Could this be a bad batch of tasks?" If you look further back in this thread you will see that there is indeed a problem with this batch. They have stopped resends and are investigating the precise nature of the problem and, hopefully, a fix. Edit: The file not found error is because the model crashed and the zip files have not been created to upload. The segmentation error happens first. As discussed elsewhere in this thread, I would not run more than 7 or 8 tasks at once, as going into virtual cores with hyperthreading actually reduces throughput of tasks, or it does with my machine, which has the same CPU and the same amount of RAM. |
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
No wonder they are crashing. You have only 32 mb of ram. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"No wonder they are crashing. You have only 32 mb of ram." If you actually look at his computer's page it is 32GB. I doubt if you could find the RAM to fit on a Ryzen motherboard to give it 32MB these days! |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"No wonder they are crashing. You have only 32 mb of ram." I assume you meant 32 GB ... Do they really make machines with only 32 MB of RAM? And even if someone still has such a machine, will it run any nearly current version of Windows? Even Windows 7? I have two machines. One is Linux-only and has 128 GBytes of RAM (ID: 1511241); the other is Windows 10, a pipsqueak with only 16 GBytes of RAM (Computer 1512658). All of the current batch have failed on it in 3 minutes or less, as they have on all the other machines which have worked on the same work units. These programs are running Weather At Home 2 (wah2) v8.24 windows_intelx86. But that machine ran that program successfully many times, most recently last August, with many successes and many failures; I would guess the same number of successes and of failures. Since then, that machine has received no CPDN work until this most recent batch. So it does not seem to be a memory size problem to me. I should be able to run one of these at a time, and my app_config.xml file only allows one at a time anyway. |
©2024 cpdn.org