Message boards : Number crunching : Welcome back/checking if everything is working?
Author | Message |
---|---|
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
Hi guys, Les has contacted the project. Some cleaning up will be done, but probably not before some more work appears. That work will be part of the new season MSc programme, which should have work for both Windows and Linux machines in the next few weeks. (Not sure about Mac.) |
Joined: 17 Jan 05 Posts: 10 Credit: 23,525,643 RAC: 0 |
Should be OK now. Yes, they went through. Thank you. |
Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
I presume these will still require some 32-bit libs and not the full-blown 64-bit jobbies, for the Linux w/u. I had better make sure the Fedora 30 hard disk is plugged in. Assuming I am "lucky" enough to snare a w/u, that is. |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
I presume these will still require some 32-bit libs and not the full-blown 64-bit jobbies, for the Linux w/u. I had better make sure the Fedora 30 hard disk is plugged in. For Linux the tasks will be N216 HadAM4 tasks and HadCM3, so yes, do make sure the relevant 32-bit libraries for your distribution are installed. That said, the last testing HadAM4 tasks I ran on my new box, with a clean install of Xubuntu 20.04, ran without my installing them explicitly. I did go on and install the ones that were not there. It may have been that I installed what was needed for them while installing everything including the kitchen sink to enable me to compile BOINC from source. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,919,008 RAC: 6,904 |
There's quite a few batches of those, with only a small number left in each one. It would be great if some cleanup happens. I have had one orphaned Full Resolution Ocean in my "In progress" web tab since 2014, and it is set to expire in 2023. I'm almost there. |
Joined: 9 Dec 05 Posts: 118 Credit: 12,567,835 RAC: 1,448 |
There's quite a few batches of those, with only a small number left in each one. Can they do the wiping, given that they need to have all the old tasks available when counting the credits? So nothing can be purged? |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
Can they do the wiping, given that they need to have all the old tasks available when counting the credits? So nothing can be purged? If everything past the deadline was wiped, anything being wiped would, I guess, miss out on credits. That would make CPDN policy on this the same as most other projects, where credit is not granted after the deadline. My personal opinion (so not with my moderator hat on, nor expressing any views of the project) is that, while we have the very long deadlines currently in use, this would be no bad thing. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The answer is that the old data isn't going to be removed. So, just don't stare at those numbers for long periods. :) |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
The answer is that the old data isn't going to be removed. Shucks :) |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
The answer is that the old data isn't going to be removed. On the server status page, are those numbers real? Have people still got all those tasks, and have they not passed the deadline yet? And why do we have a year's deadline for tasks that take 1-3 weeks? Rosetta, for example, has a 3-day deadline for 8-hour tasks. Most projects have a 2-3 week deadline. I see no advantage to the project, the scientists, or the volunteers in letting people just store away tasks and never get round to doing them. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Some parts of the numbers are not real. They're leftovers from the old system we had, and are there for historical reasons. The project people are happy with the way that page is, so that's how it will stay. The 1 year "deadline", as has been pointed out many times, doesn't apply to the tasks; it's there to stop BOINC from having problems when computers are also running other projects, most of which have much shorter task run times. The deadline here is: ASAP! The project controls things by closing a batch when the researcher has enough data to work with. This prevents computers from returning more results, and from getting more credits. People learn sooner or later. |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Some parts of the numbers are not real. They're leftovers from the old system we had, and are there for historical reasons. Isn't the page to show the volunteers how much is available etc? It seems to serve no purpose at all, since the numbers bear no relation to anything. The 1 year "deadline", as has been pointed out many times, doesn't apply to the tasks; it's there to stop BOINC from having problems when computers are also running other projects, most of which have much shorter task run times. I see. I did think they were rather long deadlines and I'd get them done much quicker than that. So I guess it's to allow Boinc to pause your huge tasks to finish off one on another project that has to be done by tomorrow? Yip, I can see that happening right now, to some extent, with some Rosetta that are due shortly. Although Boinc isn't very good at managing multicore and singlecore tasks at the same time. I also run Primegrid, and Boinc has managed to get a computer running a 4 core Primegrid and 2 singlecore Rosettas on a 4 core CPU... You see, if it was sensible, it could see the Primegrid has oodles of time to finish and the two Rosettas are urgent, so just run the Rosettas and download two other single core tasks to keep it busy, leaving the Primegrid till later. The project controls things by closing a batch when the researcher has enough data to work with. I see it's trickling up partial results from my tasks and crediting me. And a couple of my tasks failed due to a computer restart which seems to corrupt something. So are the partial results useful? Will the remainder of that task be sent out as a retread, or the whole thing from the start? People learn sooner or later. So if someone leaves it too late to send it back, can the server tell their computer to abort? Or does it sit crunching for weeks pointlessly and without credit? If the latter, the user will probably never know unless they're keeping a very close eye on their credits. |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
I see it's trickling up partial results from my tasks and crediting me. And a couple of my tasks failed due to a computer restart which seems to corrupt something. So are the partial results useful? Will the remainder of that task be sent out as a retread, or the whole thing from the start? The whole task is sent out again. So if someone leaves it too late to send it back, can the server tell their computer to abort? Or does it sit crunching for weeks pointlessly and without credit? If the latter, the user will probably never know unless they're keeping a very close eye on their credits. The server can, but it doesn't. About the only time CPDN uses that feature of the BOINC server code is when there are serious problems with a batch. I don't have much sympathy for those who will miss out on the credits, as they will be people who don't look at their computers often enough to notice anyway. |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
I see it's trickling up partial results from my tasks and crediting me. And a couple of my tasks failed due to a computer restart which seems to corrupt something. So are the partial results useful? Will the remainder of that task be sent out as a retread, or the whole thing from the start? The whole task is sent out again. That's a shame. If my computer can store a checkpoint and continue after switching tasks, running an exclusive application, or restarting the computer, can't that be used to tell someone else's PC how to continue? Or is the checkpoint CPU-specific? So if someone leaves it too late to send it back, can the server tell their computer to abort? Or does it sit crunching for weeks pointlessly and without credit? If the latter, the user will probably never know unless they're keeping a very close eye on their credits. The server can, but it doesn't. About the only time CPDN uses that feature of the BOINC server code is when there are serious problems with a batch. I don't have much sympathy for those who will miss out on the credits, as they will be people who don't look at their computers often enough to notice anyway. Why not send out a cancel message? You're just wasting the CPU time on someone's computer, doing work that will never be used. The trouble is your 1 year deadline is making my Boinc client put them on the back burner and do other tasks with shorter deadlines. I've had to manually suspend Primegrid tasks to let yours continue. Mind you that's also the fault of Boinc/Primegrid for giving me 3 weeks of processing to do when my buffer is set to 3 hours! |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
That's a shame. If my computer can store a checkpoint and continue after switching tasks, running an exclusive application, or restarting the computer, can't that be used to tell someone else's PC how to continue? Or is the checkpoint CPU-specific? CPDN tasks are a bit strange in that the same task, if sent to two different computers that each complete it, may not produce exactly the same data. Statistical methods are used to determine which results are useful and which are not. There can be differences between AMD and Intel processors, and even differences between different CPUs by the same manufacturer. There used to be tasks that would go out to both Windows and Linux machines, but this was stopped in order to reduce the number of variables. Having a task start on one machine and finish on another could cause more problems. Another issue is that the code for these tasks is proprietary to the Met Office, and the license Oxford has from them doesn't let them mess about with it to any great extent, so they would have to write their own code to interface with that from the Met Office to produce the partial tasks. |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
That's a shame. If my computer can store a checkpoint and continue after switching tasks, running an exclusive application, or restarting the computer, can't that be used to tell someone else's PC how to continue? Or is the checkpoint CPU-specific? That's not unique to CPDN. I've seen the same problem at other projects, specifically with GPU and CPU versions of the same task. I think it was a programmer at WCG, working on a GPU version (which they don't currently have), who was saying there was an additional 6% error margin on the GPU version. I always thought computers were precise, so I don't understand how a program can give an approximate answer! I guess it's compounding of rounding errors? |
Joined: 9 Dec 05 Posts: 118 Credit: 12,567,835 RAC: 1,448 |
The trouble is your 1 year deadline is making my Boinc client put them on the back burner and do other tasks with shorter deadlines. I've had to manually suspend Primegrid tasks to let yours continue. Mind you that's also the fault of Boinc/Primegrid for giving me 3 weeks of processing to do when my buffer is set to 3 hours! Boinc is designed to run tasks in FIFO order so that shorter deadline tasks don't take over the resources (CPU/GPU). The only exception is if Boinc thinks that a task is going to miss the deadline; then that task is expedited. But this happens only about 1 to 1½ days before the deadline. |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
The trouble is your 1 year deadline is making my Boinc client put them on the back burner and do other tasks with shorter deadlines. I've had to manually suspend Primegrid tasks to let yours continue. Mind you that's also the fault of Boinc/Primegrid for giving me 3 weeks of processing to do when my buffer is set to 3 hours! That's not what I've seen. It does FIFO order within each project, but it uses what they call "short term debt" to decide whether to take a task from project A or project B if you have several tasks queued for each. You can see it happen if you change the weighting of a project: Boinc tries to meet that new weighting by only doing the tasks it has for the higher weighting project. As for the panic mode, it always does that slightly too late! It's approximately at the time it needs to complete it, e.g. if a task needs 5 hours to run, it will start it 6 hours before the deadline, which is no good if the computer is turned off or plays a game! Anyway, mine is in panic mode because Primegrid gave me too much work to do. So Boinc has correctly assumed that CPDN doesn't need them back for a year, and so was only doing Primegrid until I intervened. I really can't see the problem in changing the deadline to, say, 1 month (or whatever is long enough for most people to be able to do them). |
Joined: 15 May 09 Posts: 4552 Credit: 19,039,635 RAC: 18,944 |
How projects play together is something I know little about, because I only run CPDN tasks except when none are available, so my knowledge of it is nearly all from reading posts here and on the BOINC fora. |
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
How projects play together is something I know little about, because I only run CPDN tasks except when none are available, so my knowledge of it is nearly all from reading posts here and on the BOINC fora. Do you manage to have CPDN running non-stop? Is there enough Linux work to keep it busy? Or do you just let the computer doze off in between? I like my 66 CPU cores and 4 GPUs to be doing something all the time. My wallet does not. |
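
On the question raised earlier in the thread of how a program can give an approximate answer, and why the same task can differ between CPUs: here is a toy sketch in Python, not CPDN or Met Office code. It assumes nothing about the real models. The logistic map and the helper name `logistic_trajectory` are stand-ins chosen purely for illustration. It shows two things: floating-point addition gives different last digits depending on evaluation order, and in a chaotic calculation a difference that small grows until the two runs no longer resemble each other.

```python
# Toy illustration only: the logistic map stands in for "a chaotic model".
# It is NOT the climate model code discussed in the thread.

def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the chaotic logistic map x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# 1. Floating-point addition is not associative, so the order in which a
#    sum is evaluated can change the last digits of the result.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first == right_first)  # False

# 2. In a chaotic system, a difference of that size keeps growing.
a = logistic_trajectory(0.3, 200)
b = logistic_trajectory(0.3 + 1e-15, 200)
gaps = [abs(p - q) for p, q in zip(a, b)]
first_big = next(i for i, g in enumerate(gaps) if g > 0.1)
print(f"starting gap {gaps[0]:.1e}, gap exceeds 0.1 at step {first_big}")
```

Neither run is "wrong" here; each is a valid evaluation of the same formula. That is roughly the situation described above, where results from different machines are compared statistically rather than bit for bit.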
©2025 cpdn.org