New work Discussion
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
A batch of Africa models showed up a few hours ago, and now they're gone. There's over 200,000 tasks out there somewhere, so some people must have stockpiles. Which may be why some of these latest batches are showing up - the researchers aren't getting their data back. |
Send message Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
Hi Les, hi everyone. Would it be possible to change the Climate Prediction deadline in BOINC from a year to, say, 3 or 4 months, or would that cause problems for some crunchers? I have two computers. One crunches only Climate Prediction and the other crunches only SETI@home. I think the deadline at SETI@home is two months. Come to think of it, I guess there might be a problem for people who crunch Climate Prediction alongside other projects on one computer, but I'm not sure. Anyway, just a thought. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Hi Byron I asked about this some time ago, and apparently it's not going to change. This is to do with the second part of your question: It does affect those who are running lots of projects. But there is a down side for those people who take months to complete their tasks: if the researchers don't get their results in a reasonable time, they can just re-issue them (plus a few more perhaps), as a new batch, and then forget about the earlier unreturned tasks. |
Send message Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
if the researchers don't get their results in a reasonable time, they can just re-issue them (plus a few more perhaps), as a new batch, and then forget about the earlier unreturned tasks. Hi Les, I agree with you 100%. In the last few weeks I have come across many computers with big stockpiles: 200 tasks or more sitting on their hard drives since March or April, without a single Zip file uploaded for those tasks. What can be done to solve this? |
Send message Joined: 27 Jan 05 Posts: 74 Credit: 1,047,809 RAC: 0 |
I have an afr50 in the "ready to start" status while 3 others are crunching with as much as 2 days to completion. If this is common practice then that adds to the load of idle outstanding work, correct? This does not seem reasonable when there are no WUs available but other computers are looking for work. Is the afr50 an orphan from some rejected event? |
Send message Joined: 1 Sep 04 Posts: 161 Credit: 81,522,141 RAC: 1,164 |
Les - Is it possible to run some kind of script against the database and find all the tasks that were sent out at least 3 or 4 months ago and have no trickles? Then maybe re-issue them? |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,798,786 RAC: 5,264 |
Is the afr50 an orphan from some rejected event? The name of that AFR50 is wah2_afr50_a16w_201312_13_451_010735872_0. The trailing "_0" indicates it's the first model issued in that work unit. It might have to wait a few days but it stands a much better chance of completing on your PC than elsewhere. BOINC Manager allows users to build a work buffer to smooth out the supply of models: if the supply of models was smooth in itself then perhaps the buffers would empty. |
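As a rough illustration of the naming point above, the following Python sketch splits off the trailing index of a task name. It assumes only what the post states, namely that the final underscore-separated field is the issue index and that `_0` marks the first model issued for the work unit. The helper name `issue_index` is made up for this example, and no meaning is assumed for the other fields.

```python
def issue_index(task_name: str) -> int:
    """Return the trailing issue index of a task name (hypothetical helper)."""
    # rpartition splits on the last underscore only, leaving the rest intact.
    workunit, _, suffix = task_name.rpartition("_")
    if not workunit or not suffix.isdigit():
        raise ValueError(f"unexpected task name: {task_name!r}")
    return int(suffix)


name = "wah2_afr50_a16w_201312_13_451_010735872_0"
index = issue_index(name)
print("first issue" if index == 0 else f"later issue (index {index})")
```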
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
WB8ili and Byron This may not have an elegant solution. e.g. People may download a few models, then need to be away for a while (hospital, business trip), and have their computer turned off. Plus there's those who cycle through lots of projects, possibly on a slow computer, so they'd take a while too. I don't think the project people want to have to decide on a strategy for how long is acceptable. One of the things that physicists need to be good at is computer programming, so I'm sure that "our" people can and probably do create and run lots of scripts and searches for lots of reasons. But what I said before, about issuing a new batch, is what I'd do. Then it'd only be the hoarders who suffered by way of wasted power costs and bandwidth time. Not sure where to insert this, so I'll just write it here. I just thought: all of this is probably a mindset problem - aka "set and forget". And some people have to learn the hard way. And things like that. :) |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
John Having waiting tasks can be normal; it all depends on your number of processors and how your cache is set up. If you're set for say 10 days, but only have 4 cores, then any tasks beyond those 4 will go into a wait state. Or they'll swap between them, and take it in turn to run. I've left the first 2 Network usage options blank, so I only get a new task when one already running finishes. Just my way of doing things. When the new batch of afr50's showed up, I needed some more, so I got some more. But only one was an afr50; the rest were re-sends of other batches. So I've got a motley collection running. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
John Another thing. Your "waiting" task could just as well have ended up on a computer where it would sit for e.g. 6-8 months before starting, so waiting on your computer for a week or 2 isn't a bad thing. |
Send message Joined: 27 Jan 05 Posts: 74 Credit: 1,047,809 RAC: 0 |
Thank you Iain and Les. This Dell XPS has an i7 core but I am reluctant to monkey around with the workload because of temperature concerns. (Actually I admit that I don't know how to adjust core performance or overclocking.) The Extreme Tuning Utility shows temps of 80-85C and that is mildly alarming. I am more familiar with 60-70C on the older equipment. And according to Dell, the XPS clock speed is 1.90 GHz as a means of temp control, i.e. slower all-around performance but longer life. This machine is dedicated to CPDN so if you guys are not concerned about heat, I will try some adjustments to allow more cores and more work units. There are now 3 waiting to start. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Hi John, As a data point, I run an i7-3770 at 3.4GHz with six of the eight cores available to BOINC. These six cores run virtually 100% of the time crunching multiple projects. All these CPUs run between 69 and 71 degrees C with no problems and have been running this way for over 2 years. Art |
Send message Joined: 9 Oct 04 Posts: 82 Credit: 69,916,905 RAC: 9,226 |
@Bayliss Another spin on the long deadlines: I had to change a hard disk on one of my computers. I copied the whole ProgramData/BOINC/projects folder to a USB stick before the change, changed the disk, installed BOINC, connected to the project again, downloaded some new WUs, merged the two computers with the same name on the homepage of every project, and copied the old WUs from the USB stick to the corresponding folders under the projects folder. The old WUs are visible under the computer name on the homepage, but not on the computer itself (BOINC Manager). So if nothing happens, they will be reported until the deadline as “in process” but will never be processed. If someone could point out how to get them recognized by BOINC on this particular computer, I would be very grateful. If this also worked for transferring WUs from one computer to another, that would be even better, as I have some WUs on slow computers that I would like to move to faster ones. Finally, I also had a hard disk crash, so I lost all those WUs, but as the deadline is so long, they won't be reissued for a year. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It's necessary to start the copying from, and including, the boinc folder, because that contains lots of important info, such as client_state.xml, which is BOINC's "To Do" list. That's where BOINC stores information about tasks, where they're from, and where to send the results. Also in there is the folder slots, which sits alongside the projects folder. So without all of this, it's bye bye models. **************** Tasks are logged by the server as being sent to a particular computer, and getting the results back from a different computer can cause problems. Moving models to a faster computer has, in the past, caused the model to error out with something about a time limit being exceeded. It's been a long while, and I forget what the message said. |
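The advice above (copy the whole BOINC folder, not just projects) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper name `backup_boinc_folder` is invented here, the check list contains just the three items named in the post, and it is assumed that the BOINC client has been shut down before copying. The demo runs on a throwaway stand-in folder, not a real BOINC install.

```python
import shutil
import tempfile
from pathlib import Path

# Items the post names as living inside the BOINC data folder.
REQUIRED = ("client_state.xml", "projects", "slots")


def backup_boinc_folder(data_dir: Path, dest: Path) -> Path:
    """Copy the whole data folder (hypothetical helper); dest must not exist yet."""
    missing = [name for name in REQUIRED if not (data_dir / name).exists()]
    if missing:
        raise FileNotFoundError(f"{data_dir} is missing: {', '.join(missing)}")
    # copytree copies the folder and everything beneath it in one go.
    return Path(shutil.copytree(data_dir, dest))


# Demo on a fake folder layout created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "BOINC"
    (fake / "projects").mkdir(parents=True)
    (fake / "slots").mkdir()
    (fake / "client_state.xml").write_text("<client_state/>")
    copy = backup_boinc_folder(fake, Path(tmp) / "backup" / "BOINC")
    copied = sorted(p.name for p in copy.iterdir())
print(copied)
```

Refusing to copy when any of the three items is missing is a design choice of this sketch, meant to catch the case described earlier in the thread where only the projects folder was saved.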
Send message Joined: 27 Jan 05 Posts: 74 Credit: 1,047,809 RAC: 0 |
@ Art Masson Thanks for the info. The "Brand String" for the CPU on this Dell XPS is "i7-3517U, CPU @ 1.90GHz"; you may understand how that compares to your setup, but I don't. The curious thing here is that Intel shows the CPU at 1.90 GHz, but their monitor shows Max Core Frequency (operational, I assume) at 2.76 GHz. CPU utilization is very low, varying between 15% and 45%. So the CPU may respond to the demands of the CPDN programs and jump up as required (i.e. auto overclocking?), hence the heat, which is never below 67C. If I solve some of this I will pass it on, because your operating characteristics indicate that something is amiss here and I could be more useful to the cause. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,946,057 RAC: 13,930 |
You could try installing the Intel Extreme Tuning Utility. This can be configured to show which cores are being used, how much, and what temperatures the cores and whole package are getting up to. You can also tweak the CPU usage in BOINC Manager via Computing preferences in the options menu. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Hi John, Your Dell motherboard may restrict the CPU speed that you can achieve with your CPU. The Intel Extreme Tuning Utility may provide you some ways to "overclock" but I would advise you to be conservative on increasing the speed. The temperature your CPU runs at is not only a function of the load (% utilization and type of processing) you put on it, but is also constrained by how good a hardware design (i.e. heat sink design/effectiveness) Dell did for you in that machine. I would be careful not to push the utilization or CPU speed up so much that your CPU runs hotter than 85 degrees C or so, because running that hot is likely to cause an early CPU failure. It's better to have a reliable machine consistently crunching CPDN tasks than one that is pushed so hard you get errors -- that's why I've constrained my CPU utilization to 75% in BOINC, so that only six CPUs are available. I've found that if I push to more than 6 CPUs I get consistent but random CPDN work unit processing failures. Good luck!!! Art Masson |
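The arithmetic behind the 75% setting can be checked with a short Python sketch. The rounding rule used here (round down, but never below one CPU) is an assumption made for the example, not something stated in the thread, and the function name `usable_cpus` is hypothetical.

```python
import math


def usable_cpus(total_cpus: int, percent: float) -> int:
    """CPUs left for BOINC under a percentage cap (hypothetical helper)."""
    # Assumption: fractional CPUs are rounded down, with a floor of one.
    return max(1, math.floor(total_cpus * percent / 100))


# Eight logical CPUs capped at 75%, as described in the post above.
print(usable_cpus(8, 75))
```

For eight logical CPUs at 75% this gives six, which matches the figure quoted in the post.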
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
And to bring this back to topic, there's some more work. Weather At Home European region. |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,967,570 RAC: 21,693 |
Hopper now empty again but I suspect there is more on the way soon. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Anyone have any idea when new work will appear? I have a brand new computer and it's hungry to sink its teeth into some nice juicy Climate Models. It’s having to make do with backup projects. Please feed my computer. ;) |
©2024 cpdn.org