Message boards : Number crunching : Lost tasks
Author | Message |
---|---|
Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
Due to a moment of absolute madness last week I managed to totally delete the Win10 operating system on my main machine and with it the VM installations running Ubuntu - and hence 6 recently started tasks!! (:-. I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year. Hope that nobody needs the results in the meantime. Surely with today's faster computers the "expiry date" should be more in the region of 3 months rather than the 10 to 12 that seems to be standard. The upside of this error is that the computer now has Ubuntu installed as the OS, which has so far proved to be more stable than the VM, which kept "freezing". The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
I agree, one year seems way too long and three months much more reasonable. As for the time estimation for the tasks, run a benchmark, and at least the next tasks that are downloaded will have a much better guess at time to complete. |
Joined: 15 May 09 Posts: 4538 Credit: 19,004,017 RAC: 21,574 |
Sadly, it has been suggested several times to the project that they reduce the length of deadlines. I am not sure what the reluctance is about. |
Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
[quote]Sadly, it has been suggested several times to the project that they reduce the length of deadlines. I am not sure what the reluctance is about.[/quote] I attributed it once to the Medieval timescale. I don't think it was accepted formally. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
I just got a WU from batch 891. On its first attempt it spent a year with No response (Jan 2021 to Jan 2022). 5 attempts are allocated to this batch, so one could expect 5 years in vain in the worst-case scenario. On its second attempt it errored in seconds (I guess missing 64-bit libraries). On its 3rd attempt one of my machines got it and it's been computing fine. I'm not waiting for someone to tell me that this batch is closed and I should abort. While climate change is accelerating, keeping the one-year deadline period is a kind of climate change denial. I mean, the community here has been asking for a shortened period for ages already. How hard is that to change, and why are we ignored even on the most common-sense suggestions? I even have a ghost task in progress from Jan 2014 with a deadline in Jul 2023 - so I'm close. |
Joined: 31 Aug 04 Posts: 10 Credit: 6,828,623 RAC: 15,684 |
Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning, I just want to start working again. |
Joined: 15 May 09 Posts: 4538 Credit: 19,004,017 RAC: 21,574 |
[quote]Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning, I just want to start working again.[/quote] If there is a way, none of the mods have discovered it yet :( |
Joined: 31 Aug 04 Posts: 10 Credit: 6,828,623 RAC: 15,684 |
I was afraid of that. Thanks for the quick response! |
Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
[quote]I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year.[/quote] Correct. [quote]Hope that nobody needs the results in the meantime.[/quote] That's why they submit so many thousands of them. A ton get chewed up, an hour at a time, by machines without the right libraries. [quote]Surely with today's faster computers the "expiry date" should be more in the region of 3 months rather than the 10 to 12 that seems to be standard.[/quote] Maybe 4-6 months? ;) Winter is a rough time in my office for compute if I still have stuff going - I can get 6-8h/day, but often only on one computer, and I try very hard not to run compute tasks on generator. But I'm weird and I know it. [quote]The upside of this error is that the computer now has Ubuntu installed as the OS, which has so far proved to be more stable than the VM, which kept "freezing".[/quote] Yeah, Windows sucks. [quote]The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible.[/quote] Beats the 105 days estimated for the N216 tasks. That's default performance, just... is a thing, run the benchmark before you add the project if you care. I've got half my tasks estimated like that, it doesn't impact performance. [quote]Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning, I just want to start working again.[/quote] Nothing I know of either. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
[quote]Is there a way to re-download the tasks so I can start working on them again?[/quote] The way that BOINC works is that a computer gets one chance at it. If it fails, for any of many reasons, then it's allocated to the next computer waiting for work. |
Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
[quote]The way that BOINC works is that a computer gets one chance at it.[/quote] Tasks which have actually failed, and been reported as failures, can't ever be recovered from the server. But BOINC does actually have a mechanism to 'resend lost tasks' - tasks which had previously been issued, but have vanished into cyberspace without trace. Tasks can only be resent to the exact same computer which lost them in the first place - in particular, a computer with the same HostID as the original. That's possible in the case of the vanished VM, but it requires you to negotiate some extra security hurdles. And the resending of lost tasks has to be deliberately switched on by the administrators of the project's servers. CPDN may feel that it's a complication too far. |
Joined: 15 May 09 Posts: 4538 Credit: 19,004,017 RAC: 21,574 |
Thanks Richard. That is a useful piece of information. I guess if I had read through all the documentation on the server side I would have seen that though whether I would have retained the information is another matter! |
Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
Add this to your reference list! https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Jobretransmission |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
[quote]The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible.[/quote] In the BOINC client display is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks? |
Joined: 15 May 09 Posts: 4538 Credit: 19,004,017 RAC: 21,574 |
[quote]In the BOINC client display is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks?[/quote] Yes, it should do. |
Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
[quote]In the BOINC client display is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks?[/quote] It will only apply to new tasks downloaded after that point. It doesn't change the estimate for anything already downloaded, even if you've not started the job. ... not that they have any reflection on reality anyway if you have a lot of CPDN tasks going. They fight for memory and cache and run rather slower than predicted. I've also noticed that the absolute number of instructions per second retired will drop if you get too many of them going on a single CPU. I'm running 8 tasks on my 12C/24T 3900Xs right now because that's where performance seems to peak for the system. |
Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
[quote]I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year.[/quote] I vaguely remember somebody saying that if you detach from the project and then reattach, the task gets marked as abandoned and will therefore be reissued sooner than the twelve month cutoff date. But I may be wrong :(. |
Joined: 12 Apr 21 Posts: 317 Credit: 14,819,420 RAC: 19,777 |
rfbrooks, if you want to release those tasks, you can; you just won't be able to get them back yourself (see the end of Richard's post). Alan is right: if you detach from the project, the tasks will get marked as Abandoned and be released to others. To do this you need to install a new BOINC instance, making sure that the Name, CPU, and Operating System stats are identical to those of the one that was on the lost VM. Attach CPDN to the new BOINC. Make sure that you do this on a different date from the Last Contact date of the lost BOINC, remembering that time is in UTC (by now this point shouldn't be an issue though). Once you do that, use the Merge Computers by Name option from the Your Computers page on the website. This will merge the lost and new BOINC IDs. Once the merge happens, detach the project from BOINC, which will mark the tasks as Abandoned and release them to others. You can confirm this by checking the Tasks page. Wait until the day after the Last Contact date and reattach the project. Perform the merge again, as reattaching will get you a new host ID. Wait for new tasks. As a Windows user, I highly recommend you look into WSL2 instead of traditional VMs for BOINC. It uses far fewer resources and is more stable. Unless, of course, you have other reasons to keep VBox. WSL2 and VBox don't work together well, if at all, so you'd have to switch. |
Joined: 31 Aug 04 Posts: 10 Credit: 6,828,623 RAC: 15,684 |
Thanks for the info, AndreyOR. I was out-of-town for the past week so I'm just getting back to it now. I'll let you know if I run into any issues. BTW - I'm running Hyper-V Manager in Windows, not VBox. I'll look at WSL2 and see if that works for me. |
Joined: 12 Apr 21 Posts: 317 Credit: 14,819,420 RAC: 19,777 |
I assumed you were using VBox because it said on the site that you had it installed. How did you lose a Hyper-V VM? Was it a case of a disappearing VM hard disk or something else? I used to use Hyper-V (before learning about WSL2) and had VHDs disappear. |
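
A couple of the numbers in this thread are easy to sanity-check. The sketch below is only back-of-envelope arithmetic, not how BOINC itself computes its estimates. It extrapolates remaining run time linearly from progress so far (six days elapsed at roughly 50% done, as in the first post), and it multiplies a deadline by the number of allowed attempts to get the worst case described for batch 891. The function names are made up for illustration, and treating "12 months" as 365 days and "3 months" as 90 days is an assumption.

```python
def linear_remaining_days(elapsed_days: float, fraction_done: float) -> float:
    """Remaining run time if the task keeps its average rate of progress so far.

    This is a plain linear extrapolation, NOT the estimator the BOINC client uses.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    projected_total = elapsed_days / fraction_done
    return projected_total - elapsed_days


def worst_case_turnaround_days(deadline_days: float, max_attempts: int) -> float:
    """Longest possible wait if every allowed attempt runs to its deadline unreturned."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return deadline_days * max_attempts


if __name__ == "__main__":
    # Figures from the thread: 6 days elapsed, "nearing 50%" done (0.5 assumed).
    print(f"linear guess, days remaining: {linear_remaining_days(6, 0.5):.1f}")

    # 5 attempts per work unit, as reported for batch 891.
    # 365 and 90 days stand in for the 12-month and 3-month deadlines (assumed).
    for deadline in (365, 90):
        worst = worst_case_turnaround_days(deadline, 5)
        print(f"deadline {deadline} days, 5 attempts: up to {worst:.0f} days")
```

With the thread's figures the linear guess is about 6 more days, against the 35 days the client was showing. Five attempts at a 365-day deadline can tie a work unit up for as long as 1825 days, versus 450 days at a 90-day deadline.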
©2024 cpdn.org