Thread 'Lost tasks'

Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,029,761
RAC: 14,491
Message 64914 - Posted: 6 Jan 2022, 23:26:43 UTC

Due to a moment of absolute madness last week, I managed to totally delete the Win10 operating system on my main machine and with it the VM installations running Ubuntu - and hence 6 recently started tasks!! (:-. I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year. Hope that nobody needs the results in the meantime. Surely with today's faster computers the "expiry date" should be more in the region of 3 months rather than the 10 to 12 that seems to be standard.
The upside of this error is that the computer now has Ubuntu installed as the OS, which has so far proved to be more stable than the VM, which kept "freezing". The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible.
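As a rough sanity check on those numbers, here is a minimal Python sketch of a straight-line projection from the progress so far. It assumes the tasks keep running at the same average rate, which CPDN tasks may not do. The function names are illustrative and are not part of BOINC.

```python
def projected_total_days(elapsed_days, fraction_done):
    """Total run time if progress continues at the same average rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be greater than 0 and at most 1")
    return elapsed_days / fraction_done


def projected_remaining_days(elapsed_days, fraction_done):
    """Days still to go under the same constant-rate assumption."""
    return projected_total_days(elapsed_days, fraction_done) - elapsed_days


# Figures from the post: six days elapsed, roughly 50% complete.
total = projected_total_days(6, 0.5)
remaining = projected_remaining_days(6, 0.5)
print(f"projected total: {total:.0f} days, remaining: {remaining:.0f} days")
```

With about 50% done after six days, this simple projection suggests roughly six more days, far less than the 35 days BOINC was showing.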
ID: 64914
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 64915 - Posted: 6 Jan 2022, 23:42:58 UTC - in response to Message 64914.  
Last modified: 6 Jan 2022, 23:43:22 UTC

I agree, one year seems way too long and three months much more reasonable.

As for the time estimation for the tasks, run a benchmark, and at least the next tasks that are downloaded will have a much better guess at time to complete.
ID: 64915
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 64921 - Posted: 7 Jan 2022, 15:15:33 UTC

Sadly, it has been suggested several times to the project that they reduce the length of deadlines. I am not sure what the reluctance is about.
ID: 64921
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 64924 - Posted: 7 Jan 2022, 21:07:59 UTC - in response to Message 64921.  

Sadly, it has been suggested several times to the project that they reduce the length of deadlines. I am not sure what the reluctance is about.

I attributed it once to the Medieval timescale. I don't think it was accepted formally.
ID: 64924
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 64969 - Posted: 14 Jan 2022, 18:27:26 UTC

I just got a WU from batch 891. On its first attempt it spent a year with No response (Jan 2021 to Jan 2022). 5 attempts are allocated to this batch, so in the worst-case scenario one could expect 5 years in vain. On its second attempt it errored in seconds (I guess missing 64-bit libraries). On its 3rd attempt one of my machines got it and it's been computing fine. I'm not waiting for someone to tell me that this batch is closed and I should abort. While climate change is accelerating, keeping the one-year deadline period is a kind of climate change denial. The community here has been asking for a shortened period for ages already. How hard is that to change, and why are we ignored even on the most common-sense suggestions?

I even have a ghost task in progress from Jan 2014 with a deadline in Jul 2023 - so I'm close.
ID: 64969
rfbrooks
Joined: 31 Aug 04
Posts: 10
Credit: 6,886,671
RAC: 15,722
Message 65742 - Posted: 3 Aug 2022, 18:11:18 UTC

Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning; I just want to start working again.
ID: 65742
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 65743 - Posted: 3 Aug 2022, 18:19:54 UTC - in response to Message 65742.  

Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning; I just want to start working again.
If there is a way, none of the mods have discovered it yet :(
ID: 65743
rfbrooks
Joined: 31 Aug 04
Posts: 10
Credit: 6,886,671
RAC: 15,722
Message 65744 - Posted: 3 Aug 2022, 19:18:30 UTC - in response to Message 65743.  

I was afraid of that. Thanks for the quick response!
ID: 65744
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 65745 - Posted: 4 Aug 2022, 2:28:14 UTC - in response to Message 64914.  

I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year.


Correct.

Hope that nobody needs the results in the meantime.


That's why they submit so many thousands of them. A ton get chewed up, an hour at a time, by machines without the right libraries.

Surely with today's faster computers the "expiry date" should be more in the region of 3 months rather than the 10 to 12 that seems to be standard.


Maybe 4-6 months? ;) Winter is a rough time in my office for compute if I still have stuff going - I can get 6-8h/day, but often only on one computer, and I try very hard not to run compute tasks on generator. But I'm weird and I know it.

The upside of this error is that the computer now has Ubuntu installed as the OS, which has so far proved to be more stable than the VM, which kept "freezing".


Yeah, Windows sucks.

The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible.


Beats the 105 days estimated for the N216 tasks. That's just the default performance estimate; run the benchmark before you add the project if you care. I've got half my tasks estimated like that, and it doesn't impact performance.

Somewhat like the OP, I managed to lose the VM on my Windows 10 machine and lost 5 In-Progress tasks. Is there a way to re-download the tasks so I can start working on them again? I don't care if I lose any pending work on them and have to start from the beginning; I just want to start working again.


Nothing I know of either.
ID: 65745
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 65746 - Posted: 4 Aug 2022, 3:55:43 UTC - in response to Message 65742.  

Is there a way to re-download the tasks so I can start working on them again?


The way BOINC works is that a computer gets one chance at a task.
If it fails, for any of many reasons, then the task is allocated to the next computer waiting for work.
ID: 65746
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,716,561
RAC: 8,355
Message 65747 - Posted: 4 Aug 2022, 6:56:57 UTC - in response to Message 65746.  

The way BOINC works is that a computer gets one chance at a task.
If it fails, for any of many reasons, then the task is allocated to the next computer waiting for work.
Tasks which have actually failed, and been reported as failures, can't ever be recovered from the server.

But BOINC does actually have a mechanism to 'resend lost tasks' - tasks which had previously been issued, but have vanished into cyberspace without trace.

Tasks can only be resent to the exact same computer which lost them in the first place - in particular, a computer with the same HostID as the original. That's possible in the case of the vanished VM, but it requires you to negotiate some extra security hurdles.

And the resending of lost tasks has to be deliberately switched on by the administrators of the project's servers. CPDN may feel that it's a complication too far.
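
The idea behind resending lost tasks can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the task names, host IDs and function names are hypothetical, and the real scheduler logic is more involved. As far as I know, the server-side switch is the resend_lost_results option in the project's config.xml.

```python
def find_lost_tasks(server_in_progress, client_reported):
    """Tasks the server thinks the host holds but the host no longer lists."""
    return set(server_in_progress) - set(client_reported)


def tasks_to_resend(server_host_id, client_host_id,
                    server_in_progress, client_reported, resend_enabled):
    """Return the tasks a server could resend in this toy model."""
    if not resend_enabled:
        # The project administrators have not switched the feature on.
        return set()
    if server_host_id != client_host_id:
        # Only the exact same host (same HostID) can get its tasks back.
        return set()
    return find_lost_tasks(server_in_progress, client_reported)


# Hypothetical example: host 1001 lost two of its three tasks.
issued = {"task_a", "task_b", "task_c"}
still_on_disk = {"task_b"}
print(sorted(tasks_to_resend(1001, 1001, issued, still_on_disk, True)))
```

A rebuilt machine that comes back with a different HostID, or a project with the feature switched off, gets nothing back in this model, which matches the two conditions described above.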
ID: 65747
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 65748 - Posted: 4 Aug 2022, 9:43:59 UTC - in response to Message 65747.  

Thanks, Richard. That is a useful piece of information. I guess if I had read through all the documentation on the server side I would have seen that, though whether I would have retained the information is another matter!
ID: 65748
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,716,561
RAC: 8,355
Message 65749 - Posted: 4 Aug 2022, 9:56:32 UTC - in response to Message 65748.  

ID: 65749
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 65750 - Posted: 4 Aug 2022, 10:49:42 UTC - in response to Message 64914.  

The initial downside is that BOINC then decided that the estimated completion time for the four downloaded tasks was 64 days! That was six days ago - and those tasks are now nearing 50% completion with a small reduction in remaining time to 35 days. Hopefully when the first of these tasks finishes the times will be more sensible.


In the BOINC client display there is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks?
ID: 65750
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 65751 - Posted: 4 Aug 2022, 12:36:47 UTC

In the BOINC client display there is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks?
Yes, it should do.
ID: 65751
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 65752 - Posted: 4 Aug 2022, 17:12:13 UTC - in response to Message 65750.  
Last modified: 4 Aug 2022, 17:13:24 UTC

In the BOINC client display there is a menu bar with several drop-down menus. Under Tools is an item, Run CPU benchmarks. Would this not help immediately for future tasks?


It will only apply to new tasks downloaded after that point. It doesn't change the estimate for anything already downloaded, even if you've not started the job.

... not that they have any reflection on reality anyway if you have a lot of CPDN tasks going. They fight for memory and cache and run rather slower than predicted. I've also noticed that the absolute number of instructions per second retired will drop if you get too many of them going on a single CPU. I'm running 8 tasks on my 12C/24T 3900Xs right now because that's where performance seems to peak for the system.
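
To illustrate why a benchmark changes only the estimates of later downloads, here is a simplified Python sketch. It assumes the estimate is just the task's estimated floating-point operation count divided by the host's benchmark speed, fixed at download time. The real BOINC client applies further corrections, and all the names and numbers below are made up.

```python
SECONDS_PER_DAY = 86400.0


def estimated_days(fpops_est, host_flops):
    """Estimated run time in days for a task of fpops_est operations."""
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    return fpops_est / host_flops / SECONDS_PER_DAY


class Host:
    """Toy host that fixes each task's estimate when it is downloaded."""

    def __init__(self, flops):
        self.flops = flops      # speed used for new estimates
        self.estimates = {}     # task name -> estimated days

    def download(self, name, fpops_est):
        self.estimates[name] = estimated_days(fpops_est, self.flops)

    def run_benchmark(self, measured_flops):
        # Updates the speed for future downloads only;
        # estimates already stored are left untouched.
        self.flops = measured_flops


host = Host(flops=1e9)              # assumed default speed guess
host.download("task_a", 8.64e14)    # estimated with the default guess
host.run_benchmark(4e9)             # assumed benchmark result
host.download("task_b", 8.64e14)    # estimated with the measured speed
print(host.estimates)
```

In this model the first task keeps its pessimistic estimate for its whole life, while an identical task downloaded after the benchmark is estimated at a quarter of the time.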
ID: 65752
Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,029,761
RAC: 14,491
Message 65753 - Posted: 4 Aug 2022, 22:41:23 UTC - in response to Message 65745.  

I am assuming that these incomplete tasks will now sit on the server until their expiry date, which is November this year.


I vaguely remember somebody saying that if you detach from the project and then re-attach, the task gets marked as abandoned and will therefore be issued sooner than the twelve-month cutoff date. But I may be wrong :(.
ID: 65753
AndreyOR
Joined: 12 Apr 21
Posts: 317
Credit: 14,884,880
RAC: 19,188
Message 65754 - Posted: 5 Aug 2022, 6:47:49 UTC
Last modified: 5 Aug 2022, 6:48:14 UTC

rfbrooks,
If you want to release those tasks, you can; you just won't be able to get them back yourself (see the end of Richard's post). Alan is right: if you detach from the project, the tasks will get marked as Abandoned and be released to others.

To do this:
1. Install a new BOINC instance, making sure that the Name, CPU, and Operating System stats are identical to those of the one that was on the lost VM.
2. Attach CPDN to the new BOINC. Make sure that you do this on a different date from the Last Contact date of the lost BOINC, remembering that the time is in UTC (by now this point shouldn't be an issue, though).
3. Use the Merge Computers by Name option from the Your Computers page on the website. This will merge the lost and new BOINC IDs.
4. Once the merge happens, detach the project from BOINC, which will mark the tasks as Abandoned and release them to others. You can confirm this by checking the Tasks page.
5. Wait until the day after the Last Contact date, then reattach the project.
6. Perform the merge again, as reattaching will get you a new host ID.
7. Wait for new tasks.
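
The UTC date condition in the procedure above is easy to get wrong from a local clock, so here is a small Python sketch of the check. It only compares calendar dates in UTC. The function name is hypothetical, and the Last Contact value is something you would copy by hand from the website.

```python
from datetime import datetime, timedelta, timezone


def is_different_utc_date(last_contact, now=None):
    """True if 'now' falls on a different UTC calendar date than last_contact.

    Both values must be timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if last_contact.tzinfo is None or now.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    last_date = last_contact.astimezone(timezone.utc).date()
    now_date = now.astimezone(timezone.utc).date()
    return now_date != last_date


# Hypothetical Last Contact of 3 Aug 2022, 18:00 UTC.
last = datetime(2022, 8, 3, 18, 0, tzinfo=timezone.utc)

# 17:30 local time on 3 Aug in a UTC-7 zone is already 4 Aug in UTC.
local_zone = timezone(timedelta(hours=-7))
local_now = datetime(2022, 8, 3, 17, 30, tzinfo=local_zone)
print(is_different_utc_date(last, local_now))
```

The example shows the case where the local calendar still says the same day but the UTC date has already rolled over.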

As a Windows user, I highly recommend you look into WSL2 instead of traditional VMs for BOINC. It uses far fewer resources and is more stable. Unless, of course, you have other reasons to keep VBox. WSL2 and VBox don't work together well, if at all, so you'd have to switch.
ID: 65754
rfbrooks
Joined: 31 Aug 04
Posts: 10
Credit: 6,886,671
RAC: 15,722
Message 65807 - Posted: 11 Aug 2022, 22:12:54 UTC - in response to Message 65754.  

Thanks for the info, AndreyOR. I was out of town for the past week, so I'm just getting back to it now. I'll let you know if I run into any issues.

BTW - I'm running Hyper-V Manager in Windows, not VBox. I'll look at WSL2 and see if that works for me.
ID: 65807
AndreyOR
Joined: 12 Apr 21
Posts: 317
Credit: 14,884,880
RAC: 19,188
Message 65812 - Posted: 12 Aug 2022, 7:10:23 UTC - in response to Message 65807.  

I assumed you were using VBox because it said on the site that you had it installed. How did you lose a Hyper-V VM? Was it a case of a disappearing VM hard disk or something else? I used to use Hyper-V (before learning about WSL2) and had VHDs disappear.
ID: 65812