Ghost work units?
Author | Message |
---|---|
Joined: 16 Oct 04 Posts: 4 Credit: 2,012,373 RAC: 0 |
Noticed tasks listed for my computer (1420079) that are not in my work queue. Does anybody have an idea how or why? Task / Workunit: 20298856 / 10955173, 22 Feb 2017, 17:40:15 UTC, 4 Feb 2018; 20298576 / 10884065, 22 Feb 2017, 17:38:32 UTC, 4 Feb 2018; 20298733 / 10924236, 22 Feb 2017, 17:37:41 UTC, 4 Feb 2018; 20295494 / 10885567, 22 Feb 2017, 17:36:48 UTC, 4 Feb 2018; 20278213 / 10876852, 22 Feb 2017, 17:36:48 UTC, 4 Feb 2018; 20292607 / 10964113, 22 Feb 2017, 17:36:48 UTC, 4 Feb 2018. Terrible T |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Probably "phantom" tasks. This happens now and then when there's some sort of overload on the Oxford servers. |
Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
See the thread below (Orphaned Work Units). I have a number of WUs that CPDN believes are active but that are not visible in my BOINC application. My best explanation is that there is a BOINC failure mode on my Intel i7 machine in which processing fails and the data and WU information are deleted from my machine, but no error report is sent to CPDN. As a result, CPDN continues to believe the WU is being processed when in fact it is not, and the WU is not resent to another user until the 1 year time limit on processing has passed. Art Masson |
Joined: 19 Sep 04 Posts: 92 Credit: 2,009,980 RAC: 298 |
And a few weeks ago, when the trickles were missing, I had a WU that finished and reported with an OK from the servers (in the BOINC Manager log), but it still shows as in progress... At least now all the trickles appear. Professor Desty Nova Researching Karma the Hard Way |
Joined: 18 Jun 17 Posts: 18 Credit: 9,384,598 RAC: 45,264 |
Noticed that I have some ghost/phantom tasks (i.e. tasks showing up on the server as "in progress") but nothing on my PC. I guess I'll just have to let them expire in about 12 months' time. Let me know if there is a way to recover these tasks or, if not, to recycle them back to other volunteers. Let me know whom I can PM the list of ghost tasks to, if needed. These are the tasks that start with wah2_eas25*, so they are likely to end up with errors, from what I've seen on a few of them. Cheers. |
Joined: 7 Aug 04 Posts: 2185 Credit: 64,822,615 RAC: 5,275 |
"Noticed that I have some ghost/phantom tasks (i.e. tasks showing up in the server as 'in progress') but nothing on my PC." If you detach the PC associated with these tasks, their status will go to "Abandoned" and the next task from that work unit will be ready to send out (assuming yours wasn't the last task in that work unit). You can then reattach. Edit: do this once you have no CPDN tasks currently running. |
Joined: 18 Jun 17 Posts: 18 Credit: 9,384,598 RAC: 45,264 |
Thanks, but my impression is that detaching the project will cause the task to be abandoned AND that it will only be recycled when the task expires, which is in about a year for CPDN tasks. At least, that is what I understood from the PrimeGrid forum when I participated in their challenges: http://www.primegrid.com/forum_thread.php?id=10277&nowrap=true#163886 I also read an older thread that says the same thing: https://www.cpdn.org/forum_thread.php?id=8585#57815 From the PrimeGrid forum: Important reminders: |
Joined: 7 Aug 04 Posts: 2185 Credit: 64,822,615 RAC: 5,275 |
Back in earlier days, the CPDN server sometimes had trouble keeping up, especially when it was running the weekly credit script. So, occasionally, if tasks were reported during that credit run, the completion status was not logged/stored correctly. For example, 4 tasks (marked Abandoned when I detached) were sent to one of my computers on May 22 2020: https://www.cpdn.org/results.php?hostid=1492829&offset=160&show_names=0&state=0&appid=33 They sent in all 4 trickles and reported to the server on May 27 that they had completed successfully. However, the server did not record the completion report, so those tasks were no longer on my PC but were "In progress" according to the task status on the server. When sent to my computer, these tasks had a deadline of May 4 2021. On June 4 2020 I detached that computer from climateprediction, which is when the BOINC server marked those tasks as "Abandoned". Looking at one of those work units https://www.cpdn.org/workunit.php?wuid=12017682 you can see that the next task from it was sent out on June 4 2020 to a computer that completed that task. So it did not wait until the deadline; the next task from that work unit was sent out immediately after abandonment. If you do an advanced search on the number crunching forum with a search limit of "no limit" and the keyword detach or abandoned, you will find some replies by WaterOakley, who is a sharp BOINC user, recommending the same method for tasks that the server lists as in progress for a PC but that are not in the BOINC Manager task list for that PC. |
Joined: 29 Oct 17 Posts: 1048 Credit: 16,391,077 RAC: 15,319 |
I had some hardware fail with a task running and had to rebuild/reinstall the OS. In this scenario there's nothing that can be done for the 'In Progress' task still showing on a machine that no longer exists. It's not possible to cancel the task via the web account, which is somewhat irritating given the long timeout of the Hadley model batches. In practice though, CPDN will usually close a batch once they get >90% returns, so late resends are not important to the project. |
Joined: 18 Jun 17 Posts: 18 Credit: 9,384,598 RAC: 45,264 |
@geophi, thanks for the clarification. I'll detach the project. |
Joined: 18 Jun 17 Posts: 18 Credit: 9,384,598 RAC: 45,264 |
Just completed the last WU this morning on this rig and detached the project. That doesn't seem to change the status from "in progress" to "abandoned" after re-attaching the project. Maybe wait for another day? Here is my rig https://www.cpdn.org/results.php?hostid=1542213. WU#12219305 is the ghost task. I detached and re-attached CPDN twice this morning without shutting down the BOINC client, with no success. Just now, I detached the project, shut down the client, restarted it and re-attached the project, but the result is still the same.
7/30/2023 8:06:59 AM | climateprediction.net | Resetting project
7/30/2023 8:06:59 AM | climateprediction.net | Detaching from project
7/30/2023 8:07:28 AM | | Fetching configuration file from https://climateprediction.net/get_project_config.php
7/30/2023 8:07:52 AM | climateprediction.net | Fetching scheduler list
7/30/2023 8:07:58 AM | climateprediction.net | Master file download succeeded
7/30/2023 8:08:03 AM | climateprediction.net | Sending scheduler request: Project initialization.
7/30/2023 8:08:03 AM | climateprediction.net | Requesting new tasks for CPU and NVIDIA GPU
7/30/2023 8:08:04 AM | climateprediction.net | Scheduler request completed: got 0 new tasks
7/30/2023 8:08:04 AM | climateprediction.net | Project has no tasks available
7/30/2023 8:08:04 AM | climateprediction.net | Project requested delay of 3636 seconds
7/30/2023 9:08:21 AM | climateprediction.net | Resetting project
7/30/2023 9:08:21 AM | climateprediction.net | Detaching from project
7/30/2023 9:10:01 AM | | Fetching configuration file from https://climateprediction.net/get_project_config.php
7/30/2023 9:10:40 AM | | Project communication failed: attempting access to reference site
7/30/2023 9:10:41 AM | | Internet access OK - project servers may be temporarily down.
7/30/2023 9:10:44 AM | climateprediction.net | Fetching scheduler list
7/30/2023 9:10:48 AM | climateprediction.net | Master file download succeeded
7/30/2023 9:10:53 AM | climateprediction.net | Sending scheduler request: Project initialization.
7/30/2023 9:10:53 AM | climateprediction.net | Requesting new tasks for CPU and NVIDIA GPU
7/30/2023 9:10:55 AM | climateprediction.net | Scheduler request completed: got 0 new tasks
7/30/2023 9:10:55 AM | climateprediction.net | Project has no tasks available
7/30/2023 9:10:55 AM | climateprediction.net | Project requested delay of 3636 seconds |
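For anyone wanting to check a longer event log for the same pattern, here is a minimal Python sketch that splits entries like the ones in the post above into timestamp, project and message. It assumes one entry per line in the `timestamp | project | message` layout shown there, with US-style dates; the function names `parse_event_line` and `scheduler_replies` are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

def parse_event_line(line):
    """Split one 'timestamp | project | message' entry into its three fields.

    Assumes the US-style timestamp seen in the post (e.g. 7/30/2023 8:08:04 AM).
    The project field may be empty for client-level messages.
    """
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message

def scheduler_replies(lines, project="climateprediction.net"):
    """Return (time, message) pairs for scheduler replies from one project."""
    replies = []
    for line in lines:
        when, name, message = parse_event_line(line)
        if name == project and message.startswith("Scheduler request completed"):
            replies.append((when, message))
    return replies

# Three entries copied from the log posted above.
sample_log = [
    "7/30/2023 8:07:28 AM | | Fetching configuration file from https://climateprediction.net/get_project_config.php",
    "7/30/2023 8:08:03 AM | climateprediction.net | Sending scheduler request: Project initialization.",
    "7/30/2023 8:08:04 AM | climateprediction.net | Scheduler request completed: got 0 new tasks",
]

for when, message in scheduler_replies(sample_log):
    print(when.isoformat(), message)
```

Note that this only reads the client-side log. Whether the ghost task has moved to "Abandoned" still has to be checked on the project website.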
Joined: 29 Oct 17 Posts: 1048 Credit: 16,391,077 RAC: 15,319 |
I believe there is a 'memory' in the BOINC server software whereby it retains info on past hosts in case they reconnect later. I had a related issue some time ago that I asked them about. I'll try to get time with Andy this morning and ask him, but I believe that's what's going on. You might need to detach and then wait a couple of days before reattaching. I'm not sure what the right time period is; the moderators might know. |
Joined: 1 Jan 07 Posts: 1061 Credit: 36,690,033 RAC: 10,812 |
"There is a 'memory' I believe in the boinc server software where it retains info on past hosts in case they reconnect later. I had a similar issue some time ago I asked them about that was related." Yes, when you re-attach to the project, the server searches the 'host' table and tries to find a match: if it finds one, it re-cycles the old HostID number to reduce database bloat. If it issues a new number instead, I expect that the old record won't be changed, and will remain 'fossilised' in the database with the ghost task(s) preserved. The user can recover the old HostID manually, but it's a bit fiddly: you have to stop BOINC, and edit the client_state.xml file. Proceed with extreme caution, using a plain-text editor only. DON'T do this if you have active tasks in progress or waiting to start. You need to change both the HostID itself - you can see the old one on this website - and the <rpc_seqno>. Make that value one larger than the "Number of times client has contacted server" shown on this website for the old HostID. Save the file, and restart BOINC. I can't guarantee that it will exorcise the ghosts, but it's worth a try while things are quiet. |
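The manual edit described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: it assumes the project's entry in client_state.xml is a `<project>` block containing `<master_url>`, `<hostid>` and `<rpc_seqno>` elements (only `<rpc_seqno>` is named in the post), and `patch_project_block` is a hypothetical helper, not a BOINC tool. It works on the file's text only; stopping BOINC, backing up the file, and reading and writing it are left to you.

```python
import re

def patch_project_block(state_text, master_url_fragment, old_host_id, contact_count):
    """Return state_text with the host ID and RPC sequence number replaced.

    Only the first <project> block whose text contains master_url_fragment is
    changed. <rpc_seqno> is set to contact_count + 1, i.e. one larger than the
    "Number of times client has contacted server" shown for the old host.
    The tag names <project> and <hostid> are assumptions about the file layout.
    """
    for match in re.finditer(r"<project>.*?</project>", state_text, flags=re.DOTALL):
        block = match.group(0)
        if master_url_fragment not in block:
            continue
        new_block, n_host = re.subn(
            r"<hostid>\d+</hostid>",
            f"<hostid>{old_host_id}</hostid>",
            block,
            count=1,
        )
        new_block, n_seq = re.subn(
            r"<rpc_seqno>\d+</rpc_seqno>",
            f"<rpc_seqno>{contact_count + 1}</rpc_seqno>",
            new_block,
            count=1,
        )
        if n_host != 1 or n_seq != 1:
            # Refuse to guess if the block does not look as expected.
            raise ValueError("expected exactly one <hostid> and one <rpc_seqno>")
        return state_text[:match.start()] + new_block + state_text[match.end():]
    raise ValueError("no matching <project> block found")
```

A typical use would be to read client_state.xml into a string with BOINC stopped, call the function with the old HostID and contact count taken from the website, and write the result back only after keeping a copy of the original file. As with the manual edit, there is no guarantee the server will then clear the ghost tasks.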
Joined: 29 Oct 17 Posts: 1048 Credit: 16,391,077 RAC: 15,319 |
I have seriously messed up my BOINC client by editing client_state.xml before now. It might be safer just to ignore the ghost task until it times out. |
Joined: 15 May 09 Posts: 4535 Credit: 18,962,600 RAC: 21,639 |
"I have seriously messed up my boinc client editing the client_state.xml before now. Might be safer just to ignore the ghost task until it times out." I have only edited mine after first backing up! In the days of tasks that lasted for months, if a task crashed due to a power outage, I would restore from a backup. Oh the fun! |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
My last standing ghost WU is now "over" after 9 years in the "In Progress" queue with a "Timed out - no response" status as of 19.07.2023 (got the WU back on 15.01.2014) https://www.cpdn.org/result.php?resultid=16272420 Finally my queue is clear :) |
©2024 cpdn.org