climateprediction.net (CPDN) home page
Thread 'Model uploaded/finished/reported, but still in progress on the web'

bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54898 - Posted: 9 Oct 2016, 20:20:45 UTC

Hi folks,

This wah2_eu25 model finished successfully on 7th October: all zips are uploaded, the trickles were sent and are present on the web, and the task was reported as finished (according to the logs). Nevertheless, 2 days later it still has the status "in progress" on the web. Any ideas why that is, and whether I can do something to fix it?

Cheers
ID: 54898
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,884,997
RAC: 4,577
Message 54899 - Posted: 9 Oct 2016, 22:05:54 UTC

I've had a couple like that, including this wah2_eu25, for which the log showed all the correct uploads and the reporting exchange. I assumed that something was waiting server-side to arrive in the right place but it never seems to have happened.
ID: 54899
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54902 - Posted: 10 Oct 2016, 13:42:32 UTC - in response to Message 54899.  

I wonder whether there is a script that cleans up these ghosts (as well as a few other types of ghosts)? Then we could actually see how many tasks are really in progress and how many are just ghosts. But I guess this is for the wish list section.
ID: 54902
JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 54903 - Posted: 10 Oct 2016, 14:15:00 UTC - in response to Message 54902.  

I think the number of “ghosts” is probably quite high. Some are caused by equipment failure such as hard drive crashes, others by people who just lose interest in the project and uninstall BOINC without first aborting the WUs on their computer. These dead WUs just sit and wait for their deadlines to arrive. Shorter deadlines would help, as the deadlines on most CP projects are about 11 months.
ID: 54903
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54904 - Posted: 10 Oct 2016, 15:53:33 UTC - in response to Message 54903.  

Jim,
one of my ghosts has a deadline of July 2023 (yes, 2023) and comes from a computer that is long gone. Two are linked to a computer that no longer crunches and that I no longer have access to, so I cannot abort them. And the last is the wah2_eu25 I mentioned - probably a server-side issue. That is 4 ghosts from me alone and I can't do anything about them - perhaps few can.
ID: 54904
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54905 - Posted: 10 Oct 2016, 19:33:13 UTC

Before this thread goes too far off on a tangent, (or any other sort of transport :) ), the scheduler has been restarted.
Has this made a difference to those models "not reporting"?
(It may require a Retry to find out.)

ID: 54905
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54906 - Posted: 10 Oct 2016, 21:00:12 UTC - in response to Message 54905.  
Last modified: 10 Oct 2016, 21:00:30 UTC

Les,
my wah2_eu25 was reported on the 7th, so I could only hit Projects>CPDN>Update; there is no Retry, though. Am I getting it right?
ID: 54906
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54907 - Posted: 10 Oct 2016, 21:23:54 UTC - in response to Message 54906.  
Last modified: 10 Oct 2016, 21:34:34 UTC

Hi Bernard

That's what we need to know.
It's "reported", but the server doesn't show this.
So, plane B. (Whatever that is.)

PS
In the previous post, I meant Update, not Retry. <sigh>

edit 2
And just to be sure that I've got this right, I mean "Do an Update now", as the scheduler was restarted a few hours ago.
ID: 54907
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54908 - Posted: 10 Oct 2016, 22:18:32 UTC

Next thought:

Has this task also disappeared from the Tasks tab?
If the "Reporting" went OK at your end, then it should have, along with all of the entries about it in the client_state.xml file.

So an Update may not do anything.
But I can't think of anything else to try.
The project people may need to try a few things.


ID: 54908
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,884,997
RAC: 4,577
Message 54909 - Posted: 11 Oct 2016, 0:05:17 UTC - in response to Message 54908.  

Has this task also disappeared from the Tasks tab?
If the "Reporting" went OK at your end, then it should have, along with all of the entries about it in the client_state.xml file.

That was certainly the case for my latest of these stalled reports: everything looked fine but the server-side never caught up. No BSODs or Windows updates logged around the time - so no unexpected reboots.

The project got the Zips and the credits were awarded, so I just put it down as a server glitch, though from what other people report here it might be a little more than that ...
ID: 54909
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54910 - Posted: 11 Oct 2016, 5:50:42 UTC - in response to Message 54908.  

Hi Les,

I did hit Update right after you posted about the scheduler being restarted. May not have been clear from my previous post.

The task is no longer listed and I assume it has been gone from Tasks since 7th October. I first checked on 9th Oct.

No files left in the data directory. Client_state.xml does not have an entry about the wah2_eu25_80x6_202412.....model.
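
For anyone who wants to repeat this check, here is a minimal Python sketch, not an official BOINC tool. It treats client_state.xml as plain text and lists every line that mentions a task-name prefix, so it assumes nothing about the file's layout. The function name `task_mentions` and the path in the usage note are made up for illustration.

```python
from pathlib import Path


def task_mentions(state_file, task_prefix):
    """Return the stripped lines of state_file that contain task_prefix.

    The file is read as plain text on purpose: no assumption is made
    about the XML structure of client_state.xml.
    """
    text = Path(state_file).read_text(encoding="utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if task_prefix in line]
```

Calling `task_mentions("/path/to/client_state.xml", "wah2_eu25_80x6_202412")` with the real location of the file (it varies by installation) returns an empty list when nothing in the file mentions the model, which is the situation described above.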

No restarts of BOINC and Win7 SP 1; BOINC version 7.6.9 (x64).

And here is what is in the log (I've deleted some checkpoints):
07/10/2016 11:55:35 | climateprediction.net | Started upload of wah2_eu25_80x6_202412_13_398_010534216_1_13.zip
07/10/2016 11:56:34 | climateprediction.net | Started upload of wah2_eu25_80x6_202412_13_398_010534216_1_14.zip
07/10/2016 11:57:40 | climateprediction.net | Finished upload of wah2_eu25_80x6_202412_13_398_010534216_1_13.zip
07/10/2016 11:58:43 | climateprediction.net | Finished upload of wah2_eu25_80x6_202412_13_398_010534216_1_14.zip
07/10/2016 12:02:42 | climateprediction.net | Started upload of wah2_mex50_kae6_193312_13_436_010681796_0_r799279980_9.zip
07/10/2016 12:03:22 | climateprediction.net | Finished upload of wah2_mex50_kae6_193312_13_436_010681796_0_r799279980_9.zip
07/10/2016 12:05:15 | climateprediction.net | [checkpoint] result wah2_eu25_bcih_206112_12_386_010488650_0 checkpointed
07/10/2016 12:08:27 | climateprediction.net | [checkpoint] result wah2_mex50_kae6_193312_13_436_010681796_0 checkpointed
07/10/2016 12:14:07 | climateprediction.net | Message from task: 0
07/10/2016 12:14:07 | climateprediction.net | Computation for task wah2_eu25_80x6_202412_13_398_010534216_1 finished
07/10/2016 12:14:07 | climateprediction.net | Starting task wah2_mex50_k9ep_193312_13_436_010680520_0
07/10/2016 12:55:30 | climateprediction.net | Sending scheduler request: To send trickle-up message.
07/10/2016 12:55:30 | climateprediction.net | Reporting 1 completed tasks
07/10/2016 12:55:30 | climateprediction.net | Requesting new tasks for CPU
07/10/2016 12:56:24 | climateprediction.net | Scheduler request completed: got 0 new tasks
07/10/2016 12:56:24 | climateprediction.net | Project has no tasks available
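
The same check can be scripted. Below is a rough Python sketch that assumes the event log lines look exactly like the excerpt above (`date time | project | message`); the function name `summarize` and the keys it returns are invented for illustration. Note that the "Reporting 1 completed tasks" line does not name the task, so the last flag is only a hint that a report went out after the computation finished.

```python
import re

# Layout assumed from the excerpt above: "date time | project | message".
LINE = re.compile(r"^(\S+ \S+) \| ([^|]+?) \| (.*)$")

STARTED = "Started upload of "
FINISHED = "Finished upload of "


def summarize(log_text, task):
    """Collect upload and reporting events for one task from event-log text."""
    started, finished = set(), set()
    computed = False
    reported_after = False
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # skip anything that is not a log line
        msg = m.group(3)
        if msg.startswith(STARTED):
            name = msg[len(STARTED):]
            if name.startswith(task + "_"):
                started.add(name)
        elif msg.startswith(FINISHED):
            name = msg[len(FINISHED):]
            if name.startswith(task + "_"):
                finished.add(name)
        elif msg == f"Computation for task {task} finished":
            computed = True
        elif computed and msg.startswith("Reporting ") and "completed task" in msg:
            # The log does not say which task was reported; this is only a hint.
            reported_after = True
    return {
        "uploads_pending": sorted(started - finished),
        "uploads_done": sorted(finished),
        "computation_finished": computed,
        "report_sent_after_finish": reported_after,
    }
```

If `uploads_pending` is empty and both flags are true, the client side looks complete, which points at the server rather than the volunteer's machine.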
ID: 54910
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54911 - Posted: 11 Oct 2016, 6:44:04 UTC - in response to Message 54910.  

Hi Bernard

Thanks for that list. It confirms a few thoughts.

I sent another email a bit over an hour ago.
Now for the long wait for an answer.

ID: 54911
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 54912 - Posted: 11 Oct 2016, 16:03:08 UTC

Hi, this task,

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19976711

which completed today, is still showing as in progress on the tasks page for its computer. I have tried clicking on Update but it makes no difference. It has gone from the tasks list on the computer, the task folder is gone, and the credit seems to be what I would expect.
ID: 54912
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54924 - Posted: 12 Oct 2016, 20:02:39 UTC

Another puzzle which may not be solved.
There are no errors in the server logs, so no idea what went wrong.

If it happens again, please post about it.

ID: 54924
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54929 - Posted: 13 Oct 2016, 6:45:42 UTC - in response to Message 54924.  

Thanks Les.
ID: 54929
Albert H.

Joined: 18 Feb 06
Posts: 73
Credit: 62,624,562
RAC: 39,997
Message 54930 - Posted: 13 Oct 2016, 9:24:41 UTC

Here is one of mine, finished a long time ago but still shown as in progress:
wah2_pnw25_z2a4_199512_24_406_010580849_2
ID: 54930
Hona

Joined: 7 Nov 04
Posts: 1
Credit: 1,514,851
RAC: 0
Message 54951 - Posted: 17 Oct 2016, 21:14:26 UTC - in response to Message 54930.  

+1
This one is from Aug/10/2016
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19855201
ID: 54951
Cameron

Joined: 8 Dec 05
Posts: 1
Credit: 254,581
RAC: 0
Message 54964 - Posted: 20 Oct 2016, 4:41:44 UTC

Well, I'm in the process of crunching wah2_eu25_b4o2_195912_12_386_010478483_2 (task progress here; 90% done), but various stats sites still haven't added my trickles after trickle 4 (22-Sept-2016).

BOINC/CPDN show the correct in progress credit value.

Wondering if it's related somehow.
ID: 54964
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54965 - Posted: 20 Oct 2016, 5:28:58 UTC - in response to Message 54964.  

Hi Cameron

The credit scripts only run once per week these days.
News and Announcements 2: Here

And the scripts that create the Export files haven't run for a few weeks now.
Credits

ID: 54965
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,706,848
RAC: 5,644
Message 54966 - Posted: 20 Oct 2016, 5:39:18 UTC - in response to Message 54964.  

Hi Cameron,

It would be interesting if such a relation exists. For example, the model in the previous post finished in August, and since then the external stats have been updated several times while the model has still been shown as in progress. However, your current wah2_eu25 is still being crunched, and it would only need to be added to the group reported here if it finishes (all zips and trickles uploaded, your BOINC software reports the task as done, and it is no longer in the task list or in the data folders) but is still shown as in progress on the web.

As for the external sites, CPDN exports the credits, and the export hasn't been updated since 26/9; this is discussed here and we are waiting for a fix.

ID: 54966