WU won't upload

marmot

Joined: 12 May 05
Posts: 34
Credit: 1,394,685
RAC: 2,212
Message 61499 - Posted: 8 Nov 2019, 21:56:24 UTC

BOINC has been retrying the upload of the results for https://www.cpdn.org/result.php?resultid=21710685 for 10 days now.

I've restarted BOINC, but every other option I could try (reset project, detach/reattach) would lose the WU.

The WU is not past its deadline, but it sat idle during the summer while the computer was down for the hot months.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61502 - Posted: 9 Nov 2019, 1:45:59 UTC - in response to Message 61499.  

There are several things here.

1: The so-called deadline is just a number to keep BOINC happy. It's NOT when the task is required, which is ASAP.

2: That is a hadcm3s.
Up until recently, when the problem was found and fixed, that type of model had a fault in the trickle_up code: the 1st trickle was generated with the correct info, but all subsequent trickles were identical to it.
So the server code would get the 2nd trickle (and all later ones), compare it with what it already had, then discard it as a duplicate.
So people only got credit for one trickle for this type of model.
And, as you've already received that, you won't be getting more credit.

3: And the big one: those models were originally issued way back on 20 Dec 2017.
Somehow they must have been re-issued.
And the associated files for them most likely disappeared during the big server problems last year.

So you may as well Abort whatever it is you have on your computer for that model.
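
The duplicate-trickle behaviour described in point 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the server recognises a trickle by its contents; `TrickleStore`, `receive`, and the payload strings are hypothetical names and are not taken from the CPDN or BOINC server code.

```python
class TrickleStore:
    """Hypothetical server-side store that drops trickles it has already seen."""

    def __init__(self):
        self.seen = set()   # payloads already accepted
        self.credited = 0   # number of trickles that earned credit

    def receive(self, payload):
        """Accept a trickle unless an identical one was already stored."""
        if payload in self.seen:
            return False    # duplicate: discarded, no credit granted
        self.seen.add(payload)
        self.credited += 1
        return True


def faulty_client_trickles(count):
    # Models the fault: every trickle repeats the first one's contents.
    return ["timestep=1"] * count


def fixed_client_trickles(count):
    # After the fix, each trickle carries its own progress information.
    return ["timestep=%d" % i for i in range(1, count + 1)]


store = TrickleStore()
results = [store.receive(p) for p in faulty_client_trickles(4)]
print(results, "credited:", store.credited)
```

With the faulty client only the first trickle is accepted, which matches the "credit for one trickle" outcome; feeding `fixed_client_trickles(4)` into a fresh store credits all four.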
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61504 - Posted: 9 Nov 2019, 4:58:53 UTC

Or
4: You're running Windows computers, and the Windows app for the "short" models was withdrawn at the time of the upgrade, a couple of months ago.
That model type is only available for Linux now, so that's why the server has no idea what your computer is talking about.
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,394,685
RAC: 2,212
Message 61586 - Posted: 21 Nov 2019, 7:59:30 UTC - in response to Message 61504.  

WU was aborted.

Thanks for the answer.
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,394,685
RAC: 2,212
Message 61594 - Posted: 22 Nov 2019, 11:44:46 UTC - in response to Message 61502.  


1: The so-called deadline is just a number to keep BOINC happy. It's NOT when the task is required, which is ASAP.


If the required return is ASAP, then maybe shorten the deadline from over 200 days (IIRC) to 10-20 days.
People actually USE the deadline to make decisions when manually prioritizing WUs.

Had the deadline reflected that, I would have let the WU complete or aborted it before shutting that machine down for 3 months during the summer (climate crisis: running a/c to cool servers in a home during the hottest streak of years on record is just species suicide).
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,394,685
RAC: 2,212
Message 61607 - Posted: 25 Nov 2019, 15:10:05 UTC - in response to Message 61502.  


2: That is a hadcm3s.
Up until recently, when the problem was found and fixed, that type of model had a fault in the trickle_up code: the 1st trickle was generated with the correct info, but all subsequent trickles were identical to it.
So the server code would get the 2nd trickle (and all later ones), compare it with what it already had, then discard it as a duplicate.
So people only got credit for one trickle for this type of model.
And, as you've already received that, you won't be getting more credit.



After I aborted it, the WU is now marked Completed, likely because of the above error and subsequent patch.

Task 21710685
Work unit 11381390
Sent 15 Jun 2019, 6:10:47 UTC
Time reported 21 Nov 2019, 8:00:36 UTC
Status Completed
Run time (sec) 779,944.57
CPU time (sec) 779,508.50
Credit 3,111.26
Application UK Met Office HadCM3 short v8.34 windows_intelx86

Dark Angel

Joined: 31 May 18
Posts: 53
Credit: 4,725,987
RAC: 9,174
Message 61611 - Posted: 26 Nov 2019, 6:16:10 UTC - in response to Message 61607.  

I am also having trouble uploading files right now.

Tue 26 Nov 2019 17:12:03 AEDT | climateprediction.net | Started upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_6.zip
Tue 26 Nov 2019 17:12:15 AEDT | climateprediction.net | Started upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_restart.zip
Tue 26 Nov 2019 17:12:17 AEDT | climateprediction.net | Started upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_out.zip
Tue 26 Nov 2019 17:12:19 AEDT | climateprediction.net | Started upload of hadam4_a0g5_200710_6_848_011923756_0_r1972935599_5.zip
Tue 26 Nov 2019 17:12:20 AEDT | climateprediction.net | Started upload of hadam4_a2e0_201510_6_848_011926271_0_r129336176_3.zip
Tue 26 Nov 2019 17:12:37 AEDT | | Project communication failed: attempting access to reference site
Tue 26 Nov 2019 17:12:37 AEDT | climateprediction.net | Temporarily failed upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_6.zip: connect() failed
Tue 26 Nov 2019 17:12:37 AEDT | climateprediction.net | Backing off 05:25:53 on upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_6.zip
Tue 26 Nov 2019 17:12:40 AEDT | | Internet access OK - project servers may be temporarily down.
Tue 26 Nov 2019 17:12:48 AEDT | | Project communication failed: attempting access to reference site
Tue 26 Nov 2019 17:12:48 AEDT | climateprediction.net | Temporarily failed upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_restart.zip: connect() failed
Tue 26 Nov 2019 17:12:48 AEDT | climateprediction.net | Backing off 04:07:48 on upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_restart.zip
Tue 26 Nov 2019 17:12:49 AEDT | | Internet access OK - project servers may be temporarily down.
Tue 26 Nov 2019 17:12:50 AEDT | | Project communication failed: attempting access to reference site
Tue 26 Nov 2019 17:12:50 AEDT | climateprediction.net | Temporarily failed upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_out.zip: connect() failed
Tue 26 Nov 2019 17:12:50 AEDT | climateprediction.net | Backing off 03:48:05 on upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_out.zip
Tue 26 Nov 2019 17:12:52 AEDT | | Internet access OK - project servers may be temporarily down.
Tue 26 Nov 2019 17:12:52 AEDT | climateprediction.net | Temporarily failed upload of hadam4_a0g5_200710_6_848_011923756_0_r1972935599_5.zip: connect() failed
Tue 26 Nov 2019 17:12:52 AEDT | climateprediction.net | Backing off 00:29:43 on upload of hadam4_a0g5_200710_6_848_011923756_0_r1972935599_5.zip
Tue 26 Nov 2019 17:12:53 AEDT | | Project communication failed: attempting access to reference site
Tue 26 Nov 2019 17:12:53 AEDT | climateprediction.net | Temporarily failed upload of hadam4_a2e0_201510_6_848_011926271_0_r129336176_3.zip: connect() failed
Tue 26 Nov 2019 17:12:53 AEDT | climateprediction.net | Backing off 00:11:03 on upload of hadam4_a2e0_201510_6_848_011926271_0_r129336176_3.zip
Tue 26 Nov 2019 17:12:54 AEDT | | Internet access OK - project servers may be temporarily down.
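
The back-off intervals in a log like this can be pulled out with a short script. This is a minimal sketch, assuming the event log keeps the `timestamp | project | message` layout shown above; `parse_backoffs` is a hypothetical helper name, not part of BOINC.

```python
import re
from datetime import timedelta

# Matches messages such as "Backing off 05:25:53 on upload of <file>".
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoffs(log_text):
    """Return {filename: timedelta} for each 'Backing off' line in the log."""
    backoffs = {}
    for line in log_text.splitlines():
        # Columns are separated by '|'; the message is the last column.
        message = line.split("|")[-1].strip()
        match = BACKOFF_RE.search(message)
        if match:
            hours, minutes, seconds, filename = match.groups()
            backoffs[filename] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


# Two lines copied from the log above.
sample = """\
Tue 26 Nov 2019 17:12:37 AEDT | climateprediction.net | Backing off 05:25:53 on upload of hadam4_a1q8_201310_6_848_011925415_0_r967725905_6.zip
Tue 26 Nov 2019 17:12:53 AEDT | climateprediction.net | Backing off 00:11:03 on upload of hadam4_a2e0_201510_6_848_011926271_0_r129336176_3.zip
"""

for name, delay in parse_backoffs(sample).items():
    print(name, "retries in", delay)
```

The client chooses a different back-off for each file, so the longest value gives a rough idea of when the last upload will be retried without manual intervention.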

These are noted as hadam4 units, not the hadcm3s ones mentioned earlier.
Work unit properties:

Application UK Met Office HadAM4 at N144 resolution 8.09
Name hadam4_a1q8_201310_6_848_011925415
State Uploading
Received Wed 06 Nov 2019 00:18:27 AEDT
Report deadline Sun 18 Oct 2020 05:38:26 AEDT
Estimated computation size 2,417,459 GFLOPs
CPU time 4d 00:05:27
Elapsed time 4d 00:26:30
Executable hadam4_8.09_i686-pc-linux-gnu
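
For scale, the gap between the Received and Report deadline values above can be worked out directly. This is a minimal sketch; both timestamps are AEDT, so the UTC offset cancels and naive datetimes are enough. The variable names are illustrative only.

```python
from datetime import datetime

FMT = "%d %b %Y %H:%M:%S"

# Values copied from the work unit properties above (weekday and zone dropped).
received = datetime.strptime("06 Nov 2019 00:18:27", FMT)
deadline = datetime.strptime("18 Oct 2020 05:38:26", FMT)

window = deadline - received
print("Deadline window:", window.days, "days")
```

That comes to 347 days, which is the kind of deadline length discussed earlier in the thread.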
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61612 - Posted: 26 Nov 2019, 6:22:10 UTC

Yes, it's been down most of the day.
I reported it a few hours ago, but it's only 6.20 AM there. :(
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4535
Credit: 18,966,742
RAC: 21,869
Message 61613 - Posted: 26 Nov 2019, 7:09:20 UTC - in response to Message 61612.  
Last modified: 26 Nov 2019, 7:11:50 UTC

Yes, it's been down most of the day.
I reported it a few hours ago, but it's only 6.20 AM there. :(


And I notice that my upload to the testing site is getting the same result.

Edit: I have just checked and it is going to upload 11, the same server that is down for the main site. A lot of main site uploads go to different places than the testing branch's.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4535
Credit: 18,966,742
RAC: 21,869
Message 61614 - Posted: 26 Nov 2019, 14:46:29 UTC
Last modified: 26 Nov 2019, 16:12:47 UTC

Andy says this is fixed, but my testing site upload still isn't going. I have informed the project, though it may be the usual issue of the server getting hammered after a period of downtime.

Edit: It has now cleared.