Thread 'Upload failures'

mmonnin

Joined: 28 May 17
Posts: 49
Credit: 17,345,366
RAC: 7,398
Message 60966 - Posted: 21 Sep 2019, 12:40:05 UTC
Last modified: 21 Sep 2019, 12:40:14 UTC

I had an anz50 task stuck uploading the last 4 zips, a restart file and an out file. I think it happened around the time the server went down. I aborted the uploads since the task had 20 trickles, and it then changed to completed.
https://www.cpdn.org/result.php?resultid=21741910
ID: 60966
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 61003 - Posted: 26 Sep 2019, 6:00:41 UTC - in response to Message 60940.  

Hi folks,
I still have a pile of cam25 zips that I cannot upload. I still get this message:

26/09/2019 08:13:39 | climateprediction.net | [checkpoint] result wah2_cam25_a0mi_200405_18_832_011891177_0 checkpointed
26/09/2019 08:15:00 | climateprediction.net | [http] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files\BOINC\ca-bundle.crt'
26/09/2019 08:15:00 | climateprediction.net | [http] HTTP_OP::libcurl_exec(): ca-bundle set
26/09/2019 08:15:02 | climateprediction.net | [http] [ID#15266] Info: Trying 158.97.9.11...
26/09/2019 08:15:23 | climateprediction.net | [http] [ID#15266] Info: connect to 158.97.9.11 port 80 failed: Timed out
26/09/2019 08:15:23 | climateprediction.net | [http] [ID#15266] Info: Failed to connect to upload6.cpdn.org port 80: Timed out
26/09/2019 08:15:23 | climateprediction.net | [http] [ID#15266] Info: Closing connection 4723
26/09/2019 08:15:23 | climateprediction.net | [http] HTTP error: Couldn't connect to server
26/09/2019 08:15:23 | climateprediction.net | Backing off 00:03:57 on upload of wah2_cam25_a0hp_200405_18_832_011891004_0_r1585838413_14.zip

This URL points to dzahui.cicese.mx in Mexico, and pinging it gives me "Request timed out".
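
For anyone who wants to repeat this check without relying on ping (ICMP can be blocked even when a server is accepting uploads), here is a minimal Python sketch that tries to open a TCP connection on port 80, the port the log above shows libcurl using. The function name can_connect and the 10-second default timeout are my own choices for illustration; nothing here comes from BOINC itself.

```python
import socket


def can_connect(host: str, port: int = 80, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections and DNS failures are all OSError subclasses.
        return False


# Example call, using the host name from the log above:
#   can_connect("upload6.cpdn.org", 80)
```

A False result from the affected machine would be consistent with the "Failed to connect ... Timed out" lines in the log, though it cannot say whether the fault is at the server or somewhere on the route to it.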
ID: 61003
Lockleys

Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 61004 - Posted: 26 Sep 2019, 6:39:04 UTC - in response to Message 61003.  

Me too. I wonder: is it worth continuing to process CAM work units at all? This always seems to happen with them.

ID: 61004
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61005 - Posted: 26 Sep 2019, 6:51:21 UTC

Keep going.
We're talking to the research people there.
I'll point them to your posts.
ID: 61005
Albert H.

Joined: 18 Feb 06
Posts: 73
Credit: 61,796,114
RAC: 46,169
Message 61006 - Posted: 26 Sep 2019, 7:09:03 UTC

I also have anz 50...794 and 719 and cam 25 ... 832 that cannot upload.
Thanks for your action.
ID: 61006
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 61007 - Posted: 26 Sep 2019, 8:57:29 UTC - in response to Message 61006.  

From Sarah at the project:
As the machine was up, I asked Andy to see if he could log in and take a look, and as I have just received a StatusCake message to say the machine is back up, I hope this is now sorted.


Depending on the size of the backlog, there may well be transient upload errors until the logjam is cleared. If the problem is sorted, they will go in a few hours or so. (When servers have been down for days and nearly everyone has work, it can then take a day for things to clear.)

It would be good if someone could post to let us know if things are now working or not.
ID: 61007
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 61009 - Posted: 26 Sep 2019, 9:50:12 UTC - in response to Message 61007.  

Mine have started to clear now: at 50%, with still more than 30 zips to upload. Thanks.
ID: 61009
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 61011 - Posted: 26 Sep 2019, 10:15:25 UTC - in response to Message 61009.  

Thank you. Have passed on to project.
ID: 61011
Albert H.

Joined: 18 Feb 06
Posts: 73
Credit: 61,796,114
RAC: 46,169
Message 61015 - Posted: 26 Sep 2019, 13:27:21 UTC - in response to Message 61006.  

All cam 25 ...832 are uploaded now.
Thanks.
ID: 61015
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 61016 - Posted: 26 Sep 2019, 14:17:34 UTC - in response to Message 61015.  

I still have 5 stuck at different percentages. The log file is at https://pastebin.com/V8FDEkch
ID: 61016
Lockleys

Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 61022 - Posted: 26 Sep 2019, 17:37:40 UTC - in response to Message 61007.  
Last modified: 26 Sep 2019, 17:38:33 UTC

Many [edit: ALL] of my CAMs have now cleared. Thanks, all.
ID: 61022
Lockleys

Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 61131 - Posted: 1 Oct 2019, 21:48:11 UTC - in response to Message 61022.  

I spoke slightly too soon, it seems. I am running 4 cam25 models on a Windows 7 system. All the zips are uploading correctly and regularly except for a restart.zip which repeatedly gets stuck at 84.5% with transient HTTP error/project servers may be temporarily down. I've had this now for a couple of days.

ID: 61131
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61132 - Posted: 2 Oct 2019, 1:54:50 UTC - in response to Message 61131.  

I've reported this.
ID: 61132
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 61162 - Posted: 3 Oct 2019, 12:55:57 UTC - in response to Message 61131.  


I had the same issue with 4 CAMs after the upload server was fixed. A few zips stayed stuck for more than 5 days after that, so I cancelled them all. Two WUs were reported as successful and two errored out with an upload failure. These were on their 1st attempt, but were not reissued.
ID: 61162
Albert H.

Joined: 18 Feb 06
Posts: 73
Credit: 61,796,114
RAC: 46,169
Message 61163 - Posted: 3 Oct 2019, 13:11:14 UTC

These have not uploaded for weeks. Please tell me what to do.


03/10/2019 07:42:47 | climateprediction.net | Started upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_out.zip
03/10/2019 07:42:47 | climateprediction.net | Started upload of wah2_cam25_a0cs_200405_18_832_011890827_0_r1194586817_restart.zip
03/10/2019 07:42:48 | climateprediction.net | Temporarily failed upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_out.zip: transient HTTP error
03/10/2019 07:42:49 | climateprediction.net | Backing off 03:38:20 on upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_out.zip
03/10/2019 07:42:49 | climateprediction.net | Started upload of wah2_cam25_a0l6_200405_18_832_011891129_0_r587166814_restart.zip
03/10/2019 07:43:10 | climateprediction.net | Temporarily failed upload of wah2_cam25_a0cs_200405_18_832_011890827_0_r1194586817_restart.zip: transient HTTP error
03/10/2019 07:43:10 | climateprediction.net | Backing off 05:27:52 on upload of wah2_cam25_a0cs_200405_18_832_011890827_0_r1194586817_restart.zip
03/10/2019 07:43:10 | climateprediction.net | Started upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_16.zip
03/10/2019 07:43:11 | climateprediction.net | Temporarily failed upload of wah2_cam25_a0l6_200405_18_832_011891129_0_r587166814_restart.zip: transient HTTP error
03/10/2019 07:43:11 | climateprediction.net | Backing off 04:05:21 on upload of wah2_cam25_a0l6_200405_18_832_011891129_0_r587166814_restart.zip
03/10/2019 07:43:11 | climateprediction.net | Temporarily failed upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_16.zip: transient HTTP error
03/10/2019 07:43:11 | climateprediction.net | Backing off 04:19:35 on upload of wah2_anz50_n1oq_201612_20_794_011764572_2_r1272938784_16.zip


Thanks

Albert
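
To see at a glance which files are waiting and for how long, an event log like the one above can be summarised with a short Python sketch. It assumes the three-field "date | project | message" layout and the "Backing off HH:MM:SS on upload of <file>" wording seen in that log; reading the interval as hours:minutes:seconds is an assumption, and the function name latest_backoffs is hypothetical.

```python
import re
from datetime import timedelta
from typing import Dict

# Pattern for the message part of a backoff line, as it appears in the log above.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def latest_backoffs(log_text: str) -> Dict[str, timedelta]:
    """Map each file name to the most recent backoff interval found in the log."""
    result: Dict[str, timedelta] = {}
    for line in log_text.splitlines():
        # Split into at most three fields: timestamp, project, message.
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not an event-log line in the assumed layout
        match = BACKOFF_RE.search(parts[2])
        if match:
            hours, minutes, seconds, name = match.groups()
            # Later lines overwrite earlier ones, so the last backoff wins.
            result[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return result
```

Pasting the log text into latest_backoffs gives one entry per stuck file, which makes it easier to report exactly which zips are affected.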
ID: 61163
Lockleys

Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 61167 - Posted: 3 Oct 2019, 17:52:09 UTC

I now have an additional, new zip failure:
03/10/2019 18:18:19 | climateprediction.net | Started upload of wah2_cam25_a0js_200405_18_832_011891079_0_r168521463_7.zip
03/10/2019 18:18:21 | climateprediction.net | [error] Error reported by file upload server: [wah2_cam25_a0js_200405_18_832_011891079_0_r168521463_7.zip] locked by file_upload_handler PID=48892
03/10/2019 18:18:21 | climateprediction.net | Temporarily failed upload of wah2_cam25_a0js_200405_18_832_011891079_0_r168521463_7.zip: transient upload error
03/10/2019 18:18:21 | climateprediction.net | Backing off 00:06:34 on upload of wah2_cam25_a0js_200405_18_832_011891079_0_r168521463_7.zip
ID: 61167
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61174 - Posted: 3 Oct 2019, 21:32:31 UTC - in response to Message 61167.  
Last modified: 3 Oct 2019, 21:33:04 UTC

I think the message "locked by file_upload_handler" appears when you click Update too soon after the server was last contacted.

I've sent another message about these cam25s.
ID: 61174
Lockleys

Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 61175 - Posted: 4 Oct 2019, 6:25:49 UTC - in response to Message 61174.  

I haven't been clicking update. Just for the record.
ID: 61175
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61176 - Posted: 4 Oct 2019, 7:10:23 UTC - in response to Message 61175.  

There must be at least one more way for that to happen then.

The various project people are starting to have a look at the cam25 problem right now.
ID: 61176
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61185 - Posted: 5 Oct 2019, 2:48:30 UTC

Nothing much happening "upstairs".
Is there any improvement with the zips?
ID: 61185