Message boards : Number crunching : transient HTTP error
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"cam"s (Central America), and "sam"s (South America), all go to a server in Mexico, which has become a bit notorious for having problems with their upload server. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
All my backlog has uploaded. Now to see how long it takes to get credit :) My credit has updated since this morning so I suspect you will have yours now. Some time earlier about 1.5GB of data finished uploading and cleared my backlog. |
Send message Joined: 28 Oct 11 Posts: 15 Credit: 9,899,506 RAC: 9,622 |
Indeed, I have received credit. I'm a happy camper right now :) It hasn't shown up on the stats site (BOINCStats) yet, though. I've looked for a permissions check box or something similar in case GDPR is the cause, but I can't find one. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
Looks like there is a problem with upload3:- 05/01/2019 22:47:47 | climateprediction.net | [http] [ID#27] Info: Connection #49 to host upload3.cpdn.org left intact 05/01/2019 22:47:47 | climateprediction.net | [error] Error reported by file upload server: Server is out of disk space 05/01/2019 22:47:47 | climateprediction.net | Temporarily failed upload of wah2_global_e0ry_208812_145_769_011662878_1_r1636291340_125.zip: transient upload error |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Apparently the relevant people know within 10 minutes if there's a server problem, from various alarms. But getting someone to the problem is a different matter. I think this one requires someone on site to move large chunks of data from one server to another. After they find one that's not already full. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
Upload11 has the same problem. I would have thought that, since they are "virtual" servers, they have been allocated a quota of space on a NAS, and that increasing this quota is simply a case of logging on to the server and changing a couple of values. But then again, that depends on the support staff having remote access... |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
And now getting transient upload error on 2 SAFR zips. Project informed. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
It looks as if trickles are getting through and being recorded on the database. I have 3 SAFR models and a PNW model that have registered them today. Does this help: 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <html><head> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <title>504 Gateway Timeout</title> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: </head><body> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <h1>Gateway Timeout</h1> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <p>The gateway did not receive a timely response 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: from the upstream server or application.</p> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <hr> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: <address>Apache/2.4.20 (Unix) Server at upload11.cpdn.org Port 80</address> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Received header from server: </body></html> 07/01/2019 22:51:03 | climateprediction.net | [http] [ID#214] Info: Closing connection 548 07/01/2019 22:51:03 | climateprediction.net | [file_xfer] http op done; retval -184 (transient HTTP error) |
Send message Joined: 26 Mar 18 Posts: 1 Credit: 2,704,020 RAC: 0 |
I have about 15 or 20 tasks across six machines trying to upload. Does this happen often? |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
"I have about 15 or 20 tasks across six machines trying to upload." There does seem to be a bit of a problem at the moment because the size of the upload files has increased dramatically. However, it’s not as if the project could not do the maths when releasing a batch and anticipate the storage requirements. So the demand bit I understand but the failure of the supply bit I don’t. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Just to clear up the matter of trickles: These are NOT zip files, and don't appear in the Transfers tab. They are an RPC. They also have their own upload server, at Oxford, which is why they (usually) appear even when the data files are stuck. They are the only visible record that an upload has occurred, as there's no list of the more than a dozen zip upload servers, many of which are at Oxford, with others scattered around the planet. |
Send message Joined: 9 Nov 18 Posts: 3 Credit: 1,117,025 RAC: 0 |
I turned the http_debug option on and got the following logs: 2019/01/08 11:01:03 | climateprediction.net | Started upload of wah2_safr50_c60f_201212_16_777_011702055_0_r2141728672_5.zip 2019/01/08 11:01:04 | climateprediction.net | [http] [ID#11771] Info: Trying 130.246.191.84... 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Info: Connected to upload11.cpdn.org (130.246.191.84) port 80 (#3802) 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: POST /cgi-bin/file_upload_handler HTTP/1.1 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Host: upload11.cpdn.org 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.14.2) 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept: */* 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept-Encoding: deflate, gzip 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Content-Type: application/x-www-form-urlencoded 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept-Language: zh_CN 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: Content-Length: 313 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Sent header to server: 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Info: We are completely uploaded and fine 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: HTTP/1.1 200 OK 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: Date: Tue, 08 Jan 2019 03:01:07 GMT 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: Server: Apache/2.4.20 (Unix) 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: Transfer-Encoding: chunked 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: Content-Type: text/plain 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: 64 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: <data_server_reply> 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: <status>0</status> 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: <file_size>13385728</file_size> 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: </data_server_reply> 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: 0 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:01:05 | climateprediction.net | [http] [ID#11771] Info: Connection #3802 to host upload11.cpdn.org left intact 2019/01/08 11:01:06 | climateprediction.net | [http] HTTP_OP::libcurl_exec(): ca-bundle set 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Info: Found bundle for host upload11.cpdn.org: 0x49d2860 [can pipeline] 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Info: Re-using existing connection! (#3802) with host upload11.cpdn.org 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Info: Connected to upload11.cpdn.org (130.246.191.84) port 80 (#3802) 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: POST /cgi-bin/file_upload_handler HTTP/1.1 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Host: upload11.cpdn.org 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.14.2) 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept: */* 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept-Encoding: deflate, gzip 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Content-Type: application/x-www-form-urlencoded 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Accept-Language: zh_CN 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Content-Length: 88159713 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: Expect: 100-continue 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Sent header to server: 2019/01/08 11:01:06 | climateprediction.net | [http] [ID#11771] Received header from server: HTTP/1.1 100 Continue 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Info: We are completely uploaded and fine 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: HTTP/1.1 200 OK 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: Date: Tue, 08 Jan 2019 03:01:08 GMT 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: Server: Apache/2.4.20 (Unix) 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: Transfer-Encoding: chunked 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: Content-Type: text/plain 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: 8a 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: <data_server_reply> 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: <status>1</status> 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: <message>EOF on socket read : asked for 262144, got 132524 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: </message> 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: </data_server_reply> 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: 0 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Received header from server: 2019/01/08 11:07:06 | climateprediction.net | [http] [ID#11771] Info: Connection #3802 to host upload11.cpdn.org left intact 2019/01/08 11:07:07 | climateprediction.net | [error] Error reported by file upload server: EOF on socket read : asked for 262144, got 132524 2019/01/08 11:07:07 | climateprediction.net | Temporarily failed upload of wah2_safr50_c60f_201212_16_777_011702055_0_r2141728672_5.zip: transient upload error |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"Just to clear up the matter of trickles:" Mine, on Linux, look like zip files and they DO appear in the Transfers tab: Mon 07 Jan 2019 06:32:57 PM EST | climateprediction.net | Temporarily failed upload of hadcm3s_x5300_190012_60_771_011668342_2_r376222488_1.zip: transient HTTP error Mon 07 Jan 2019 06:32:57 PM EST | climateprediction.net | Backing off 01:35:41 on upload of hadcm3s_x5300_190012_60_771_011668342_2_r376222488_1.zip Mon 07 Jan 2019 06:32:58 PM EST | | Internet access OK - project servers may be temporarily down. Talking about this work unit: Name hadcm3s_x5300_190012_60_771_011668342_2 Workunit 11668342 It has successfully uploaded the only trickle I am likely to get, but it does not realize it and continually tries to upload it over and over. It has now spent 2:04:22 uploading this file over and over again. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
You're talking about a zip file, not a trickle_up. Two different things. The only place that you can see the trickle_up files is in the climateprediction.net folder on the computer. And even then, you have to turn off Network access in the BOINC manager, and wait until there's a message in the Projects tab, saying: Trickle_up waiting. Then go and look in the folder, where you'll see both Trickle_ups and zip files. (I think that the file type for Trickle_ups is "xml".) And the project people are well aware of the upload problem, and are moving data off the Upload servers to various NASs. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"You're talking about a zip file, not a trickle_up." Then what is that zip file? Until a zip file like that is uploaded, typically after about 24 hours are expended by the application, I get no credits for trickles, but pretty soon after one of those zip files gets uploaded, I do get credit. And in the last year or so, I get only one of those files produced and uploaded, even though the work units continue to run for about a week. They use lots more CPU time, but get no more trickles and no more of these uploads. In 2018, I have received only hadcm3s work units, and some fail and some complete. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
"You're talking about a zip file, not a trickle_up." Here, for example, are the first two trickle-related upload files from a current model of mine, and the "restart" and "out" files (which are also Zip files): trickle_up_wah2_sas50_a010_200812_22_778_011703060_0_1546886195.xml trickle_up_wah2_sas50_a010_200812_22_778_011703060_0_1546901393.xml ... wah2_sas50_a010_200812_22_778_011703060_0_r1212817215_1.zip wah2_sas50_a010_200812_22_778_011703060_0_r1212817215_2.zip ... wah2_sas50_a010_200812_22_778_011703060_0_r1212817215_out.zip wah2_sas50_a010_200812_22_778_011703060_0_r1212817215_restart.zip There will be 22 of the pair of trickle-related files (XML and Zip) for that particular model type. As far as I know, the credits are awarded for the "trickle" files - i.e. the XML files - but the science for all current model types is in the trickle-related Zip files and the final restart file. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
"And in the last year or so, I get only one of those files produced and uploaded, even though the work units continue to run for about a week. They use lots more CPU time, but get no more trickles and no more of these uploads. In 2018, I have received only hadcm3s work units, and some fail and some complete." There is an issue with the trickle up files for HADCM3S tasks, which is one of the many issues being worked on by the project. (There is, I believe, a plan to rewrite some or all of this code to address this and other issues.) Only the first trickle up shows on the web site. According to my event log, the trickle up messages are still being sent, but they are not showing on the website and, as far as I can see, credit is not being granted. Fortunately, I am not particularly bothered about credit other than as part of checking things are working. |
Send message Joined: 9 Nov 18 Posts: 3 Credit: 1,117,025 RAC: 0 |
When will the problem be fixed? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The future is a strange place. You'll just have to wait and see. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,006,502 RAC: 21,456 |
And HADCM3S zips are also failing at 100%. I was vaguely hoping they were going somewhere else and so would get through. Obviously not! Edit: Setting both machines to suspend network activity till tomorrow. |
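
For anyone who wants to tally which uploads are failing and why, here is a rough Python sketch. It assumes event-log lines have the `timestamp | project | message` shape seen in the logs quoted in this thread, and that a server-reported error is logged just before the matching "Temporarily failed upload" line, as in those excerpts. The function name and the regular expressions are illustrative only; they are not part of BOINC.

```python
import re
from collections import defaultdict

# Patterns taken from the message text quoted in this thread (assumed stable).
FAILED = re.compile(r"Temporarily failed upload of (?P<file>\S+): (?P<reason>.+)")
SERVER_ERROR = re.compile(r"Error reported by file upload server: (?P<msg>.+)")


def summarize_upload_failures(lines):
    """Group failed uploads by file name.

    Returns {filename: [(reason, server_message_or_None), ...]}.
    """
    failures = defaultdict(list)
    last_server_error = None
    for line in lines:
        # Split into timestamp, project, message; skip anything else.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]

        match = SERVER_ERROR.search(message)
        if match:
            # Remember the server's explanation for the next failure line.
            last_server_error = match.group("msg")
            continue

        match = FAILED.search(message)
        if match:
            failures[match.group("file")].append(
                (match.group("reason"), last_server_error)
            )
            last_server_error = None
    return dict(failures)
```

To use it, save the event log to a text file (the path is up to you) and pass its lines in, for example `summarize_upload_failures(open("eventlog.txt", encoding="utf-8"))`. Pairing each failure with the preceding server message is a simplification; with several uploads running at once the pairing may not always be right.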
©2024 cpdn.org