Message boards : Number crunching : failed upload: can't resolve hostname
Author | Message |
---|---|
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,718,239 RAC: 8,054 |
This is a case where enabling BOINC's http_debug logging may provide more details about the nature of that "transient upload error". |
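For anyone unsure how to switch that logging on, here is a minimal sketch in Python that builds the text of a `cc_config.xml` with the `http_debug` and `file_xfer_debug` log flags set. The function name `build_cc_config` is made up for illustration. Where the file goes (the BOINC data directory) and how an existing `cc_config.xml` should be merged are assumptions to check against your own install.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags=("http_debug", "file_xfer_debug")):
    """Return cc_config.xml text that sets each named BOINC log flag to 1.

    build_cc_config is a hypothetical helper, not part of BOINC.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each flag becomes <name>1</name> inside <log_flags>.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# Print the XML so it can be reviewed before saving it as cc_config.xml.
print(build_cc_config())
```

The sketch only prints the XML and does not touch any existing file. If you already have a `cc_config.xml`, add the two flags to its `<log_flags>` section by hand instead of overwriting it, then restart the client or have it re-read its config file.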
Send message Joined: 4 Sep 13 Posts: 9 Credit: 672,309 RAC: 0 |
I could not get the transient upload problem solved and decided to delete the affected units. Editing the hosts file did not fix the transient upload error that a couple of units had. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
I guess this apid-wattch thing is a Linux problem. I saw the same thing back in February in cpdnbeta when attaching a new box with a 3.5 Linux kernel: the upload servers stanford and cpdnbeta got misspelled as staanford and pdnbeta in client_state.xml, and I also got apid in classic. I got tired of it and reinstalled old Ubuntu 10 with a 2.6 kernel, and all was fine.
Then in July I tried Lubuntu 13.04 with a 3.8 kernel and it was 23% faster in hadcmn.6.07, but with the same apid crap. I then tried editing client_state.xml to rapid-watch on all four lines for every task, and it works :-)
I thought the speedup was due to the light desktop in Lubuntu, but a newly installed Linux Mint 15 has the same nice speed, yippee! On par with Windows now?
This is with BOINC 6.4-6.10.58; higher versions have problems with libraries, I don't like symbolic links, and I am lazy, heh.
Edit: this is only one box where I sit and edit client_state.xml, and I don't bother much, but it surely should be fixed if others have problems too. |
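A rough way to spot this kind of misspelling without reading the whole file by eye is sketched below in Python. It pulls every hostname out of the URLs in the text of client_state.xml and flags those that look like a known-good name but are not identical to it. The helper name `suspicious_hosts`, the 0.8 similarity threshold, and the list of known-good hosts are all assumptions for illustration. The only correct name confirmed in this thread is rapid-watch.badc.rl.ac.uk.

```python
import re
from difflib import SequenceMatcher

# Loose pattern for the host part of an http(s) URL.
HOST_IN_URL = re.compile(r"https?://([A-Za-z0-9.-]+)")


def suspicious_hosts(xml_text, known_good, threshold=0.8):
    """Map each near-miss hostname to the known-good name it resembles.

    suspicious_hosts and threshold are illustrative choices, not BOINC features.
    """
    known = [g.lower() for g in known_good]
    if not known:
        return {}
    found = {h.lower() for h in HOST_IN_URL.findall(xml_text)}
    flagged = {}
    for host in sorted(found - set(known)):
        # Pick the known-good name with the highest similarity ratio.
        best = max(known, key=lambda g: SequenceMatcher(None, host, g).ratio())
        if SequenceMatcher(None, host, best).ratio() >= threshold:
            flagged[host] = best
    return flagged


sample = "<upload_url>http://apid-wattch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler</upload_url>"
print(suspicious_hosts(sample, ["rapid-watch.badc.rl.ac.uk"]))
```

Hosts that are simply different (another project's server, say) fall below the threshold and are ignored, so only likely corruptions are reported.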
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Strange, I haven't had any problems since upgrading to the latest Ubuntu with the 3.8 kernel. All uploads have gone through normally. I have noticed faster boot times with it but haven't been paying attention to the speed of crunching. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Too late to edit - my memory is that this only happened with a particular lot of tasks which probably explains my not having suffered this time around. |
Send message Joined: 4 Sep 13 Posts: 9 Credit: 672,309 RAC: 0 |
Here we go again with the transient upload error... I can't really upgrade my kernel due to some other incompatibilities, but I am open to suggestions. Otherwise I guess I have to go through my working units every now and then and delete the problematic ones. No need to upload 50MB if it does not go through anyway... :s
10-Oct-2013 12:37:18 [climateprediction.net] [fxd] starting upload, upload_offset -1
10-Oct-2013 12:37:18 [climateprediction.net] Started upload of hadcm3n_3bqz_2020_40_008389544_3_1.zip
10-Oct-2013 12:37:18 [climateprediction.net] [file_xfer_debug] URL: http://apid-wattch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Info: About to connect() to apid-wattch.badc.rl.ac.uk port 80 (#0)
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Info: Trying 130.246.191.84...
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Info: Connected to apid-wattch.badc.rl.ac.uk (130.246.191.84) port 80 (#0)
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: POST /cpdn_cgi/file_upload_handler HTTP/1.1
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: User-Agent: BOINC client (x86_64-pc-linux-gnu 6.10.58)
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: Host: apid-wattch.badc.rl.ac.uk
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: Accept: */*
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: Accept-Encoding: deflate, gzip
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: Content-Type: application/x-www-form-urlencoded
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server: Content-Length: 292
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Sent header to server:
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server: HTTP/1.1 200 OK
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server: Date: Thu, 10 Oct 2013 16:37:20 GMT
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server: Server: Apache/2.2.12 (Linux/SUSE)
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server: Transfer-Encoding: chunked
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server: Content-Type: text/plain
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Received header from server:
10-Oct-2013 12:37:20 [---] [http_debug] [ID#2] Info: Connection #0 to host apid-wattch.badc.rl.ac.uk left intact
10-Oct-2013 12:37:21 [climateprediction.net] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
10-Oct-2013 12:37:21 [climateprediction.net] [file_xfer_debug] parsing upload response: <data_server_reply> <status>0</status> <file_size>0</file_size> </data_server_reply>
10-Oct-2013 12:37:21 [climateprediction.net] [file_xfer_debug] parsing status: 0
10-Oct-2013 12:37:21 [climateprediction.net] [fxd] starting upload, upload_offset 0
10-Oct-2013 12:37:23 [---] [http_debug] [ID#2] Info: Re-using existing connection! (#0) with host apid-wattch.badc.rl.ac.uk
10-Oct-2013 12:37:23 [---] [http_debug] [ID#2] Info: Connected to apid-wattch.badc.rl.ac.uk (130.246.191.84) port 80 (#0)
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: POST /cpdn_cgi/file_upload_handler HTTP/1.1
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: User-Agent: BOINC client (x86_64-pc-linux-gnu 6.10.58)
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Host: apid-wattch.badc.rl.ac.uk
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Accept: */*
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Accept-Encoding: deflate, gzip
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Content-Type: application/x-www-form-urlencoded
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Content-Length: 54369772
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server: Expect: 100-continue
10-Oct-2013 12:37:25 [---] [http_debug] [ID#2] Sent header to server:
10-Oct-2013 12:37:26 [---] [http_debug] [ID#2] Received header from server: HTTP/1.1 100 Continue
10-Oct-2013 12:37:26 [---] [http_debug] [ID#2] Info: Expire cleared
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server: HTTP/1.1 200 OK
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server: Date: Thu, 10 Oct 2013 16:37:25 GMT
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server: Server: Apache/2.2.12 (Linux/SUSE)
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server: Transfer-Encoding: chunked
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server: Content-Type: text/plain
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Received header from server:
10-Oct-2013 12:43:33 [---] [http_debug] [ID#2] Info: Connection #0 to host apid-wattch.badc.rl.ac.uk left intact
10-Oct-2013 12:43:34 [climateprediction.net] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
10-Oct-2013 12:43:34 [climateprediction.net] [file_xfer_debug] parsing upload response: <data_server_reply> <status>0</status> </data_server_reply>
10-Oct-2013 12:43:34 [climateprediction.net] [file_xfer_debug] parsing status: -127
10-Oct-2013 12:43:34 [climateprediction.net] [file_xfer_debug] file transfer status -127
10-Oct-2013 12:43:34 [climateprediction.net] Temporarily failed upload of hadcm3n_3bqz_2020_40_008389544_3_1.zip: transient upload error
10-Oct-2013 12:43:34 [climateprediction.net] Backing off 3 hr 32 min 42 sec on upload of hadcm3n_3bqz_2020_40_008389544_3_1.zip |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,718,239 RAC: 8,054 |
It would be just as quick and easy - with BOINC shut down - to do a global search-and-replace on client_state.xml:
Find http://apid-wattch.badc.rl.ac.uk
Replace with http://rapid-watch.badc.rl.ac.uk |
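If you would rather script that search-and-replace than do it in an editor, a minimal Python sketch follows. The two URLs are the ones given above. The function names and the `.bak` backup suffix are made up for illustration, and the path to client_state.xml depends on your install, so it is passed in rather than guessed. Run it only with BOINC shut down.

```python
import shutil
from pathlib import Path

BAD = b"http://apid-wattch.badc.rl.ac.uk"
GOOD = b"http://rapid-watch.badc.rl.ac.uk"


def fix_upload_urls(data, bad=BAD, good=GOOD):
    """Return (patched_bytes, number_of_replacements) for raw file contents."""
    return data.replace(bad, good), data.count(bad)


def patch_client_state(path, bad=BAD, good=GOOD):
    """Patch the file in place, keeping a .bak copy; return the replacement count.

    patch_client_state is a hypothetical helper; path is whatever location
    your BOINC data directory uses for client_state.xml.
    """
    path = Path(path)
    patched, hits = fix_upload_urls(path.read_bytes(), bad, good)
    if hits:
        # Keep a copy of the untouched file before writing anything.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_bytes(patched)
    return hits
```

Working on bytes avoids guessing the file's text encoding. Running it a second time finds nothing to replace, so it is safe to repeat after each new download.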
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Bernardinho, your problem is the well-known BOINC/Linux problem, where the data is correct when it leaves Oxford, but the URLs get corrupted when they get to certain computers. There are 2 solutions:
1) Don't run this project on a Linux computer.
2) Shut down BOINC, make a copy of client_state.xml and put it in a safe place, and then edit the original file with a plain text editor to correct the faulty bits. (As per Richard's post below this.)
Commiserations. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
Linux works fine, skip Les :-) |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I know. PROVIDED you're prepared to patch the urls in each new download. And it may only happen to 64 bit systems. My 32 bit box is OK. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Or only some 64-bit systems. So far I seem to have escaped this one. I installed by unpacking the tar.gz file into its own directory, as opposed to using the one provided by Kubuntu's package manager. I don't know how relevant that is. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
Would be great to hear if any windows users get this "apid-wattch" thing too |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,827,799 RAC: 5,038 |
Would be great to hear if any windows users get this "apid-wattch" thing too
There have been occasions when an address has been entered wrongly into the system and that has propagated to all users, but the corruption of an apparently correct address during delivery is restricted to Linux as far as I recall. |
Send message Joined: 4 Sep 13 Posts: 9 Credit: 672,309 RAC: 0 |
I have the feeling that the transient upload error is not necessarily related to the wrong host name; it seems to be another problem. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
bernardinho, in your case, it IS the wrong host name. (Unless you've edited the client_state.xml file since you posted that list, in which case you now have a different problem.) Look near the top of the list that you posted earlier in this thread. You won't get anywhere sending it to that URL. |
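To check a long log without scrolling, a small Python sketch like the one below can pull out the host from every `[file_xfer_debug] URL:` line and list any that are not the expected upload server. The line format is taken from the log posted earlier in this thread. The function names and the set of expected hosts are assumptions for illustration.

```python
import re
from urllib.parse import urlsplit

# Matches the URL reported on "[file_xfer_debug] URL: ..." log lines.
URL_LINE = re.compile(r"\[file_xfer_debug\] URL: (\S+)")


def upload_hosts(log_text):
    """Return the sorted, de-duplicated hostnames the client tried to upload to."""
    hosts = {urlsplit(m.group(1)).hostname for m in URL_LINE.finditer(log_text)}
    return sorted(h for h in hosts if h)


def unexpected_hosts(log_text, expected):
    """Return the upload hosts that are not in the expected set (hypothetical helper)."""
    return [h for h in upload_hosts(log_text) if h not in expected]


line = (
    "10-Oct-2013 12:37:18 [climateprediction.net] [file_xfer_debug] "
    "URL: http://apid-wattch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler"
)
print(unexpected_hosts(line, {"rapid-watch.badc.rl.ac.uk"}))
```

An empty result means every upload in the log went to an expected host, so any remaining transient error would need a different explanation.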
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
cwhyl, this was discussed extensively on the old php board 2-3 years back when it first started happening. It was also tested a fair bit. The files were/are OK on the server. They're OK when they arrive zipped up on the user's computer. At some point after unzipping and moving to their various locations, the data in the client_state.xml file shows up corrupted, in a couple of different ways. So it's most likely a subtle bug in BOINC for a particular variety of Linux. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
What? You knew about this all the time without saying anything? Fix a new moderator please. |
Send message Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
Or reopen the php board for reading only. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
cwhyl What? The discussion on the php board was in a section open to the public, and numerous crunchers DID post there about it. The person who tested the download on their Linux computer was a non-moderator. It was also posted about on this board, with links back to the php board for the detailed instructions on how to fix the problem. You've been on this project for long enough to have known about the other board. Anyone who joins a volunteer organisation and doesn't check the notice board each time they turn up at the meeting place to find out what's going on has only themselves to blame for their ignorance. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Alex, the php board is NEVER coming back, in any form whatsoever. See the posts in the News and Announcements thread from around 22 March 2013 for the reasons why. |