climateprediction.net (CPDN) home page
Thread 'failed upload: can't resolve hostname'

Ingleside

Joined: 5 Aug 04
Posts: 126
Credit: 24,436,789
RAC: 23,704
Message 47298 - Posted: 12 Oct 2013, 21:37:46 UTC - in response to Message 47279.  

cwhyl

This was discussed extensively on the old php board 2-3 years back when it first started happening. It was also tested a fair bit.

The files were/are OK on the server.
They're OK when they arrive zipped up on the user's computer.
At some point after unzipping and moving to their various locations, the data in the client_state.xml file shows up corrupted, in a couple of different ways.

So it's most likely a subtle bug in BOINC for a particular variety of Linux.

Uhm, maybe my recollection is too fuzzy, but I don't remember anyone with a corrupt upload URL ever showing that they received a sched_reply_climateprediction.net.xml with the correct upload URL, and that the URL was then either wrongly inserted into client_state.xml or corrupted there later.

Since CPDN doesn't try uploading before having trickled N times, sched_reply* has also been wiped clean at least N times by then. This is one of the reasons why pinpointing how some users end up with a corrupt URL is so hard, and also why, AFAIK, server problems were never eliminated as the source.

ID: 47298
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 47299 - Posted: 12 Oct 2013, 21:53:06 UTC - in response to Message 47298.  

It WAS tested and the results posted in that extinct thread.
I think that the person in question unpacked the zip/tar themselves on arrival, before it had a chance to start, and manually checked the URL.

Someday I may have the time, patience, and spare computer to set up a VM, then set up either a WAMP or LAMP stack as appropriate, so that I can safely run one of the copies of the php board that I made.
Then I can look for the details of this uploader problem.

But that won't change the current requirement for manual editing until someone with LOTS of time repeats the process of manual unpacking, checking, and debugging.

ID: 47299
Ingleside

Joined: 5 Aug 04
Posts: 126
Credit: 24,436,789
RAC: 23,704
Message 47300 - Posted: 12 Oct 2013, 22:14:58 UTC - in response to Message 47299.  

It WAS tested and the results posted in that extinct thread.
I think that the person in question unpacked the zip/tar themselves on arrival, before it had a chance to start, and manually checked the URL.

Hmm, why someone would zip or tar (and feather) their sched_reply_climateprediction.net.xml escapes me, and if it was done to any of the many CPDN files residing in the project directory it makes even less sense, since the BOINC client doesn't know (and doesn't care) whether any of these files happens to include a URL.

But while my recollection was too fuzzy, it's an advantage that I took part in at least one of the discussions myself, and that one was not on the php board.

This message from 12.04.2011 is the most interesting, clearly showing that client_state.xml was corrupt even before any of the files were downloaded, while sched_reply* was not corrupt.

Whether the problem was CPDN-only, on the other hand, was never answered by the tester in the old thread...
ID: 47300
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 47301 - Posted: 12 Oct 2013, 22:28:08 UTC - in response to Message 47300.  

OK, my mistake. I've just looked at the download messages for some data sets, and the files are zips and gzs. At least on Windows. Not going to bother with the Linux machine.
And one of them contains the URLs for uploading, as well as instructions for which file gets saved where.

ID: 47301
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 47303 - Posted: 12 Oct 2013, 23:23:11 UTC

My memory is that the "corruption during delivery" explanation arose from a comment by one of the project staff, who reported that the data left Oxford OK, i.e. the corruption happens en route or at the client. Progress has therefore stalled as everyone is saying, "Not me, guv".

It does look awfully like some buffer misalignment somewhere.
ID: 47303
Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 47307 - Posted: 13 Oct 2013, 3:53:54 UTC

bernardinho,

From your results I see that you're running BOINC 6.10.58. Tracking back through the BOINC checkin notes, that version was tagged on 1st July 2010.

BOINC Trac ticket #1033 was opened on 25th November 2011 against 6.10.58 for potentially unsafe copying of information into client_state.xml from scheduler replies. The problem had already been fixed by the checkin of lib/str_util.cpp on 8th February 2011, so the CPDN upload URL corruption shouldn't happen from BOINC version 6.12.15 onwards.

The problem could be addressed for pre-6.12.15 BOINC clients if the CPDN workunit templates were modified to remove the spaces around the upload URLs. For example, the <file_info> blocks for HadCM3N upload files would have

<url>http://rapid-watch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler</url>

instead of

<url> http://rapid-watch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler </url>
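In other words, the template change amounts to trimming the whitespace inside each <url> element. Here is a minimal Python sketch of that trimming, assuming the template is available as plain text. The function name strip_url_padding is made up for illustration and is not part of BOINC or the CPDN tooling.

```python
import re

# Matches <url> ... </url>, capturing the value without surrounding whitespace.
_URL_RE = re.compile(r"<url>\s*(.*?)\s*</url>", re.DOTALL)


def strip_url_padding(xml_text: str) -> str:
    """Return xml_text with whitespace removed around every <url> value.

    Hypothetical helper for illustration only; everything outside <url>
    elements is left untouched.
    """
    return _URL_RE.sub(lambda m: "<url>" + m.group(1) + "</url>", xml_text)


padded = "<url> http://rapid-watch.badc.rl.ac.uk/cpdn_cgi/file_upload_handler </url>"
fixed = strip_url_padding(padded)
print(fixed)
```

This only shows the padded-to-unpadded transformation. It does not repair a URL that has already been corrupted in client_state.xml.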

Making that change won't solve the problem for workunits which are already in the queue. That can only be done by a manual edit of client_state.xml when BOINC isn't running.

The third comment in the Trac ticket explains why the corruption is only happening on some Linux distributions:

One must not have the source and destination strings be (even part of) the same string. Whether the strcpy will work, do nothing, or just behave badly depends on the endianness and implementation of strcpy on the target machine.
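To illustrate why an overlapping copy can behave differently from one machine to another, here is a small Python model. It is not BOINC's code and not any real strcpy. It removes a leading space by copying a NUL-terminated buffer onto itself one byte to the left, using two hypothetical copy orders. The buffer contents and helper names are assumptions made for the example.

```python
def copy_forward(buf: bytearray, dst: int, src: int, n: int) -> None:
    """Copy n bytes from src to dst inside buf, lowest index first."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def copy_backward(buf: bytearray, dst: int, src: int, n: int) -> None:
    """Copy n bytes from src to dst inside buf, highest index first."""
    for i in reversed(range(n)):
        buf[dst + i] = buf[src + i]


def c_str(buf: bytearray) -> bytes:
    """Return the bytes up to (not including) the first NUL, like a C string."""
    return bytes(buf).split(b"\x00", 1)[0]


# A URL with one leading space, NUL-terminated as a C string would be.
original = b" http://host/up\x00"
n = len(original) - 1  # bytes to shift left by one, including the NUL

a = bytearray(original)
copy_forward(a, 0, 1, n)   # source and destination overlap
forward_result = c_str(a)

b = bytearray(original)
copy_backward(b, 0, 1, n)  # same overlap, opposite copy order
backward_result = c_str(b)

print(forward_result, backward_result)
```

In this model the forward order happens to yield the trimmed URL, while the backward order overwrites each source byte before it is read and leaves an empty string. Real library routines may copy in larger chunks or in other orders, which is why the C standard leaves overlapping strcpy undefined, and memmove is the usual safe choice for overlapping regions.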

ID: 47307