climateprediction.net (CPDN) home page
Thread 'Download errors: Permanent HTTP error (PNW)'

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 45204 - Posted: 31 Oct 2012, 18:59:14 UTC - in response to Message 45203.  
Last modified: 31 Oct 2012, 19:00:06 UTC

Jonathan posted about midday UK time: I think I have fixed that download issue.
Backups: Here
ID: 45204
old_user2033
Joined: 27 Aug 04
Posts: 14
Credit: 763,720
RAC: 0
Message 45221 - Posted: 6 Nov 2012, 16:31:45 UTC

Here we go again?

6/11/2012 17:30:21 | climateprediction.net | Started download of hadam3p_pnw_2unc_1975_1_007271952.zip
6/11/2012 17:30:22 | climateprediction.net | Temporarily failed download of hadam3p_pnw_2unc_1975_1_007271952.zip: transient HTTP error

Seriously...
ID: 45221
[P3D] Crashtest
Joined: 2 Apr 05
Posts: 16
Credit: 19,190,081
RAC: 10,804
Message 45223 - Posted: 6 Nov 2012, 20:30:53 UTC - in response to Message 45221.  

I've got download errors again:

06.11.2012 21:29:18 | climateprediction.net | Started download of hadam3p_pnw_2xx9_1960_1_007273218.zip
06.11.2012 21:29:18 | climateprediction.net | Started download of atmos_2xx9_1960_1_007273218_0.gz
06.11.2012 21:29:21 | climateprediction.net | Temporarily failed download of hadam3p_pnw_2xx9_1960_1_007273218.zip: transient HTTP error
06.11.2012 21:29:21 | climateprediction.net | Backing off 4 min 48 sec on download of hadam3p_pnw_2xx9_1960_1_007273218.zip
06.11.2012 21:29:21 | climateprediction.net | Temporarily failed download of atmos_2xx9_1960_1_007273218_0.gz: transient HTTP error
06.11.2012 21:29:21 | climateprediction.net | Backing off 6 min 12 sec on download of atmos_2xx9_1960_1_007273218_0.gz
06.11.2012 21:29:22 | climateprediction.net | Started download of so2dms_N96_1960_12_1963_02.gz
06.11.2012 21:29:22 | climateprediction.net | Started download of pnw_2xx9_1960_1_007273218_0.gz
06.11.2012 21:29:23 | climateprediction.net | Temporarily failed download of so2dms_N96_1960_12_1963_02.gz: transient HTTP error
06.11.2012 21:29:23 | climateprediction.net | Backing off 4 min 22 sec on download of so2dms_N96_1960_12_1963_02.gz
06.11.2012 21:29:23 | climateprediction.net | Temporarily failed download of pnw_2xx9_1960_1_007273218_0.gz: transient HTTP error
06.11.2012 21:29:23 | climateprediction.net | Backing off 4 min 39 sec on download of pnw_2xx9_1960_1_007273218_0.gz
06.11.2012 21:29:23 | climateprediction.net | Started download of HadISST_SI_N96_1960_12_1963_01f.gz
06.11.2012 21:29:23 | climateprediction.net | Started download of HadISST_SST_N96_1960_12_1963_01f.gz
06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error
06.11.2012 21:29:24 | climateprediction.net | Backing off 7 min 34 sec on download of HadISST_SI_N96_1960_12_1963_01f.gz
06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SST_N96_1960_12_1963_01f.gz: transient HTTP error
06.11.2012 21:29:24 | climateprediction.net | Backing off 4 min 58 sec on download of HadISST_SST_N96_1960_12_1963_01f.gz
06.11.2012 21:29:26 | climateprediction.net | Started download of HadISST_SI_N96_1960_12_1963_01f.gz
06.11.2012 21:29:27 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error
06.11.2012 21:29:27 | climateprediction.net | Backing off 8 min 2 sec on download of HadISST_SI_N96_1960_12_1963_01f.gz

ID: 45223
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 45224 - Posted: 6 Nov 2012, 20:45:15 UTC

Depressing, isn't it?
But the only work at the moment is re-sends and re-gens. If people want to try to get these bits of work, they should expect to get a repeat of someone else's failed download.
The only thing to do is to wait a few hours to see whether it is just due to a very busy server, and then abort it if it's still failing. And report the model in question here, so that we can pass on the details.

A direct link to the model's work page would be nice, so that we don't have to go trawling through dozens of computers looking for it.
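
For anyone gathering details to report, a minimal Python sketch follows for pulling the failing file names out of a pasted BOINC event log. It assumes the log lines look like the ones quoted earlier in this thread; the function name `failed_downloads` and the `SAMPLE` text are illustrative only and are not part of BOINC.

```python
import re

# Matches event-log lines of the form seen in this thread:
#   "<date> <time> | <project> | Temporarily failed download of <file>: <reason>"
FAILED = re.compile(
    r"\|\s*(?P<project>[^|]+?)\s*\|\s*Temporarily failed download of "
    r"(?P<file>\S+): (?P<reason>.+)$"
)

def failed_downloads(log_text):
    """Return {file name: number of failed attempts} for a pasted log."""
    counts = {}
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            name = match.group("file")
            counts[name] = counts.get(name, 0) + 1
    return counts

# Lines copied from the log posted earlier in the thread.
SAMPLE = """\
06.11.2012 21:29:23 | climateprediction.net | Started download of HadISST_SI_N96_1960_12_1963_01f.gz
06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error
06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SST_N96_1960_12_1963_01f.gz: transient HTTP error
06.11.2012 21:29:27 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error
"""

if __name__ == "__main__":
    for name, count in failed_downloads(SAMPLE).items():
        print(f"{name}: {count} failed attempt(s)")
```

The list of file names still has to be matched to a work unit by hand; the sketch only saves reading through the log.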

Personally, I shut down my computers weeks ago when the main work ran out, so I'm saving on power, and don't have the stress of failed downloads.



Backups: Here
ID: 45224
Harri Liljeroos
Joined: 9 Dec 05
Posts: 116
Credit: 12,547,934
RAC: 2,738
Message 45225 - Posted: 6 Nov 2012, 21:03:14 UTC
Last modified: 6 Nov 2012, 21:04:27 UTC

Hi!

I've got two WUs on my host at home with download problems.

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470300
and
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470505

Edit: antispam filter prevented me making those as clickable links.
ID: 45225
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,827,799
RAC: 5,038
Message 45226 - Posted: 6 Nov 2012, 23:58:31 UTC - in response to Message 45221.  

Not quite 'again'. That work unit is marked 'No Resubmission' but it has evidently been re-submitted (because you got it, or bits of it at least). That's a server-side error, which is why the files are missing: the files shouldn't be there because the work unit was marked finished long ago.

If you get a model from a work unit marked 'No Resubmission' then abort it. It is of no use even if it starts. (Project staff are investigating why these work units are being re-submitted.)
ID: 45226
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 45227 - Posted: 7 Nov 2012, 2:56:45 UTC - in response to Message 45226.  

Thanks for the information. I too picked up a couple of the "No Resubmission" WUs that can't download.

Just blow them away - yeah OK.
ID: 45227
[P3D] Crashtest
Joined: 2 Apr 05
Posts: 16
Credit: 19,190,081
RAC: 10,804
Message 45228 - Posted: 7 Nov 2012, 6:20:23 UTC

My download-problem WU:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470881
ID: 45228
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 45229 - Posted: 7 Nov 2012, 8:39:03 UTC

Something is causing the server to generate reissue tasks exactly 12,500 hours after a workunit was marked as "No Resubmission". Other projects aren't affected by this because they don't have the data retention requirements of CPDN.
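
For scale, 12,500 hours is 520 days and 20 hours, or roughly 17 months. The short Python sketch below does that arithmetic and shows when a reissue would be expected if the pattern holds. The name `expected_reissue` and the example date are hypothetical and are not taken from the server code or from any real work unit.

```python
from datetime import datetime, timedelta

# The delay quoted in this post; nothing here comes from the BOINC server code.
REISSUE_DELAY = timedelta(hours=12_500)

def expected_reissue(marked_at):
    """Time at which a reissue would appear if the 12,500-hour pattern holds."""
    return marked_at + REISSUE_DELAY

if __name__ == "__main__":
    print(REISSUE_DELAY.days, "days,", REISSUE_DELAY.seconds // 3600, "hours")
    # Hypothetical 'No Resubmission' timestamp, for illustration only.
    print(expected_reissue(datetime(2011, 6, 1, 12, 0)))
```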
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 45229
[P3D] Crashtest
Joined: 2 Apr 05
Posts: 16
Credit: 19,190,081
RAC: 10,804
Message 45231 - Posted: 8 Nov 2012, 17:42:39 UTC - in response to Message 45229.  

I aborted my problem WU and about 5 sec later the WU was sent to another user ...

So someone should check the server.

By the way, we need more working WUs!
ID: 45231
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 45232 - Posted: 8 Nov 2012, 20:48:33 UTC

It appears that the re-issue problem is to do with part of the BOINC server code, which will need an upgrade to cure.
As people who have been with the project for a long time will know, this is a serious undertaking, so a lot of planning is needed first.

It may be that one of the first things to do is to not issue any large batch of datasets until after the upgrade.

And there's no need to post about any further download hangups.


Backups: Here
ID: 45232
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,827,799
RAC: 5,038
Message 45234 - Posted: 9 Nov 2012, 1:22:37 UTC - in response to Message 45233.  

An explicit, timely warning of problems likely to waste resources is not too much to expect.
There have been upload and download problems for a while but they do not all have the same cause.

Most of the upload and download errors reported here have been caused by networking or hardware problems at the server - and these have been under a continual fix and replacement process, which has been extensively discussed (disk failures, URL changes etc.). When successfully downloaded these models are perfectly valid and not a waste of resources.

The cause of the 'No Resubmission' problem, which affects ancient models that do not need to be run again, was identified in the last few days and reported here a few hours ago by Les: that's not bad ...

No-one should abort any model, however troubled its download history, unless the work unit is marked 'No Resubmission'.
ID: 45234
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 45243 - Posted: 11 Nov 2012, 3:41:48 UTC - in response to Message 45234.  
Last modified: 11 Nov 2012, 3:49:09 UTC

But -- I kill this one:

Task (click for details): 15431904
Computer: 850603
Sent: 8 Nov 2012 23:57:01 UTC
Time reported or deadline: 22 Oct 2013 5:17:01 UTC
Status: In progress
Run time (sec): ---
CPU time (sec): ---
Claimed credit: ---
Granted credit: ---
Application: UK Met Office HADAM3P European Region v6.09

And I really don't care how sloppy my quotes are. It's not a 'No Resubmission' WU but a damaged WU that doesn't fit the 'No Resubmission' definition, but I kill it anyway because a broken WU is a waste.

Borger slopper please help the crew with spelling.

Sooner or later? Please? A specialised spell-checker for WUs?

Splorwizz;e nongreep" --

It would be better if, after the fixup, that there would be some kind of spell-checker for damaged WU's from the spell-checkers of the dependent WU - submitters.

Not bitching , just asking

I think I understand the situation -- I'll dedicate all my compute as soon as it's fixed -- take care - hope it works out well.

And please -- note my so far significant contribution -- and I read this board every week or so.

And have Patience!! It is one of the major virtues.
ID: 45243
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,827,799
RAC: 5,038
Message 45245 - Posted: 11 Nov 2012, 11:56:05 UTC - in response to Message 45243.  

... It would be better if, after the fixup, that there would be some kind of spell-checker for damaged WU's from the spell-checkers of the dependent WU - submitters. ...
There have been a few examples of spelling errors in WU definitions and a pre-release sanity check would be a very good idea as it could check not just the validity of URLs but also the presence of a complete set of files for download.

However, there is also the Linux 'handler' etc. problem. Traces have demonstrated, to the project staff's satisfaction, that the WU details leave Oxford intact. They appear to get corrupted on the client machine. No-one has ever got to the bottom of that; perhaps no-one has looked.

I'm not sure which one of those options you've been hit by. If the first, then that needs to be fed back so that a correction can be made. If the second then I can quite understand the frustration at editing XML files.

... or perhaps I've completely mis-read your post.
ID: 45245
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 45246 - Posted: 11 Nov 2012, 12:19:13 UTC - in response to Message 45245.  

I think you understand some of the problems. The Linux hnddler/handler thing seems to be hardware-related and really really obscure and I'm not blaming anyone for not figuring it out. I've looked at it a few times and it is for sure a puzzler.

Some kind of pre-submission filter or checker might be worthwhile, especially because CPDN WUs can run so long. If a batch has an obvious configuration or even spelling problem, that can waste a day or two's worth of uploads and downloads, or a few thousand WUs' worth of bandwidth, and annoy us long-time contributors.

But -- I expect to keep on crunching, annoyances aside, at least until my current crop of CPUs fails or goes obsolete.

Thanks for the support and also thanks to the many who contribute cpu time here -
I think it is worth the occasional difficulties.

E
ID: 45246
STE\/E
Joined: 15 Aug 04
Posts: 57
Credit: 10,360,323
RAC: 1,102
Message 45781 - Posted: 30 Mar 2013, 13:46:50 UTC

I get a PNW WU every once in a while, but the Permanent HTTP Error still rears its ugly head.

Apparently one UK Met Office HADAM3P Pacific North West WU did make it through, as I show hours run for it at the WUProp project ...
ID: 45781
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 45782 - Posted: 30 Mar 2013, 17:29:48 UTC - in response to Message 45781.  
Last modified: 30 Mar 2013, 17:38:57 UTC

See Les' post here and Mo.V's post immediately following:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7527&nowrap=true#45539

There are numerous other posts on the topic on the boards. It's one of boinc's/CPDN's gifts that keeps on giving (if I might be forgiven the cliche).

Edit: To all who celebrate the day, Happy Easter.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 45782
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 46435 - Posted: 17 Jun 2013, 16:26:11 UTC

Just got another of these PNW tasks, (a re-issue) where some of the files didn't download and a permanent HTTP error ensued. What I can't remember is whether or not I need to delete the folder with the model or whether BOINC will do it given time?
ID: 46435
Ingleside
Posts: 127
Credit: 24,498,085
RAC: 21,454
Message 46441 - Posted: 18 Jun 2013, 11:14:48 UTC - in response to Message 46435.  

Having extra CPDN folders is only a problem after a model has started. As long as one or more of the input files is missing, the model never starts, and the BOINC client cleans up on its own.

Note that some of the input files can be marked as "sticky" files, in case they're used by multiple models. "Sticky" files are not automatically deleted. Manually deleting such files won't work either; the client will just try to re-download them the next time it restarts. Depending on the client version, they won't be removed by a reset either, but if I'm not mistaken they will be removed on reset if you're running a fairly recent v7 client.
ID: 46441
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 46442 - Posted: 18 Jun 2013, 11:22:56 UTC

Thanks, just seen that unit is reported as finished and gone.
ID: 46442