Message boards : Number crunching : Download errors: Permanent HTTP error (PNW)
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Jonathan posted about midday UK time: "I think I have fixed that download issue." Backups: Here |
Send message Joined: 27 Aug 04 Posts: 14 Credit: 763,720 RAC: 0 |
Here we go again? 6/11/2012 17:30:21 | climateprediction.net | Started download of hadam3p_pnw_2unc_1975_1_007271952.zip 6/11/2012 17:30:22 | climateprediction.net | Temporarily failed download of hadam3p_pnw_2unc_1975_1_007271952.zip: transient HTTP error Seriously... |
Send message Joined: 2 Apr 05 Posts: 16 Credit: 19,190,081 RAC: 10,804 |
I have download errors again: 06.11.2012 21:29:18 | climateprediction.net | Started download of hadam3p_pnw_2xx9_1960_1_007273218.zip 06.11.2012 21:29:18 | climateprediction.net | Started download of atmos_2xx9_1960_1_007273218_0.gz 06.11.2012 21:29:21 | climateprediction.net | Temporarily failed download of hadam3p_pnw_2xx9_1960_1_007273218.zip: transient HTTP error 06.11.2012 21:29:21 | climateprediction.net | Backing off 4 min 48 sec on download of hadam3p_pnw_2xx9_1960_1_007273218.zip 06.11.2012 21:29:21 | climateprediction.net | Temporarily failed download of atmos_2xx9_1960_1_007273218_0.gz: transient HTTP error 06.11.2012 21:29:21 | climateprediction.net | Backing off 6 min 12 sec on download of atmos_2xx9_1960_1_007273218_0.gz 06.11.2012 21:29:22 | climateprediction.net | Started download of so2dms_N96_1960_12_1963_02.gz 06.11.2012 21:29:22 | climateprediction.net | Started download of pnw_2xx9_1960_1_007273218_0.gz 06.11.2012 21:29:23 | climateprediction.net | Temporarily failed download of so2dms_N96_1960_12_1963_02.gz: transient HTTP error 06.11.2012 21:29:23 | climateprediction.net | Backing off 4 min 22 sec on download of so2dms_N96_1960_12_1963_02.gz 06.11.2012 21:29:23 | climateprediction.net | Temporarily failed download of pnw_2xx9_1960_1_007273218_0.gz: transient HTTP error 06.11.2012 21:29:23 | climateprediction.net | Backing off 4 min 39 sec on download of pnw_2xx9_1960_1_007273218_0.gz 06.11.2012 21:29:23 | climateprediction.net | Started download of HadISST_SI_N96_1960_12_1963_01f.gz 06.11.2012 21:29:23 | climateprediction.net | Started download of HadISST_SST_N96_1960_12_1963_01f.gz 06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error 06.11.2012 21:29:24 | climateprediction.net | Backing off 7 min 34 sec on download of HadISST_SI_N96_1960_12_1963_01f.gz 06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SST_N96_1960_12_1963_01f.gz: transient HTTP error 06.11.2012 21:29:24 | climateprediction.net | Backing off 4 min 58 sec on download of HadISST_SST_N96_1960_12_1963_01f.gz 06.11.2012 21:29:26 | climateprediction.net | Started download of HadISST_SI_N96_1960_12_1963_01f.gz 06.11.2012 21:29:27 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error 06.11.2012 21:29:27 | climateprediction.net | Backing off 8 min 2 sec on download of HadISST_SI_N96_1960_12_1963_01f.gz |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Depressing, isn't it? But the only work at the moment is re-sends and re-gens. If people want to try to get these bits of work, then it seems that they should expect to get a repeat of someone else's failed download. The only thing to do is to wait a few hours to see if it is just due to a very busy server, and then abort it if it's still failing. And report the model in question here, so that we can pass on the details. A direct link to the model's work page would be nice, so that we don't have to go trawling through dozens of computers looking for it. Personally, I shut down my computers weeks ago when the main work ran out, so I'm saving on power, and don't have the stress of failed downloads. Backups: Here |
Send message Joined: 9 Dec 05 Posts: 116 Credit: 12,547,934 RAC: 2,738 |
Hi! I've got two WUs on my host at home with download problems: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470300 and http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470505 Edit: the antispam filter prevented me from making those clickable links. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,827,799 RAC: 5,038 |
"Here we go again?" Not quite 'again'. That work unit is marked 'No Resubmission' but it has evidently been re-submitted (because you got it, or bits of it at least). That's a server-side error, which is why the files are missing: the files shouldn't be there because the work unit was marked finished long ago. If you get a model from a work unit marked 'No Resubmission' then abort it. It is of no use even if it starts. (Project staff are investigating why these work units are being re-submitted.) |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
"Here we go again?" "Not quite 'again'. That work unit is marked 'No Resubmission' but it has evidently been re-submitted (because you got it, or bits of it at least). That's a server-side error, which is why the files are missing: the files shouldn't be there because the work unit was marked finished long ago." Thanks for the information. I too picked up a couple of the "no resubmission" WUs that can't download. Just blow them away - yeah, OK. |
Send message Joined: 2 Apr 05 Posts: 16 Credit: 19,190,081 RAC: 10,804 |
My download problem WU: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=7470881 |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
Something is causing the server to generate reissue tasks exactly 12,500 hours after a workunit was marked as "No Resubmission". Other projects aren't affected by this because they don't have the data retention requirements of CPDN. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 2 Apr 05 Posts: 16 Credit: 19,190,081 RAC: 10,804 |
I aborted my problem WU and about 5 sec later the WU was sent to another user ... So someone should check the server. By the way, we need more working WUs! |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It appears that the re-issue problem is to do with part of the BOINC server code, which will need an upgrade to cure. As people who have been with the project for a long time will know, this is a serious undertaking, so a lot of planning is needed first. It may be that one of the first things to do is to not issue any large batch of datasets until after the upgrade. And there's no need to post about any further download hangups. Backups: Here |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,827,799 RAC: 5,038 |
"An explicit, timely warning of problems likely to waste resources is not too much to expect." There have been upload and download problems for a while but they do not all have the same cause. Most of the upload and download errors reported here have been caused by networking or hardware problems at the server - and these have been under a continual fix and replacement process, which has been extensively discussed (disk failures, URL changes etc.). When successfully downloaded these models are perfectly valid and not a waste of resources. The cause of the 'No Resubmission' problem, which affects ancient models that do not need to be run again, was identified in the last few days and reported here a few hours ago by Les: that's not bad ... No-one should abort any model, however troubled its download history, unless the work unit is marked 'No Resubmission'. |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
"An explicit, timely warning of problems likely to waste resources is not too much to expect." "There have been upload and download problems for a while but they do not all have the same cause." But -- I kill this one: 15431904 on computer 850603, sent 8 Nov 2012 23:57:01 UTC, deadline 22 Oct 2013 5:17:01 UTC, status In progress, application UK Met Office HADAM3P European Region v6.09. And I really don't care how slobby my quotes are - it's not a no-resubmit but a damaged WU that doesn't fit the "no resubmit" definition - but I kill it anyway because a broken WU is a waste. Borger slopper please help the crew with spelling. Sooner or later? Please? Specialised spell-checker for WUs? Splorwizz;e nongreep" -- It would be better if, after the fixup, that there would be some kind of spell-checker for damaged WU's from the spell-checkers of the dependent WU - submitters. Not bitching, just asking. I think I understand the situation -- I'll dedicate all my compute as soon as it's fixed -- take care - hope it works out well. And please -- note my so far significant contribution -- and I read this board every week or so. And have patience! It is one of the major virtues. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,827,799 RAC: 5,038 |
"... It would be better if, after the fixup, that there would be some kind of spell-checker for damaged WU's from the spell-checkers of the dependent WU - submitters. ..." There have been a few examples of spelling errors in WU definitions and a pre-release sanity check would be a very good idea as it could check not just the validity of URLs but also the presence of a complete set of files for download. However, there is also the Linux 'handler' etc. problem. Traces have demonstrated, to the project staff's satisfaction, that the WU details leave Oxford intact. They appear to get corrupted on the client machine. No-one has ever got to the bottom of that; perhaps no-one has looked. I'm not sure which one of those options you've been hit by. If the first, then that needs to be fed back so that a correction can be made. If the second then I can quite understand the frustration at editing XML files. ... or perhaps I've completely mis-read your post. |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
"... It would be better if, after the fixup, that there would be some kind of spell-checker for damaged WU's from the spell-checkers of the dependent WU - submitters. ..." "There have been a few examples of spelling errors in WU definitions and a pre-release sanity check would be a very good idea as it could check not just the validity of URLs but also the presence of a complete set of files for download." I think you understand some of the problems. The Linux hnddler/handler thing seems to be hardware-related and really, really obscure, and I'm not blaming anyone for not figuring it out. I've looked at it a few times and it is for sure a puzzler. Some kind of pre-submission filter or checker might be worthwhile, especially because the CPDN WUs can run so long. Even a batch with an obvious configuration or spelling problem can waste a day or two's worth of uploads and downloads, or a few thousand WUs' worth of bandwidth. And it annoys us long-time contributors. But -- I expect to keep on crunching - annoyances aside - at least until my current crop of CPUs fails or becomes obsolete. Thanks for the support and also thanks to the many who contribute CPU time here - I think it is worth the occasional difficulties. E |
Send message Joined: 15 Aug 04 Posts: 57 Credit: 10,360,323 RAC: 1,102 |
I get a PNW WU every once in a while but Permanent HTTP Error still rears its ugly head ??? Apparently 1 UK Met Office HADAM3P Pacific North West WU did make it through, as I show hours run for it at the WUProp Project ... |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
"I get a PNW WU every once in a while but Permanent HTTP Error still rears its ugly head ???" See Les' post here and Mo.V's post immediately following: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7527&nowrap=true#45539 There are numerous other posts on the topic on the boards. It's one of BOINC's/CPDN's gifts that keeps on giving (if I might be forgiven the cliche). Edit: To all who celebrate the day, Happy Easter. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Just got another of these PNW tasks, (a re-issue) where some of the files didn't download and a permanent HTTP error ensued. What I can't remember is whether or not I need to delete the folder with the model or whether BOINC will do it given time? |
Send message Joined: 5 Aug 04 Posts: 127 Credit: 24,498,085 RAC: 21,454 |
"Just got another of these PNW tasks, (a re-issue) where some of the files didn't download and a permanent HTTP error ensued. What I can't remember is whether or not I need to delete the folder with the model or whether BOINC will do it given time?" Having extra CPDN folders is only a problem after a model has started. As long as one or more of the input files is missing, the model never starts, and the BOINC client cleans up on its own. Note that some of the input files can be marked as "sticky" files, in case they're used by multiple models. "Sticky" files are not automatically deleted. Manually deleting such files won't work either: the client will just try to re-download them the next time it restarts. Depending on client version, they won't be removed by a reset either, but if I'm not mistaken they will be removed on reset if you're running a fairly recent v7 client. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Thanks, just seen that unit is reported as finished and gone. |
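
Several posts in this thread paste BOINC event-log lines and were asked to report which files would not download. As a rough aid for that, the Python sketch below counts "Temporarily failed download" messages per file. It assumes the `timestamp | project | message` layout and the message wording exactly as quoted in the posts above; other client versions may word these lines differently. The names `failed_downloads` and `SAMPLE` are made up for this illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Wording taken from the log lines quoted in this thread (an assumption:
# other BOINC client versions may phrase the message differently).
FAILED = re.compile(r"Temporarily failed download of (?P<name>\S+): (?P<reason>.+)$")


def failed_downloads(log_lines):
    """Return a Counter mapping file name -> number of failed-download messages."""
    counts = Counter()
    for line in log_lines:
        # Each line is "timestamp | project | message"; split at most twice
        # so that any later "|" stays inside the message field.
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) < 3:
            continue  # not an event-log line in the expected layout
        match = FAILED.search(parts[2])
        if match:
            counts[match.group("name")] += 1
    return counts


# A few lines copied from the log posted earlier in the thread.
SAMPLE = [
    "06.11.2012 21:29:21 | climateprediction.net | Temporarily failed download of hadam3p_pnw_2xx9_1960_1_007273218.zip: transient HTTP error",
    "06.11.2012 21:29:21 | climateprediction.net | Backing off 4 min 48 sec on download of hadam3p_pnw_2xx9_1960_1_007273218.zip",
    "06.11.2012 21:29:24 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error",
    "06.11.2012 21:29:27 | climateprediction.net | Temporarily failed download of HadISST_SI_N96_1960_12_1963_01f.gz: transient HTTP error",
]

if __name__ == "__main__":
    for name, count in failed_downloads(SAMPLE).most_common():
        print(f"{count} failed attempt(s): {name}")
```

To use it on a real log, the lines could be read from a saved copy of the event log, one message per line, and passed to `failed_downloads`. The file names it reports are what to quote alongside the work unit link when posting.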
©2024 cpdn.org