climateprediction.net (CPDN) home page
Thread 'Download Failed'

james
Joined: 15 Dec 06
Posts: 13
Credit: 2,539,487
RAC: 0
Message 45416 - Posted: 7 Jan 2013, 1:27:15 UTC

"Download Failed" error message, HADAM3P Pacific Northwest 6.09

WU: hadam3p_pnw_8rsv_1999_1_8987_1

The second WU to fail within the last 1+ days.


Info., from the BOINC Event Log:

1/6/2013 7:18:04 PM | climateprediction.net | Sending scheduler request: To fetch work.
1/6/2013 7:18:04 PM | climateprediction.net | Requesting new tasks for CPU
1/6/2013 7:18:06 PM | climateprediction.net | Scheduler request completed: got 1 new tasks
1/6/2013 7:18:09 PM | climateprediction.net | Started download of hadam3p_pnw_8rsv_1999_1_007708987.zip
1/6/2013 7:18:09 PM | climateprediction.net | Started download of xaclfa.start.0000.gz
1/6/2013 7:18:11 PM | climateprediction.net | Giving up on download of hadam3p_pnw_8rsv_1999_1_007708987.zip: permanent HTTP error
1/6/2013 7:18:11 PM | climateprediction.net | Started download of so2dms_N96_1999_12_2001_02f.gz
1/6/2013 7:18:16 PM | climateprediction.net | Finished download of so2dms_N96_1999_12_2001_02f.gz
1/6/2013 7:18:16 PM | climateprediction.net | Started download of dchaba.start.pnw.b.0000.gz
1/6/2013 7:18:27 PM | climateprediction.net | Finished download of dchaba.start.pnw.b.0000.gz
1/6/2013 7:18:27 PM | climateprediction.net | Started download of HadISST_SI_N96_1999_12_2001_01f.gz
1/6/2013 7:18:29 PM | climateprediction.net | Finished download of HadISST_SI_N96_1999_12_2001_01f.gz
1/6/2013 7:18:29 PM | climateprediction.net | Started download of HadISST_SST_N96_1999_12_2001_01f.gz
1/6/2013 7:18:36 PM | climateprediction.net | Finished download of HadISST_SST_N96_1999_12_2001_01f.gz
1/6/2013 7:18:51 PM | climateprediction.net | Finished download of xaclfa.start.0000.gz
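
For anyone who wants to pick failures like the one above out of a longer Event Log, here is a minimal Python sketch. It assumes the log has been copied out of BOINC Manager as plain text in the "timestamp | project | message" layout shown above; the function name failed_downloads is made up for this illustration and is not part of BOINC.

```python
import re

# Message text the client logs when it abandons a download
# (pattern taken from the log lines quoted in this thread).
GIVE_UP = re.compile(r"^Giving up on download of (?P<file>\S+): (?P<reason>.+)$")


def failed_downloads(log_text):
    """Return (project, file, reason) tuples for every abandoned download."""
    failures = []
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split(" | ")]
        if len(parts) != 3:
            continue  # not a "timestamp | project | message" line
        _timestamp, project, message = parts
        match = GIVE_UP.match(message)
        if match:
            failures.append((project, match["file"], match["reason"]))
    return failures


sample = """\
1/6/2013 7:18:09 PM | climateprediction.net | Started download of xaclfa.start.0000.gz
1/6/2013 7:18:11 PM | climateprediction.net | Giving up on download of hadam3p_pnw_8rsv_1999_1_007708987.zip: permanent HTTP error
1/6/2013 7:18:16 PM | climateprediction.net | Finished download of so2dms_N96_1999_12_2001_02f.gz
"""

for project, name, reason in failed_downloads(sample):
    print(f"{project}: {name} ({reason})")
```

The sketch only reads text; it does not talk to the BOINC client or retry anything.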

*

This problem is new; other WU's
have been downloaded/completed successfully.

Any suggestions?

Thanks, in advance.

ID: 45416
Joe's Climate
Joined: 10 Dec 11
Posts: 11
Credit: 253,758
RAC: 3
Message 45420 - Posted: 8 Jan 2013, 0:21:00 UTC - in response to Message 45416.  

Check that you have ample hard drive space. Some work units are very large.
If you have BOINC running on vfat, you may want to consider moving it to NTFS, since FAT file systems cap the size of a single file (2GB on FAT16, 4GB on FAT32).
In BOINC, there is an option to test/verify the integrity of your file downloads, since some ISPs may alter your downloads; you may want to turn that on.
Make sure your computer is up to date with all patches and upgrades.


If the WU fails to download, let BOINC kill the download and clean up after itself.
I mention this because some projects tended to have a lot of this happen, and users got in the habit of killing failed downloads. That messed things up on the project's side, and some projects actually penalized your computer for it, making it harder to download a replacement task.

I don't know if any of the above will help, but hopefully it might.
ID: 45420
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 45421 - Posted: 8 Jan 2013, 6:17:46 UTC - in response to Message 45420.  

What you recommend sounds reasonable.
There have been some broken WUs that fail to download because some files are missing on the server. This was supposed to have been fixed, but it is possible that some broken WUs are still out there on the server.

Aside from that small possibility, do make sure you have space for downloads; some here on CPDN are rather large.

Likewise, what Joe said about keeping the BOINC software up to date is also a good idea.

ID: 45421
james
Joined: 15 Dec 06
Posts: 13
Credit: 2,539,487
RAC: 0
Message 45427 - Posted: 10 Jan 2013, 0:22:32 UTC
Last modified: 10 Jan 2013, 0:23:06 UTC

Thank you, gentlemen. Although I think my computer (Sony VAIO, dual-CPU,
Windows Vista Business) has enough disk space, I will continue to run WUs
and monitor things. Some much earlier WUs were over 3,000 hrs. long, with
no disk space problems.

cheers,
ID: 45427
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 45428 - Posted: 10 Jan 2013, 2:41:49 UTC

James

It's possible that it's a leftover from the server problems back in October/November, or a similar repeat.

See my post here for slightly more details.

Either way, downloads with permanent failures are "one of those things".

More work will come along eventually.


Backups: Here
ID: 45428
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 45429 - Posted: 10 Jan 2013, 11:45:34 UTC

Quoting james: "Windows Vista Business) has enough disk space,"


Clicking on the Disk tab in BOINC Manager will confirm that you have plenty of disk space. On this machine it tells me I have over 26GB free available to BOINC. On my other machine, a netbook, it is 7.07GB free with 2.93GB used by BOINC. If the free space were to drop below 4GB on that machine, I would start looking to see if crashed tasks were taking up space. I think 10GB available for BOINC should be fine for any dual-core machine. If I had 4 or more cores I might want to up it a bit.
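
The 10GB rule of thumb can also be checked outside BOINC Manager. The Python sketch below is only an illustration: it reports the free space on whatever volume holds a given directory, using the standard-library shutil.disk_usage. The path and the helper names free_gib and has_room are hypothetical, and the figure BOINC Manager shows may be lower because it also applies BOINC's own disk-usage preferences.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def free_gib(path):
    """Free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def has_room(path, minimum_gib=10.0):
    """True if the volume has at least `minimum_gib` GiB free.

    The 10 GiB default mirrors the rule of thumb in the post above;
    it is a suggestion from this thread, not a BOINC requirement.
    """
    return free_gib(path) >= minimum_gib


if __name__ == "__main__":
    data_dir = "."  # hypothetical: replace with your BOINC data directory
    print(f"{free_gib(data_dir):.2f} GiB free, enough: {has_room(data_dir)}")
```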
ID: 45429
james
Joined: 15 Dec 06
Posts: 13
Credit: 2,539,487
RAC: 0
Message 45436 - Posted: 11 Jan 2013, 1:36:25 UTC - in response to Message 45429.  

Dave -- 42.63 GB is available, per "Disk Space". So the problems must
reside on CPDN's side of things. No anxiety from my end -- just a
wait for the next WU.

Cheers, to all.
ID: 45436
[AF>Le_Pommier] Jerome_C2005
Joined: 21 Oct 10
Posts: 53
Credit: 2,101,753
RAC: 3,985
Message 45444 - Posted: 12 Jan 2013, 13:24:36 UTC

Same issue here: I got 2 where it seems to be able to download all files except one, which ends in "permanent download failure", and the WU errors out... I know CPDN is going through a long period of almost no WUs, but it gives false hope when this happens.

Luckily last time I got 2 very long WUs that lasted 650 hours... hopefully I'll get others like those ones.
ID: 45444
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 45445 - Posted: 12 Jan 2013, 14:07:22 UTC - in response to Message 45444.  
Last modified: 12 Jan 2013, 14:09:40 UTC

Check Les's last post in this thread. You might as well abort the unit; it won't ever download properly. It is a re-issue of one that has failed to download before. With a bit of luck the stock of these is running out now, so there shouldn't be too many more of them. And as Les posted in another thread, the next batch of hadcm3n's is a few weeks away. He doesn't know about the hadam3p regional models. I have just re-enabled World Community Grid on my other machine, which had a free core. I am just allowing it to get a couple of days' work at a time in case there are some regional models coming up soon.
ID: 45445
[AF>Le_Pommier] Jerome_C2005
Joined: 21 Oct 10
Posts: 53
Credit: 2,101,753
RAC: 3,985
Message 45451 - Posted: 13 Jan 2013, 11:24:52 UTC

Yes, I saw that; no problem for me. It's just that I can only run CPDN on that computer, where BOINC cannot have its own Internet access (corporate), so I move WUs with a USB key. I can only do this with long-running WUs, and only CPDN's are long enough to let me do that... so I'll wait :)
ID: 45451
[AF>Le_Pommier] Jerome_C2005
Joined: 21 Oct 10
Posts: 53
Credit: 2,101,753
RAC: 3,985
Message 45452 - Posted: 14 Jan 2013, 20:56:00 UTC

I got 2 long ones ! Happy :)
ID: 45452
james
Joined: 15 Dec 06
Posts: 13
Credit: 2,539,487
RAC: 0
Message 45468 - Posted: 18 Jan 2013, 2:48:01 UTC

Follow-up: Received a hadcm3n WU, and no problems after approx. 4% of the run.
A good long one: about 2200 hrs.

ID: 45468
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 45470 - Posted: 18 Jan 2013, 5:02:10 UTC - in response to Message 45468.  

2200 hours! What are you running it on, a pocket calculator? My computers are no speed demons, but even the 1.5 GHz machine finishes a hadcm3n WU in about 900 hours.

ID: 45470
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 45476 - Posted: 18 Jan 2013, 7:51:36 UTC - in response to Message 45470.  

Quoting JIM: "2200 hours! What are you running it on, a pocket calculator?"


Over 3000 hours on my dual core atom netbook. Less than a thousand on this machine.
ID: 45476
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 45502 - Posted: 26 Jan 2013, 13:00:18 UTC

Another download failure in case it needs to be passed on to anyone @ the project.

Sat 26 Jan 2013 12:01:02 GMT | climateprediction.net | Started download of hadcm3n_o6l4_2100_40_008239984.zip
Sat 26 Jan 2013 12:01:02 GMT | climateprediction.net | Started download of ocean_o6l4_2100_40_008239984_0.gz
Sat 26 Jan 2013 12:01:02 GMT | climateprediction.net | File SPARC_O3_rebuild_1900.gz exists already, skipping download
Sat 26 Jan 2013 12:01:03 GMT | climateprediction.net | Finished download of hadcm3n_o6l4_2100_40_008239984.zip
Sat 26 Jan 2013 12:01:03 GMT | climateprediction.net | Giving up on download of ocean_o6l4_2100_40_008239984_0.gz: permanent HTTP error
Sat 26 Jan 2013 12:01:03 GMT | climateprediction.net | Started download of atmos_o6l4_2100_40_008239984_0.gz
Sat 26 Jan 2013 12:01:03 GMT | climateprediction.net | Started download of DMSSO2NH3_1900_RCP.gz
Sat 26 Jan 2013 12:01:04 GMT | climateprediction.net | Giving up on download of atmos_o6l4_2100_40_008239984_0.gz: permanent HTTP error
Sat 26 Jan 2013 12:01:04 GMT | climateprediction.net | Finished download of DMSSO2NH3_1900_RCP.gz
ID: 45502
tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 45503 - Posted: 26 Jan 2013, 15:08:34 UTC

Me too. I have a download failure. What must I do?
Tullio
ID: 45503
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 45505 - Posted: 26 Jan 2013, 17:27:17 UTC - in response to Message 45503.  

Nothing you can do, tullio. You probably received a regenerated task from an earlier failed batch. See Les' post here:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7527&nowrap=true#45428

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 45505
james
Joined: 15 Dec 06
Posts: 13
Credit: 2,539,487
RAC: 0
Message 45519 - Posted: 30 Jan 2013, 3:21:05 UTC

The 2200 hrs figure is the initial estimated time to completion. My
dual-core Sony VAIO usually finishes before that max. figure.

Some years ago, I had a WU of over 3300 hrs (those were the good old days).
It behooved a person to do periodic saves...

ID: 45519
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 45522 - Posted: 30 Jan 2013, 18:48:51 UTC

When running a model type that is going to take 1000+ hours, I find that making periodic backups is a must. Over a period of months something is bound to go wrong. Systems can lock up and require a cold reboot, power can fail, or system hardware can fail. Any of these can wipe out thousands of hours of work and kill a good model.

Reboots after updating system software are a good time to make backups. You have to exit the model and shut down BOINC anyway, so why not make a backup at that point?
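
As a rough illustration of that backup step, the Python sketch below zips a BOINC data directory into a timestamped archive. It is a sketch under assumptions, not a CPDN-endorsed procedure: the function name backup_boinc_data and the example paths are hypothetical, BOINC must already be fully shut down before it runs, and the backup folder must sit outside the data directory so the archive does not try to include itself.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_data(data_dir, backup_root):
    """Zip `data_dir` into `backup_root` and return the archive path.

    Run this only after BOINC has exited, so that no model files
    change while they are being copied.
    """
    data_dir = Path(data_dir)
    backup_root = Path(backup_root)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {data_dir}")
    backup_root.mkdir(parents=True, exist_ok=True)

    # A timestamp in the name keeps older backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = backup_root / f"boinc-backup-{stamp}"
    archive = shutil.make_archive(str(base), "zip", root_dir=str(data_dir))
    return Path(archive)


# Example call (both paths are placeholders, adjust for your machine):
# backup_boinc_data(r"C:\ProgramData\BOINC", r"D:\boinc-backups")
```

Restoring means unpacking the archive back into the data directory while BOINC is stopped. Note that a task restored from an old backup may already have passed its deadline or been reissued, so it is worth checking the task's status on the project site first.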

ID: 45522
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 45536 - Posted: 5 Feb 2013, 16:21:31 UTC

It looks like we have more bad downloads. Hadam3p_pnw_2vjk_1970_1_008294027_1 is presently stuck in the transfer tab.

Fortunately, I also received 2 usable downloads (a hadcm3n and a hadam3p_eu) that downloaded just fine, so the problem is not general. Maybe the PNW is from an old, flawed batch?


ID: 45536