climateprediction.net (CPDN) home page
Thread 'Download problems'

Darmok
Joined: 29 Dec 09
Posts: 34
Credit: 18,395,130
RAC: 0
Message 42249 - Posted: 26 May 2011, 1:48:01 UTC

CPDN stopped in the process of downloading HADAM3P-EU models a few hours ago. All files are stuck at 100% in transfer. Should I abort or hope for the best?
Thanks
ID: 42249
old_user654685
Joined: 20 May 11
Posts: 1
Credit: 10,581
RAC: 0
Message 42250 - Posted: 26 May 2011, 3:17:52 UTC - in response to Message 42249.  

Same issue here.

Apparently this is a problem with uploader1.atm not responding.
ID: 42250
mweisensee
Joined: 29 Apr 07
Posts: 5
Credit: 1,961,201
RAC: 0
Message 42254 - Posted: 26 May 2011, 11:34:57 UTC

Same here. Please let us know if we should abort the WU.
ID: 42254
wateroakley
Joined: 6 Aug 04
Posts: 195
Credit: 28,374,828
RAC: 10,749
Message 42259 - Posted: 26 May 2011, 15:02:46 UTC - in response to Message 42249.  

You should not need to abort models or transfers at this time. They will get completed once the server issues are resolved.

“Patience is the companion of wisdom”
Saint Augustine 354-430

ID: 42259
old_user552217
Joined: 7 Jan 09
Posts: 8
Credit: 177,252
RAC: 0
Message 42283 - Posted: 29 May 2011, 21:07:42 UTC

I have had a Famous model stuck downloading for about 3 days. Please advise whether this is a problem at my end or a project problem. I suspect it is on the project end.
ID: 42283
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42285 - Posted: 29 May 2011, 21:31:40 UTC - in response to Message 42283.  

It's a project problem.
The University of Oxford is 'mostly' closed for the weekend, which is a long one this time because 30th May is a "bank holiday".

About 36 hours before it re-opens.


ID: 42285
old_user552217
Joined: 7 Jan 09
Posts: 8
Credit: 177,252
RAC: 0
Message 42293 - Posted: 31 May 2011, 6:34:32 UTC

I gave up and killed the WU. Temps are spiking here and I have no AC right now, so I had to stop all DC projects until later. I will be back as soon as I have AC or the temps drop. I have one result left to report and then I am all done with BOINC projects for now. Computing is finished for all projects; the one project I still need to report to is down due to server problems, so I can't report the work until later this week.
ID: 42293
transient
Joined: 3 Oct 06
Posts: 43
Credit: 8,017,057
RAC: 0
Message 42302 - Posted: 31 May 2011, 16:31:02 UTC

That seems a bit overkill to me. Couldn't you simply have suspended all projects?
ID: 42302
old_user552217
Joined: 7 Jan 09
Posts: 8
Credit: 177,252
RAC: 0
Message 42304 - Posted: 1 Jun 2011, 15:01:46 UTC - in response to Message 42302.  
Last modified: 1 Jun 2011, 15:06:20 UTC

That seems a bit overkill to me. Couldn't you simply have suspended all projects?

I am totally offline for BOINC now; depending on the weather, it looks like it could be over a month until I restart BOINC. I have the program off, so it would have been a long time until I got around to downloading the 3 missing files. I figured it would be faster and easier for the project if I killed the one stuck WU. I reported the SETI beta result that had been stuck waiting to report about 24 hours after I first posted, then killed the CPDN WU that was stuck downloading and shut it all down. With BOINC offline it is finally cool enough in the apartment to sleep again.
Also, with the long potential downtime, I set no new tasks, reported all work before the shutdown, and only then killed the stuck CPDN WU. It was only one WU.
ID: 42304
old_user653735
Joined: 5 May 11
Posts: 9
Credit: 53,072
RAC: 0
Message 42308 - Posted: 2 Jun 2011, 12:50:25 UTC

Hey guys,

Are your download servers down? For hours I have been getting these status reports from BOINC. It would be nice if I could start on that task, since it is a very long "UK Met Office Coupled Model Full Resolution Ocean v6.07" and BOINC already estimates that it will need >1000 hours! Is there maybe another way to download these files?

02.06.2011 14:43:07 climateprediction.net Started download of hadcm3n_o17t_1940_40_007264856.zip
02.06.2011 14:43:07 climateprediction.net Started download of SPARC_O3_rebuild_1900.gz
02.06.2011 14:43:28 Project communication failed: attempting access to reference site
02.06.2011 14:43:28 climateprediction.net Temporarily failed download of hadcm3n_o17t_1940_40_007264856.zip: connect() failed
02.06.2011 14:43:28 climateprediction.net Backing off 2 hr 28 min 4 sec on download of hadcm3n_o17t_1940_40_007264856.zip
02.06.2011 14:43:28 climateprediction.net Temporarily failed download of SPARC_O3_rebuild_1900.gz: connect() failed
02.06.2011 14:43:28 climateprediction.net Backing off 3 hr 30 min 35 sec on download of SPARC_O3_rebuild_1900.gz
02.06.2011 14:43:28 climateprediction.net Started download of atmos_o17t_1940_40_007264856_0.gz
02.06.2011 14:43:28 climateprediction.net Started download of DMSSO2NH3_1900_RCP.gz
02.06.2011 14:43:29 Internet access OK - project servers may be temporarily down.
02.06.2011 14:43:50 Project communication failed: attempting access to reference site
02.06.2011 14:43:50 climateprediction.net Temporarily failed download of atmos_o17t_1940_40_007264856_0.gz: connect() failed
02.06.2011 14:43:50 climateprediction.net Backing off 3 hr 0 min 59 sec on download of atmos_o17t_1940_40_007264856_0.gz
02.06.2011 14:43:50 climateprediction.net Temporarily failed download of DMSSO2NH3_1900_RCP.gz: connect() failed
02.06.2011 14:43:50 climateprediction.net Backing off 2 hr 55 min 16 sec on download of DMSSO2NH3_1900_RCP.gz
02.06.2011 14:43:51 Internet access OK - project servers may be temporarily down.
02.06.2011 14:44:51 climateprediction.net Started download of sulpc_oxidants_19_A2_1990f.gz
02.06.2011 14:44:51 climateprediction.net Started download of spec3a_lw_3_asol2c_hadcm3.gz
02.06.2011 14:45:13 Project communication failed: attempting access to reference site
02.06.2011 14:45:13 climateprediction.net Temporarily failed download of sulpc_oxidants_19_A2_1990f.gz: connect() failed
02.06.2011 14:45:13 climateprediction.net Backing off 1 hr 1 min 26 sec on download of sulpc_oxidants_19_A2_1990f.gz
02.06.2011 14:45:13 climateprediction.net Temporarily failed download of spec3a_lw_3_asol2c_hadcm3.gz: connect() failed
02.06.2011 14:45:13 climateprediction.net Backing off 2 hr 6 min 30 sec on download of spec3a_lw_3_asol2c_hadcm3.gz
02.06.2011 14:45:14 Internet access OK - project servers may be temporarily down.
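
The backoff intervals in a log like the one above can be pulled out per file with a short script. This is only a minimal sketch, assuming the message format shown above (`Backing off H hr M min S sec on download of FILE`); the function name `parse_backoffs` is made up for illustration and is not part of BOINC.

```python
import re

# Matches the "Backing off ..." lines as they appear in the log above.
# The hr/min parts are treated as optional on the ASSUMPTION that they
# may be omitted for short delays (this log does not confirm that).
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec on download of (\S+)"
)

def parse_backoffs(lines):
    """Return {filename: backoff in seconds} for every backoff line found."""
    result = {}
    for line in lines:
        m = BACKOFF_RE.search(line)
        if m:
            hours, minutes, seconds, name = m.groups()
            total = int(hours or 0) * 3600 + int(minutes or 0) * 60 + int(seconds)
            result[name] = total  # a later line for the same file overwrites
    return result

# Two lines copied from the log above.
sample = [
    "02.06.2011 14:43:28 climateprediction.net Backing off 2 hr 28 min 4 sec on download of hadcm3n_o17t_1940_40_007264856.zip",
    "02.06.2011 14:45:13 climateprediction.net Backing off 1 hr 1 min 26 sec on download of sulpc_oxidants_19_A2_1990f.gz",
]
for name, secs in parse_backoffs(sample).items():
    print(f"{name}: retry in {secs} s")
```

Lines that do not mention a backoff (such as the "Internet access OK" messages) are simply skipped.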
ID: 42308
DaMamaJama
Joined: 23 Dec 06
Posts: 3
Credit: 704,502
RAC: 0
Message 42309 - Posted: 2 Jun 2011, 23:16:52 UTC

Seems to be a theme - I haven't been able to download work units for almost a week now.

For the most part, I get the "Project has no jobs available", but on the odd occasion that I do get a unit, the download sits at 0.00% and/or fails completely. Not impressed.
ID: 42309
old_user653735
Joined: 5 May 11
Posts: 9
Credit: 53,072
RAC: 0
Message 42310 - Posted: 2 Jun 2011, 23:25:38 UTC

It still hasn't downloaded anything here. Now I have got another one of these big tasks (so, 2x Full Resolution Ocean v6.07) and I can't get started on them because the download doesn't work. What's wrong there, guys?
ID: 42310
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42311 - Posted: 3 Jun 2011, 1:35:54 UTC

A batch of RAPIT models is being created.
However, each of these is being grabbed as soon as it comes off the conveyor belt.

With over 40,000 computers attached and most probably looking for work, the 'gimmee' messages from the computers are clogging up the Uni's network (JANET) and also causing the servers to overload, which in turn is causing what downloads there are to fail.

I'm going to suggest that computers be limited to one model at a time for a while, and that the data pool be kept blocked until the batch (only a few thousand) is fully created.
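
One common way to keep thousands of clients from hammering a struggling server in lockstep is randomised exponential backoff: each failed request roughly doubles the maximum wait, and a random draw spreads the retries out. The following is an illustrative sketch only, not BOINC's actual client or scheduler code; the function name `next_backoff` and the limits (1 minute minimum, 4 hours maximum) are assumptions chosen for the example.

```python
import random

def next_backoff(failures, min_delay=60.0, max_delay=4 * 3600.0, rng=random):
    """Seconds to wait after `failures` consecutive failed requests (>= 1).

    The ceiling doubles with each failure up to max_delay; the actual wait
    is drawn uniformly between min_delay and that ceiling, so clients that
    failed at the same moment do not all retry at the same moment.
    All limits here are illustrative assumptions, not BOINC's real values.
    """
    ceiling = min(max_delay, min_delay * 2 ** failures)
    return rng.uniform(min_delay, ceiling)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    for n in range(1, 9):
        print(f"after {n} failure(s): wait {next_backoff(n, rng=rng):.0f} s")
```

With a scheme like this, 40,000 machines that all fail at once come back spread over hours rather than all in the same second.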


ID: 42311
DaMamaJama
Joined: 23 Dec 06
Posts: 3
Credit: 704,502
RAC: 0
Message 42312 - Posted: 3 Jun 2011, 1:44:28 UTC

I won't complain then - good to see that much interest in the project. As long as there's nothing wrong with BOINC or CPDN, I'm satisfied. I thought the newer builds of BOINC were the problem. I shall patiently wait for new jobs. Thanks!
ID: 42312
old_user653735
Joined: 5 May 11
Posts: 9
Credit: 53,072
RAC: 0
Message 42313 - Posted: 3 Jun 2011, 9:20:54 UTC

And what shall I do now? Just wait, or abort? The download of the two "Full Resolution Ocean v6.07" tasks still hasn't worked...
ID: 42313
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 42314 - Posted: 3 Jun 2011, 9:29:27 UTC - in response to Message 42313.  

I have stopped requesting new work - I guess it will take a lot of people to do this to stop JANET being flooded.
ID: 42314
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42315 - Posted: 3 Jun 2011, 21:17:11 UTC

The project config has now been changed slightly to reduce what is virtually a "denial of service" attack on the servers by the huge number of computers wanting work.
It should mean that fewer models are sent to each computer.
Anyone aborting models stuck in downloads may not get even the start of one for a while afterwards.


ID: 42315
Dave Roberts
Joined: 15 Jan 11
Posts: 175
Credit: 6,242,691
RAC: 699
Message 42316 - Posted: 3 Jun 2011, 23:23:39 UTC

Hi Everyone,

I think that the problems here are to some extent 'good news'. It means that there are huge numbers of people wanting to contribute to this project. It's extremely rare (if not previously unknown) for any research project to have more resources than it can usefully utilise at a particular moment in time.

I've done the same thing as Dave: suspended my requests, and I am now running tasks from another project (malaria research in Switzerland) in order to make good use of my spare resources.

I check the current state of events with this project most days so I can resume when necessary.

Cheers to all involved in this project.

David

ID: 42316
tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 42317 - Posted: 4 Jun 2011, 12:09:10 UTC

I am running 6 BOINC projects with a very short cache (0.25 days), which means that I get a new WU only when the preceding one has been completed and uploaded. But I have several results in a pending state, especially in SETI@home, because people download too many WUs so as not to be left without work in lean times.
Tullio
ID: 42317
old_user653735
Joined: 5 May 11
Posts: 9
Credit: 53,072
RAC: 0
Message 42320 - Posted: 4 Jun 2011, 20:21:25 UTC

I still don't see any change... both of the "Full Resolution Ocean v6.07" tasks are still trying to download. Is there any chance that it will happen soon? I would really like to crunch them both! :)
Cheers
ID: 42320