Thread 'MORE FAILED DOWNLOADS'

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 56151 - Posted: 5 May 2017, 14:41:07 UTC

More failed downloads from batches 526, 533, and 535

5/5/2017 10:27:29 AM | climateprediction.net | Temporarily failed download of GHGclim_ancil_14months_OSTIA_sst_v2_CNRM-CM5_1998-12-01_2000-01-30.gz: connect() failed
5/5/2017 10:27:29 AM | climateprediction.net | Backing off 00:03:07 on download of GHGclim_ancil_14months_OSTIA_sst_v2_CNRM-CM5_1998-12-01_2000-01-30.gz
5/5/2017 10:27:49 AM | climateprediction.net | Temporarily failed download of ALLclim_ancil_14months_OSTIA_ice_1998-12-01_2000-01-30.gz: connect() failed
5/5/2017 10:27:49 AM | climateprediction.net | Backing off 00:03:19 on download of ALLclim_ancil_14months_OSTIA_ice_1998-12-01_2000-01-30.gz
5/5/2017 10:27:50 AM | | Project communication failed: attempting access to reference site
5/5/2017 10:27:51 AM | | Internet access OK - project servers may be temporarily down.
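For anyone wanting to tally these failures across a long event log, here is a minimal sketch. It assumes the log has been saved as plain text in the format shown above; the function name `summarize_failures` and the regular expression are illustrative and not part of BOINC.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... | Temporarily failed download of <file>: <reason>
FAILED = re.compile(
    r"Temporarily failed download of (?P<file>\S+): (?P<reason>.+)$"
)

def summarize_failures(log_text):
    """Return (Counter of failure reasons, list of affected file names)."""
    reasons = Counter()
    files = []
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            reasons[match.group("reason")] += 1
            files.append(match.group("file"))
    return reasons, files

# Two of the lines quoted above, used as sample input.
SAMPLE = """\
5/5/2017 10:27:29 AM | climateprediction.net | Temporarily failed download of GHGclim_ancil_14months_OSTIA_sst_v2_CNRM-CM5_1998-12-01_2000-01-30.gz: connect() failed
5/5/2017 10:27:29 AM | climateprediction.net | Backing off 00:03:07 on download of GHGclim_ancil_14months_OSTIA_sst_v2_CNRM-CM5_1998-12-01_2000-01-30.gz
5/5/2017 10:27:49 AM | climateprediction.net | Temporarily failed download of ALLclim_ancil_14months_OSTIA_ice_1998-12-01_2000-01-30.gz: connect() failed
"""

reasons, files = summarize_failures(SAMPLE)
for reason, count in reasons.items():
    print(f"{count} x {reason}")
```

If every failure carries the same reason, as here, that points toward a single connection problem rather than individual missing files.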

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56152 - Posted: 5 May 2017, 17:11:00 UTC

A message has gone to the project informing them that this is being discussed on various threads. I am not aborting anything because it looks like a global problem rather than one specific to particular batches, as has sometimes happened in the past.

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,709,333
RAC: 5,769
Message 56153 - Posted: 5 May 2017, 19:40:04 UTC
Last modified: 5 May 2017, 19:40:16 UTC

I've just got a few hanging. It seems the project people were not able to stop this, so more stuck downloads should be expected over the weekend. The best thing is to set "No new tasks".

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 56159 - Posted: 6 May 2017, 4:42:34 UTC

Is it likely that once the download problem is fixed the stuck WUs will be able to complete the download and run successfully? If not, I will abort the 7 taking up space on my machines.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56162 - Posted: 6 May 2017, 7:17:40 UTC - in response to Message 56159.  

I think these are all tasks that have failed at least once, as shown by the _1 or _2 at the end of the task name. In another thread someone has one from batch 40 6 which timed out after 11 months on a machine, but that could be either a failed download or the machine just having stopped running BOINC. There was one batch that did have a problem with files not being where they should have been. Not sure I have the energy to trawl through old messages to work out which one it was.

My own view is that as this problem seems to affect all batch numbers, the vast majority should download once the problem is fixed.

BetelgeuseFive
Joined: 31 Aug 04
Posts: 10
Credit: 2,538,005
RAC: 0
Message 56164 - Posted: 6 May 2017, 8:15:25 UTC

I don't think the problem is that the files are not there. The problem seems to be that the server cannot be found:

06/05/2017 10:00:39 | climateprediction.net | Temporarily failed download of waterfix.ancil.be.32.gz: connect() failed

Maybe after all the recent changes something is wrong with the DNS settings?
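One way to tell a DNS problem apart from an unreachable server is to test name resolution and the TCP connection as separate steps. The following is a minimal sketch using only the Python standard library; the function name `classify_connection` and its return strings are made up for illustration, and the host to test is whatever server appears in your own download URLs (it is not stated in this thread).

```python
import socket

def classify_connection(host, port=80, timeout=5.0):
    """Return 'dns-failure', 'connect-failure' or 'ok' for host:port."""
    # Step 1: name resolution. A failure here suggests a DNS problem.
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"

    # Step 2: try a TCP connection to each resolved address in turn.
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return "ok"
        except OSError:
            # Refused, timed out or unreachable: try the next address.
            continue
    return "connect-failure"
```

A result of `dns-failure` would support the DNS theory, while `connect-failure` would mean the name resolves but nothing accepts connections at the resolved address, which could still be caused by a DNS record pointing at the wrong machine.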

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,884,997
RAC: 4,577
Message 56170 - Posted: 6 May 2017, 9:54:09 UTC - in response to Message 56164.  

... Maybe after all the recent changes something is wrong with the DNS settings?

Sounds quite likely. The stalled downloads have gone on too long to be a propagation problem.

ryan
Joined: 12 Nov 10
Posts: 4
Credit: 302,337
RAC: 0
Message 56172 - Posted: 6 May 2017, 16:07:17 UTC

downloads still failing

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56173 - Posted: 7 May 2017, 8:16:13 UTC - in response to Message 56172.  

downloads still failing


No chance of this changing before Monday, when the project staff will be back at work after the weekend. It may (guessing here) not be fixed until after the file system upgrade on Wednesday, though the latter is unrelated to the problem.

nedsram-cdl
Joined: 14 Apr 05
Posts: 31
Credit: 16,491,691
RAC: 0
Message 56179 - Posted: 8 May 2017, 21:20:51 UTC - in response to Message 56173.  

I think that BetelgeuseFive may be correct. I've had several downloads stalled for days now, which repeatedly fail to connect to the server. However, the server status page is showing all servers as up and running. Or at least it was. Now I'm not getting any response at all from it.
Brian

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 56180 - Posted: 8 May 2017, 23:13:13 UTC

Since it didn’t get fixed today (Monday), I think that we will have to wait until after the Wednesday file upgrade. The staff will most likely be too busy on Tuesday getting ready to tackle the problem until after the upgrade is complete.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56181 - Posted: 9 May 2017, 5:25:09 UTC

While the server status page is again claiming all is well, I concur with Jim.

Steve Dodd
Joined: 28 Oct 11
Posts: 15
Credit: 9,974,078
RAC: 2,643
Message 56185 - Posted: 9 May 2017, 22:59:48 UTC - in response to Message 56151.  
Last modified: 9 May 2017, 23:00:09 UTC

My one WU has failed to download for over 5 days! And I really, really want to crunch it :).

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56187 - Posted: 10 May 2017, 5:55:31 UTC - in response to Message 56185.  

I am down to my last running task, and the only reason that is still running is that it is on an ageing netbook powered by a very slow dual-core Atom processor!

BulletMagnetEd
Joined: 5 Sep 04
Posts: 2
Credit: 9,012,838
RAC: 0
Message 56188 - Posted: 10 May 2017, 9:39:01 UTC - in response to Message 56187.  

I just got a message in my event log after several days without connecting:
"Project temporarily shut down for maintenance."

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 56189 - Posted: 10 May 2017, 14:11:49 UTC - in response to Message 56188.  

I just got a message in my event log after several days without connecting:
"Project temporarily shut down for maintenance."


The project is undergoing a file system upgrade today. It should be finished by tomorrow.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 56190 - Posted: 10 May 2017, 16:20:11 UTC

The maintenance is larger than CPDN. It involves U.Oxford IT. Latest from CPDN staff:

Hi All,

Update on this work: most of the work has now been completed; however, the cluster NFS service is still considered at risk. CPDN is the main user of this service. This service will be checked and tested tomorrow morning, so we will be keeping the project offline until this service is verified and considered to be not at risk.

With regards,

Andy

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56199 - Posted: 11 May 2017, 13:45:51 UTC

Latest update: Presumably someone is now free to work on the stuck uploads problem.

Hi All,

We have now been informed by OeRC IT Support that it is now safe to restart the services of the project again. So the project is now back online and the uploads that are set to go to Oxford have now been re-diverted to go to Oxford again.

With regards,

Andy

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56203 - Posted: 12 May 2017, 6:18:31 UTC - in response to Message 56199.  
Last modified: 12 May 2017, 6:40:03 UTC

Oops!

Stuck downloads.

Edit: I notice that the executables download on a fresh install and it is only the batch/task-specific files that get stuck, which would support the DNS issue theory.

Have sent another email to project.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 56206 - Posted: 12 May 2017, 11:43:59 UTC

And my stalled downloads are now no longer stalled. Whoop Whoop. Should be some more work later today too.