climateprediction.net (CPDN) home page
Thread 'PNW upload issues'

MarkJ
Joined: 28 Mar 09
Posts: 126
Credit: 9,825,980
RAC: 0
Message 52650 - Posted: 2 Oct 2015, 10:03:11 UTC

Is anyone else having issues uploading? In the last couple of days I have seen all my PNW zip files stuck trying to upload. Between 6 machines I have over 180 zip files that don't seem to want to start their transfers. BOINC is saying they are Transient HTTP errors.

Internet access seems fine, even to the CPDN web site and scheduler requests seem to work. I've tried resetting the router and flushing DNS cache.

They are all the recent pnw_x??? series work units downloaded in the last week or so. Machines are all Win10 x64 and running BOINC 7.6.9.
BOINC blog
ID: 52650
MarkJ
Joined: 28 Mar 09
Posts: 126
Credit: 9,825,980
RAC: 0
Message 52653 - Posted: 2 Oct 2015, 11:50:18 UTC
Last modified: 2 Oct 2015, 12:21:41 UTC

And the http_debug output...


climateprediction.net 02-10-2015 09:38 PM [http] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files\BOINC\ca-bundle.crt'
climateprediction.net 02-10-2015 09:38 PM [http] HTTP_OP::libcurl_exec(): ca-bundle set
climateprediction.net 02-10-2015 09:38 PM Started upload of hadam3p_pnw_xfiq_2002_1_010348924_0_10.zip
climateprediction.net 02-10-2015 09:38 PM [http] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files\BOINC\ca-bundle.crt'
climateprediction.net 02-10-2015 09:38 PM [http] HTTP_OP::libcurl_exec(): ca-bundle set
climateprediction.net 02-10-2015 09:38 PM Started upload of hadam3p_pnw_xfiq_2002_1_010348924_0_11.zip
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: timeout on name lookup is not supported
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: Hostname was NOT found in DNS cache
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: Trying 128.193.64.193...
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: Found bundle for host boinc1.coas.oregonstate.edu: 0x2696e20
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: timeout on name lookup is not supported
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: Hostname was found in DNS cache
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: Trying 128.193.64.193...
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: connect to 128.193.64.193 port 80 failed: Timed out
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: Failed to connect to boinc1.coas.oregonstate.edu port 80: Timed out
climateprediction.net 02-10-2015 09:38 PM [http] [ID#26] Info: Closing connection 25
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: connect to 128.193.64.193 port 80 failed: Timed out
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: Failed to connect to boinc1.coas.oregonstate.edu port 80: Timed out
climateprediction.net 02-10-2015 09:38 PM [http] [ID#27] Info: Closing connection 26
climateprediction.net 02-10-2015 09:38 PM [http] HTTP error: Couldn't connect to server
climateprediction.net 02-10-2015 09:38 PM [http] HTTP error: Couldn't connect to server

And then it happily connects to Google as the reference site.

It seems it's trying to connect to boinc1.coas.oregonstate.edu and timing out.

ping
C:\windows\system32>ping boinc1.coas.oregonstate.edu

Pinging maui.oce.orst.edu [128.193.64.193] with 32 bytes of data:
Reply from 128.193.88.130: Destination host unreachable.
Reply from 128.193.88.130: Destination host unreachable.
Request timed out.
Reply from 128.193.88.130: Destination host unreachable.

Ping statistics for 128.193.64.193:
Packets: Sent = 4, Received = 3, Lost = 1 (25% loss),
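The ping result above only shows that ICMP is not getting through; the uploads themselves use TCP port 80, as the http_debug lines show. A minimal sketch of checking that path directly, assuming Python 3 is available on the machine (the function name `can_connect` is hypothetical, not part of BOINC):

```python
import socket

def can_connect(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, unreachable hosts and timeouts.
        return False
```

Calling `can_connect("boinc1.coas.oregonstate.edu")` during the outage would be expected to return False, matching the "Failed to connect ... port 80: Timed out" lines, while the same call against the reference site would return True.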
BOINC blog
ID: 52653
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52654 - Posted: 2 Oct 2015, 12:00:09 UTC - in response to Message 52653.  

Server problems. It must be Friday! Linux boxes don't run those tasks and the ones I have all go to different servers from the pnw ones. I don't know about the Oregon servers but the Oxford ones don't get fixed at weekends!

Incidentally, while checking, I noticed that credit hasn't gone up for a few days and that is Oxford!
ID: 52654
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 52655 - Posted: 2 Oct 2015, 13:10:37 UTC

I have the same problem. One zip file stuck in the transfer tab. It is Friday, isn't it?

10/2/2015 8:16:25 AM | climateprediction.net | Started upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip
10/2/2015 8:16:47 AM | climateprediction.net | Temporarily failed upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip: connect() failed
10/2/2015 8:16:47 AM | climateprediction.net | Backing off 01:07:25 on upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip
10/2/2015 8:16:49 AM | | Project communication failed: attempting access to reference site
10/2/2015 8:16:50 AM | | Internet access OK - project servers may be temporarily down.
10/2/2015 9:01:34 AM | climateprediction.net | Started upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip
10/2/2015 9:01:56 AM | climateprediction.net | Temporarily failed upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip: connect() failed
10/2/2015 9:01:56 AM | climateprediction.net | Backing off 04:15:25 on upload of hadam3p_pnw_feiu_2012_1_010057325_2_1.zip
10/2/2015 9:01:59 AM | | Project communication failed: attempting access to reference site
10/2/2015 9:02:00 AM | | Internet access OK - project servers may be temporarily down.
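With many stuck files, it can help to pull the "Backing off" intervals out of a saved event log rather than reading them by eye. A rough sketch, assuming the log has been copied into a string and that lines follow the "timestamp | project | message" layout shown above (the function name `summarize_backoffs` is made up for illustration):

```python
import re
from collections import defaultdict
from datetime import timedelta

# Matches messages such as "Backing off 01:07:25 on upload of some_file.zip"
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")

def summarize_backoffs(log_text: str) -> dict[str, list[timedelta]]:
    """Map each file name to the backoff intervals seen in the log, in order."""
    backoffs = defaultdict(list)
    for line in log_text.splitlines():
        # The message is the last "|"-separated field of an event log line.
        message = line.rsplit("|", 1)[-1].strip()
        match = BACKOFF_RE.search(message)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups()[:3])
            interval = timedelta(hours=hours, minutes=minutes, seconds=seconds)
            backoffs[match.group(4)].append(interval)
    return dict(backoffs)
```

Run over the excerpt above, this would list two intervals for the one stuck file, 1:07:25 followed by 4:15:25.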

And the credits do seem to be stuck since yesterday.
ID: 52655
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 52659 - Posted: 2 Oct 2015, 21:37:18 UTC

One flavor of PNW tasks had upload problems: "x" series. Oxford was notified yesterday and uploads are now in progress.

On the off-topic issue, have you checked credits recently?
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 52659
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 52660 - Posted: 2 Oct 2015, 22:02:53 UTC

The stuck hadam3p_pnw zip files have started uploading.
ID: 52660
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 52662 - Posted: 3 Oct 2015, 1:55:44 UTC

All my backed up zip files have now cleared.
ID: 52662
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52663 - Posted: 3 Oct 2015, 6:37:50 UTC

On the off-topic issue, credits have gone up. I don't know if the increased allocation for WAH tasks is there yet, as my Linux box doesn't run them.
ID: 52663
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 52721 - Posted: 26 Oct 2015, 18:32:48 UTC

Message from Prof. David Wallom:

We have now restarted the majority of services within the CPDN project including all central BOINC services. We are restarting the local (Oxford) upload server shortly in a manner that will allow us to control load into the main infrastructure, so those waiting to upload there, please be patient.


Everything is in the green but none of my 413 queued .zip files managed to get through the load -- yet. The operative part of the message is services are restored but PATIENCE is required.

'.zip' files sent to Oregon State U. and tasks marked complete 'reported' this morning (GMT-8 time zone). It's a start . . .
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 52721
BelgianEnthousiast
Joined: 29 Apr 15
Posts: 9
Credit: 2,174,905
RAC: 0
Message 52728 - Posted: 27 Oct 2015, 8:30:42 UTC

Hi everybody,
I also have 7 WUs ready and awaiting upload:
27/10/2015 7:17:01 | climateprediction.net | Started upload of hadam3p_eu_f3pr_1995_1_010196707_2_11.zip
27/10/2015 7:17:01 | climateprediction.net | Started upload of hadam3p_eu_f3pr_1995_1_010196707_2_12.zip
27/10/2015 7:17:23 | climateprediction.net | Temporarily failed upload of hadam3p_eu_f3pr_1995_1_010196707_2_11.zip: connect() failed
27/10/2015 7:17:23 | climateprediction.net | Backing off 03:27:27 on upload of hadam3p_eu_f3pr_1995_1_010196707_2_11.zip
27/10/2015 7:17:23 | climateprediction.net | Temporarily failed upload of hadam3p_eu_f3pr_1995_1_010196707_2_12.zip: connect() failed
27/10/2015 7:17:23 | climateprediction.net | Backing off 03:04:54 on upload of hadam3p_eu_f3pr_1995_1_010196707_2_12.zip
27/10/2015 7:17:26 | | Project communication failed: attempting access to reference site
27/10/2015 7:17:27 | | Internet access OK - project servers may be temporarily down.
I have had this since 14/10. (I'm aware that the project servers were down for a while, but they seem to be up and running again now.)
I'm not able to download new WUs either.

Is it possible to investigate both issues, please?

Many thanks in advance!

K.
ID: 52728
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52729 - Posted: 27 Oct 2015, 8:48:41 UTC - in response to Message 52728.  

I downloaded two new workunits yesterday. The server status page http://climateapps2.oerc.ox.ac.uk/cpdnboinc/server_status.html shows that the only tasks available at the moment are for Linux. I suspect that the only problem with sending zips is that the servers are really getting hammered. I have something like 2GB waiting to go between my two machines. I have suspended network activity after downloading some more work and, unless I see signs that things are moving, won't try again till tomorrow. I don't really know if there are enough of us making our machines wait to make a difference, but...
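For anyone who prefers a script to the BOINC Manager menu, network activity can also be suspended from the command line with boinccmd's `--set_network_mode` option. A small sketch, assuming boinccmd is on the PATH and is allowed to talk to the local client (for example, run from the BOINC data directory); the two helper function names are hypothetical:

```python
import subprocess

def network_mode_command(mode: str, duration_s: int = 0) -> list[str]:
    """Build the boinccmd argument list for changing the client's network mode."""
    if mode not in ("always", "auto", "never"):
        raise ValueError(f"unsupported network mode: {mode!r}")
    cmd = ["boinccmd", "--set_network_mode", mode]
    if duration_s > 0:
        # Optional duration in seconds; without it the mode stays until changed.
        cmd.append(str(duration_s))
    return cmd

def set_network_mode(mode: str, duration_s: int = 0) -> None:
    """Run boinccmd; raises CalledProcessError if the command fails."""
    subprocess.run(network_mode_command(mode, duration_s), check=True)
```

For example, `set_network_mode("never", 24 * 3600)` would hold uploads for about a day, and `set_network_mode("auto")` would hand control back to the client's preferences.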
ID: 52729
ruffieux
Joined: 21 Oct 07
Posts: 8
Credit: 12,864,838
RAC: 399
Message 52732 - Posted: 27 Oct 2015, 9:14:33 UTC - in response to Message 52728.  

I do have a problem with the upload as well (running on Linux). Is there a solution available already?
ID: 52732
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52733 - Posted: 27 Oct 2015, 11:32:46 UTC - in response to Message 52732.  

I do have a problem with the upload as well (running on Linux). Is there a solution available already?


The solution is to wait until the vast number of crunchers who set and forget have uploaded their backlog and the number of requests to upload work reduces to something within the bandwidth the servers can cope with.
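The reasoning behind waiting is that a recovering server copes better when clients spread their retries out instead of all retrying at once. The sketch below illustrates that general idea with randomized exponential backoff ("full jitter"); it is an illustration only, not BOINC's actual retry algorithm, and the function name, base delay and cap are all assumed values:

```python
import random

def backoff_delays(attempts: int, base: float = 60.0,
                   cap: float = 6 * 3600.0, rng=None) -> list[float]:
    """Return one randomized delay in seconds per failed attempt.

    Delay n is drawn uniformly from [0, min(cap, base * 2**n)], so the
    ceiling doubles after each failure until it reaches the cap.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        ceiling = min(cap, base * (2 ** n))
        delays.append(rng.uniform(0.0, ceiling))
    return delays
```

Because each client draws its own random delay, a crowd of machines with queued zip files ends up retrying at scattered times rather than in synchronized waves.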
ID: 52733
Dr_Mabuse
Joined: 21 Feb 05
Posts: 24
Credit: 991,032
RAC: 0
Message 52734 - Posted: 27 Oct 2015, 11:46:08 UTC

I couldn't upload results for more than 1 week.
These are the error messages:
27.10.2015 12:43:30 | climateprediction.net | Started upload of hadam3p_pnw_g7xd_2015_0_010298312_0_6.zip
27.10.2015 12:43:53 | climateprediction.net | Temporarily failed upload of hadam3p_pnw_g7xd_2015_0_010298312_0_6.zip: connect() failed
27.10.2015 12:43:53 | climateprediction.net | Backing off 05:59:08 on upload of hadam3p_pnw_g7xd_2015_0_010298312_0_6.zip

Please have a look and give some advice!
thanks
Jochen
*** Since I'm a fool I proved that the system is not foolproof ;-) ***
ID: 52734
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52736 - Posted: 27 Oct 2015, 12:41:24 UTC - in response to Message 52734.  

I couldn't upload results for more than 1 week.


That is because the servers have been down for more than a week.
ID: 52736
Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 52737 - Posted: 27 Oct 2015, 14:35:36 UTC

My .zip files for hadam3p_afr_e1uy_2011 just started to upload at 27 Oct 2015, 14:12:19 UTC.
ID: 52737
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4538
Credit: 19,008,987
RAC: 21,524
Message 52738 - Posted: 27 Oct 2015, 14:45:38 UTC - in response to Message 52737.  

I think I will still wait till tomorrow before trying on the laptop, which has over 1.5GB worth of zip files to send.
ID: 52738
Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 52739 - Posted: 27 Oct 2015, 17:00:28 UTC
Last modified: 27 Oct 2015, 17:44:17 UTC

Oops, I think I spoke too soon.
The following BOINC log events are as of 16:53 UTC:



10/27/2015 9:33:42 AM | climateprediction.net | Temporarily failed upload of hadam3p_afr_e1uy_2011_1_010355437_0_3.zip: transient HTTP error
10/27/2015 9:33:42 AM | climateprediction.net | Backing off 03:43:10 on upload of hadam3p_afr_e1uy_2011_1_010355437_0_3.zip
10/27/2015 9:33:43 AM | | Internet access OK - project servers may be temporarily down.
10/27/2015 9:38:13 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_f4b0_1996_1_010197472_1_9.zip: transient HTTP error
10/27/2015 9:38:13 AM | climateprediction.net | Backing off 04:21:34 on upload of hadam3p_eu_f4b0_1996_1_010197472_1_9.zip
10/27/2015 9:38:14 AM | | Project communication failed: attempting access to reference site
10/27/2015 9:38:18 AM | | Internet access OK - project servers may be temporarily down.

The server must be getting hammered with all the uploads,
so I think I will suspend my network activity to give the servers a break.
ID: 52739
Kevin
Joined: 5 Jul 09
Posts: 63
Credit: 6,091,274
RAC: 0
Message 52740 - Posted: 27 Oct 2015, 17:07:57 UTC

I only had 10 zips waiting. They went sometime this afternoon, and both tasks have reported as completed.



ID: 52740
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 52741 - Posted: 27 Oct 2015, 17:10:04 UTC

The servers have been back online for a couple of days now, but I still cannot get even a single byte uploaded to them. All connections time out.
ID: 52741