Thread 'Batch 1005 WAH2 NZ region'

Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 70478 - Posted: 20 Feb 2024, 12:49:14 UTC - in response to Message 70465.  

I will change the configuration for the WaH workunits before any more batches go out. I'll raise the rsc_disk_bound by 20%, which should be enough to avoid hitting the limit.

I suspect this is a problem now because CPDN have gradually been increasing the domain sizes for the regional models, making the output files bigger.

I would suspend computation for the NZ tasks till zips clear. If you run out of work, you can turn them back on long enough to get more work then suspend them again.
Thanks, Dave. Paused it. New task started, all others are EAS25

---
CPDN Visiting Scientist
ID: 70478
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 70559 - Posted: 26 Feb 2024, 8:30:26 UTC

I still can't upload the NZ WU.

26/02/2024 10:14:32 | climateprediction.net | [http] [ID#68359] Info:  processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
26/02/2024 10:14:32 | climateprediction.net | [http] [ID#68360] Info:  processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info:  Found bundle for host: 0x1f2939bf8d0 [serially]
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info:  Connection #11805 is still name resolving, can't reuse
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68359] Info:    Trying 192.171.169.187:80...
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info:  Hostname 'upload11.cpdn.org' was found in DNS cache
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info:    Trying 192.171.169.187:80...
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info:  connect to 192.171.169.187 port 80 failed: Timed out
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info:  Failed to connect to upload11.cpdn.org port 80 after 21237 ms: Couldn't connect to server
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info:  Closing connection
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info:  connect to 192.171.169.187 port 80 failed: Timed out
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info:  Failed to connect to upload11.cpdn.org port 80 after 21236 ms: Couldn't connect to server
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info:  Closing connection
26/02/2024 10:14:55 | climateprediction.net | [http] HTTP error: Timeout was reached
26/02/2024 10:14:55 | climateprediction.net | [http] HTTP error: Timeout was reached
26/02/2024 10:14:55 | climateprediction.net | Backing off 03:19:29 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_10.zip
26/02/2024 10:14:55 | climateprediction.net | Backing off 05:00:23 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_11.zip
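For anyone with many stuck zips, a minimal sketch of pulling the back-off times out of an event log like the one above. It assumes the lines keep exactly the "Backing off HH:MM:SS on upload of <file>" wording seen here; the function name `parse_backoffs` is made up for illustration and is not part of BOINC.

```python
import re
from datetime import timedelta

# Matches event-log lines of the form shown above:
#   "... | Backing off 03:19:29 on upload of <file>"
# The exact wording is taken from the pasted log; other client versions may differ.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")

SAMPLE_LOG = """\
26/02/2024 10:14:55 | climateprediction.net | [http] HTTP error: Timeout was reached
26/02/2024 10:14:55 | climateprediction.net | Backing off 03:19:29 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_10.zip
26/02/2024 10:14:55 | climateprediction.net | Backing off 05:00:23 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_11.zip
"""


def parse_backoffs(log_text):
    """Return {filename: timedelta} for every back-off line in the log text."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            backoffs[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


# Print each stuck file with the delay before its next retry.
for name, delay in parse_backoffs(SAMPLE_LOG).items():
    print(name, delay)
```

If the same file appears more than once, the last back-off line wins, which is usually the one that matters.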
ID: 70559
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 70560 - Posted: 26 Feb 2024, 9:02:25 UTC - in response to Message 70559.  
Last modified: 26 Feb 2024, 9:29:01 UTC

I will email Andy again.

Done. I don't recall getting an answer last time so perhaps he missed it.
ID: 70560
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 70564 - Posted: 27 Feb 2024, 21:37:07 UTC - in response to Message 70560.  

Andy tells me there have been issues with the JASMIN server. However, he didn't explicitly say that they have been resolved. Please let me know if these persist.
ID: 70564
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 70566 - Posted: 28 Feb 2024, 9:32:42 UTC - in response to Message 70564.  

Yes, the hardware issues at the JASMIN upload server have been resolved, according to their last email.
---
CPDN Visiting Scientist
ID: 70566
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 70569 - Posted: 28 Feb 2024, 14:55:12 UTC - in response to Message 70566.  

Yes, the hardware issues at the JASMIN upload server have been resolved, according to their last email.

I still get the same message from upload11 as pasted below. Transfers are backing off.
ID: 70569
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 70616 - Posted: 6 Mar 2024, 16:07:18 UTC - in response to Message 70569.  

Hi there,

I'm still having problems uploading the NZ unit:
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  Connection 1680 seems to be dead
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  Closing connection
06/03/2024 18:03:43 |  | [http] [ID#0] Info:    Trying 142.250.184.132:443...
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  Connected to www.google.com (142.250.184.132) port 443
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  schannel: disabled automatic use of client certificate
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  ALPN: offers http/1.1
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  ALPN: server accepted http/1.1
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  using HTTP/1.1
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: GET / HTTP/1.1
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: Host: www.google.com
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.24.1)
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: Accept: */*
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: Accept-Encoding: deflate, gzip
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: Accept-Language: en_GB
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: roject_name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_1.zip</name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <nbytes>90462248.000000</nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <status>1</status>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <num_retries>107</num_retries>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <first_request_time>1707778812.154770</first_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <next_request_time>0.000000</next_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <time_so_far>2397.318125</time_so_far>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <is_upload>1</is_upload>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     </persistent_file_xfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: </file_transfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: <file_transfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <project_url>https://climateprediction.net/</project_url>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <project_name>climateprediction.net</project_name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_2.zip</name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <nbytes>90443142.000000</nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <status>1</status>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <num_retries>100</num_retries>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <first_request_time>1707820366.499971</first_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <next_request_time>1709754754.739190</next_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <time_so_far>2214.724501</time_so_far>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <is_upload>1</is_upload>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     </persistent_file_xfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: </file_transfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server: <file_transfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <project_url>https://climateprediction.net/</project_url>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <project_name>climateprediction.net</project_name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_3.zip</name>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <nbytes>90381653.000000</nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <status>1</status>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <num_retries>95</num_retries>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <first_request_time>1707862198.360235</first_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <next_request_time>1709741916.170670</next_request_time>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <time_so_far>2150.490557</time_so_far>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:         <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 |  | [http] [ID#0] Sent header to server:
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Date: Wed, 06 Mar 2024 16:03:43 GMT
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Expires: -1
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Cache-Control: private, max-age=0
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Content-Type: text/html; charset=ISO-8859-1
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-iluVkURiYRTuWjkJQq8qLA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Content-Encoding: gzip
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Server: gws
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: X-XSS-Protection: 0
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: X-Frame-Options: SAMEORIGIN
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Set-Cookie: SOCS=CAAaBgiA5J6vBg; expires=Sat, 05-Apr-2025 16:03:43 GMT; path=/; domain=.google.com; Secure; SameSite=lax
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Set-Cookie: AEC=Ae3NU9OtMv-AT4puuLtG9-hDGjFWyvGkxmVyNbtg8lB2WvQe8RS7H3wrJVU; expires=Mon, 02-Sep-2024 16:03:43 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Set-Cookie: __Secure-ENID=18.SE=PWV92Q812eZlxzZwIJlCCNHWpH-aP7jKE_OSUX6EKvj4cbJdUJhhOwU0DallQGiVyf8L5NElF8n0FwG6HNWuKrCeu3jBsS6HmUrMtXXsWK_BD-DSoida2VoczQev5-gCtQ3yY5f4GiKIOcNcZTAxqEalZ0rnULkORie0g3XHpCUj3CSO-w; expires=Sun, 06-Apr-2025 08:22:01 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: Transfer-Encoding: chunked
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server:
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: 00000001
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: 
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: 01
06/03/2024 18:03:43 |  | 
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: 12b3
06/03/2024 18:03:43 |  | [http] [ID#0] Received header from server: 
06/03/2024 18:03:43 |  | [http] [ID#0] Info:  Connection #1683 to host www.google.com left intact
06/03/2024 18:03:44 |  | Internet access OK - project servers may be temporarily down.
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info:  processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info:  Hostname upload11.cpdn.org was found in DNS cache
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info:    Trying 192.171.169.187:80...
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info:  connect to 192.171.169.187 port 80 failed: Timed out
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info:  Failed to connect to upload11.cpdn.org port 80 after 21073 ms: Couldn't connect to server
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info:  Closing connection
06/03/2024 18:04:05 | climateprediction.net | [http] HTTP error: Timeout was reached
06/03/2024 18:04:05 | climateprediction.net | Backing off 03:20:14 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_1.zip
06/03/2024 18:04:06 |  | Project communication failed: attempting access to reference site


Any suggestions? Should I abort? I have 17 zips so far, and the WU is paused
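
The `first_request_time` values in the log above look like Unix epoch seconds (an assumption; the log itself doesn't say so). A minimal sketch of converting one to a readable date, to see how long a file has been retrying. The "logged at" time is taken from the `Date:` response header in the same log, and the helper name `epoch_to_utc` is just for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# first_request_time of the ..._1.zip transfer, copied from the log above.
# Assumed to be epoch seconds.
first_request = epoch_to_utc(1707778812.154770)

# Time of the log entry, from "Date: Wed, 06 Mar 2024 16:03:43 GMT" in the same log.
logged_at = datetime(2024, 3, 6, 16, 3, 43, tzinfo=timezone.utc)

print("first attempt:", first_request.isoformat())
print("retrying for :", logged_at - first_request)
```

On these numbers the first attempt falls on 12 Feb 2024, so that zip had been retrying for a little over three weeks by the time of the log.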
ID: 70616
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 70619 - Posted: 6 Mar 2024, 18:10:25 UTC - in response to Message 70616.  
Last modified: 6 Mar 2024, 18:12:44 UTC

Any suggestions? Should I abort? I have 17 zips so far, and the WU is paused

I will send another email Andy's way.

Done.
ID: 70619
Alan K
Posts: 491
Credit: 30,980,040
RAC: 14,224
Message 70623 - Posted: 6 Mar 2024, 23:24:37 UTC

Mine seem to be going OK:-

06/03/2024 21:08:20 | climateprediction.net | Started upload of wah2_nz25_n316_201205_25_1005_012258088_0_r1313910418_19.zip
06/03/2024 21:08:20 | climateprediction.net | [file_xfer] URL: http://upload11.cpdn.org/cgi-bin/file_upload_handler
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] http op done; retval 0 (Success)
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] parsing upload response: <data_server_reply> <status>0</status> <file_size>0</file_size></data_server_reply>
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] parsing status: 0
06/03/2024 21:08:22 | climateprediction.net | [fxd] starting upload, upload_offset 0
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] http op done; retval 0 (Success)
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] parsing upload response: <data_server_reply> <status>0</status></data_server_reply>
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] parsing status: 0
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] file transfer status 0 (Success)
06/03/2024 21:08:57 | climateprediction.net | Finished upload of wah2_nz25_n316_201205_25_1005_012258088_0_r1313910418_19.zip (90557393 bytes)
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] Throughput 2577771 bytes/sec
ID: 70623
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 70624 - Posted: 7 Mar 2024, 8:14:35 UTC - in response to Message 70623.  
Last modified: 7 Mar 2024, 8:15:08 UTC

Mine seem to be going OK:-


Not for me though. I got the same message today. I wonder if I can turn on any other log option to gather more info?
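
On the logging question, a rough sketch of generating a `cc_config.xml` with extra transfer-related log flags. The flag names (`file_xfer_debug`, `http_debug`, `http_xfer_debug`, `network_status_debug`) are as I understand them from the BOINC client configuration documentation, so please check them against the docs for your client version. The helper `build_cc_config` is hypothetical, and `ET.indent` needs Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

# Log flags to switch on. Names are assumed from the BOINC client
# configuration documentation; verify them for your client version.
FLAGS = ["file_xfer_debug", "http_debug", "http_xfer_debug", "network_status_debug"]


def build_cc_config(flags):
    """Return cc_config.xml text with each given log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Print the XML rather than writing it, so an existing file is not overwritten.
print(build_cc_config(FLAGS))
```

The output would go into `cc_config.xml` in the BOINC data directory, merged by hand with any options already there, and the client then needs to re-read its config files. The `[http]` lines already in the pasted logs suggest `http_debug` is on, so the other flags are the ones likely to add anything new.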
ID: 70624
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 70625 - Posted: 7 Mar 2024, 10:19:05 UTC - in response to Message 70624.  

There are currently no incidents reported on the JASMIN/CEDA status page, which is where upload11.cpdn.org really is:

https://www.ceda.ac.uk/status/

upload11.cpdn.org resolves ok for me: 192.171.169.187. It's either a local DNS issue or something BOINC related.
---
CPDN Visiting Scientist
ID: 70625
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,620,508
RAC: 4,981
Message 70626 - Posted: 7 Mar 2024, 13:50:44 UTC - in response to Message 70625.  

There are currently no incidents reported on the JASMIN/CEDA status page, which is where upload11.cpdn.org really is:

https://www.ceda.ac.uk/status/

upload11.cpdn.org resolves ok for me: 192.171.169.187. It's either a local DNS issue or something BOINC related.


Ping and tracert to upload11.cpdn.org, cpdn.org and climateprediction.net time out after the first hop, which is the gateway's internal IP.

Tracing route to upload11.cpdn.org [192.171.169.187]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     *        *        *     Request timed out.
  3     *        *        *     Request timed out.
  4     *        *        *     Request timed out.
  5     *        *        *     Request timed out.
  6     *        *        *     Request timed out.

I can ping and tracert other sites, though some hops show "Request timed out" after the ISP's first external server, 82-137-110-2.ip.btc-net.bg [82.137.110.2]. I'm not sure what to do. Maybe I should call the ISP and ask why I can't reach the server.

C:\Users\10>tracert google.com

Tracing route to google.com [216.58.212.14]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     1 ms    <1 ms     1 ms  82-137-110-2.ip.btc-net.bg [82.137.110.2]
  3     *        *        *     Request timed out.
  4     *        *        *     Request timed out.
  5     1 ms     1 ms     1 ms  212-39-66-222.ip.btc-net.bg [212.39.66.222]
  6     1 ms    <1 ms    <1 ms  142.251.243.215
  7     1 ms    <1 ms    <1 ms  142.250.60.19
  8    <1 ms    <1 ms    <1 ms  sof04s01-in-f14.1e100.net [216.58.212.14]

Trace complete
ID: 70626
Glenn Carver
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 70627 - Posted: 7 Mar 2024, 15:50:22 UTC - in response to Message 70626.  

Don't rely on ping & traceroute to diagnose network problems. These packets are often silently dropped by firewalls around large hosts like JASMIN/CEDA.
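
A more direct check is to open a TCP connection to the port the client actually uses (port 80 for upload11.cpdn.org, going by the logs earlier in the thread). A minimal sketch, with `can_connect` as a made-up helper name. It only shows whether the connection opens; it says nothing about whether the upload handler behind it is healthy.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False
```

Calling `can_connect("upload11.cpdn.org", 80)` from the affected machine and again from another network, such as a phone hotspot, would help separate a problem on the route from a problem at the server.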
---
CPDN Visiting Scientist
ID: 70627
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,011,472
RAC: 21,368
Message 70628 - Posted: 7 Mar 2024, 16:17:11 UTC - in response to Message 70627.  

Don't rely on ping & traceroute to diagnose network problems. These packets are often silently dropped by firewalls around large hosts like JASMIN/CEDA.


That would explain why it only works for me some of the time.
ID: 70628