Message boards : Number crunching : Batch 1005 WAH2 NZ region
Author | Message |
---|---|
Joined: 29 Oct 17 Posts: 1049 Credit: 16,490,541 RAC: 15,784 |
I will change the configuration for the WaH workunits before any more batches go out. I'll raise the rsc_disk_bound by 20%, which should be enough to avoid hitting the limit. I suspect this is a problem now because CPDN have gradually been increasing the domain sizes for the regional models, making the output files bigger. I would suspend computation for the NZ tasks till zips clear. If you run out of work, you can turn them back on long enough to get more work, then suspend them again. Thanks, Dave. Paused it. New task started, all others are EAS25 --- CPDN Visiting Scientist |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
I still can't upload the NZ WU.
26/02/2024 10:14:32 | climateprediction.net | [http] [ID#68359] Info: processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
26/02/2024 10:14:32 | climateprediction.net | [http] [ID#68360] Info: processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info: Found bundle for host: 0x1f2939bf8d0 [serially]
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info: Connection #11805 is still name resolving, can't reuse
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68359] Info: Trying 192.171.169.187:80...
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info: Hostname 'upload11.cpdn.org' was found in DNS cache
26/02/2024 10:14:33 | climateprediction.net | [http] [ID#68360] Info: Trying 192.171.169.187:80...
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info: connect to 192.171.169.187 port 80 failed: Timed out
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info: Failed to connect to upload11.cpdn.org port 80 after 21237 ms: Couldn't connect to server
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68359] Info: Closing connection
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info: connect to 192.171.169.187 port 80 failed: Timed out
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info: Failed to connect to upload11.cpdn.org port 80 after 21236 ms: Couldn't connect to server
26/02/2024 10:14:55 | climateprediction.net | [http] [ID#68360] Info: Closing connection
26/02/2024 10:14:55 | climateprediction.net | [http] HTTP error: Timeout was reached
26/02/2024 10:14:55 | climateprediction.net | [http] HTTP error: Timeout was reached
26/02/2024 10:14:55 | climateprediction.net | Backing off 03:19:29 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_10.zip
26/02/2024 10:14:55 | climateprediction.net | Backing off 05:00:23 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_11.zip |
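When many zips are queued, it can help to pull just the back-off lines out of a long event log. Below is a minimal Python sketch, assuming the log text has been copied out of the BOINC event log into a string. The helper name `parse_backoffs` is hypothetical and not part of BOINC; the pattern is based only on the "Backing off HH:MM:SS on upload of ..." lines shown above.

```python
import re
from datetime import timedelta

# Matches lines such as "... | Backing off 03:19:29 on upload of <file>".
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of\s+(\S+)")


def parse_backoffs(log_text: str) -> dict:
    """Map each file name to the most recent back-off interval seen for it."""
    backoffs = {}
    for hours, minutes, seconds, name in BACKOFF_RE.findall(log_text):
        backoffs[name] = timedelta(
            hours=int(hours), minutes=int(minutes), seconds=int(seconds)
        )
    return backoffs
```

Later lines for the same file overwrite earlier ones, so the result reflects the latest back-off per zip.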
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
I will email Andy again. Done. I don't recall getting an answer last time so perhaps he missed it. |
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Andy tells me there have been issues with the JASMIN server. Please let me know if these persist. He didn't say explicitly that they have been resolved, however. |
Joined: 29 Oct 17 Posts: 1049 Credit: 16,490,541 RAC: 15,784 |
Yes, the hardware issues at the JASMIN upload server have been resolved, according to their last email. --- CPDN Visiting Scientist |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Yes hardware issues at the JASMIN upload have been resolved according to their last email. I still get the same message from upload11 as pasted below. Transfers are backing off. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Hi there, I'm still having problems uploading the NZ unit
06/03/2024 18:03:43 | | [http] [ID#0] Info: Connection 1680 seems to be dead
06/03/2024 18:03:43 | | [http] [ID#0] Info: Closing connection
06/03/2024 18:03:43 | | [http] [ID#0] Info: Trying 142.250.184.132:443...
06/03/2024 18:03:43 | | [http] [ID#0] Info: Connected to www.google.com (142.250.184.132) port 443
06/03/2024 18:03:43 | | [http] [ID#0] Info: schannel: disabled automatic use of client certificate
06/03/2024 18:03:43 | | [http] [ID#0] Info: ALPN: offers http/1.1
06/03/2024 18:03:43 | | [http] [ID#0] Info: ALPN: server accepted http/1.1
06/03/2024 18:03:43 | | [http] [ID#0] Info: using HTTP/1.1
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: GET / HTTP/1.1
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: Host: www.google.com
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.24.1)
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: Accept: */*
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: Accept-Encoding: deflate, gzip
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: Accept-Language: en_GB
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server:
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: roject_name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_1.zip</name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <nbytes>90462248.000000</nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <status>1</status>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <persistent_file_xfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <num_retries>107</num_retries>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <first_request_time>1707778812.154770</first_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <next_request_time>0.000000</next_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <time_so_far>2397.318125</time_so_far>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <is_upload>1</is_upload>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: </persistent_file_xfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: </file_transfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <file_transfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <project_url>https://climateprediction.net/</project_url>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <project_name>climateprediction.net</project_name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_2.zip</name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <nbytes>90443142.000000</nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <status>1</status>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <persistent_file_xfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <num_retries>100</num_retries>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <first_request_time>1707820366.499971</first_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <next_request_time>1709754754.739190</next_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <time_so_far>2214.724501</time_so_far>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <is_upload>1</is_upload>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: </persistent_file_xfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: </file_transfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <file_transfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <project_url>https://climateprediction.net/</project_url>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <project_name>climateprediction.net</project_name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <name>wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_3.zip</name>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <nbytes>90381653.000000</nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <max_nbytes>150000000.000000</max_nbytes>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <status>1</status>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <persistent_file_xfer>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <num_retries>95</num_retries>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <first_request_time>1707862198.360235</first_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <next_request_time>1709741916.170670</next_request_time>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <time_so_far>2150.490557</time_so_far>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server: <last_bytes_xferred>0.000000</last_bytes_xferred>
06/03/2024 18:03:43 | | [http] [ID#0] Sent header to server:
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Date: Wed, 06 Mar 2024 16:03:43 GMT
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Expires: -1
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Cache-Control: private, max-age=0
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Content-Type: text/html; charset=ISO-8859-1
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-iluVkURiYRTuWjkJQq8qLA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Content-Encoding: gzip
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Server: gws
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: X-XSS-Protection: 0
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: X-Frame-Options: SAMEORIGIN
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Set-Cookie: SOCS=CAAaBgiA5J6vBg; expires=Sat, 05-Apr-2025 16:03:43 GMT; path=/; domain=.google.com; Secure; SameSite=lax
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Set-Cookie: AEC=Ae3NU9OtMv-AT4puuLtG9-hDGjFWyvGkxmVyNbtg8lB2WvQe8RS7H3wrJVU; expires=Mon, 02-Sep-2024 16:03:43 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Set-Cookie: __Secure-ENID=18.SE=PWV92Q812eZlxzZwIJlCCNHWpH-aP7jKE_OSUX6EKvj4cbJdUJhhOwU0DallQGiVyf8L5NElF8n0FwG6HNWuKrCeu3jBsS6HmUrMtXXsWK_BD-DSoida2VoczQev5-gCtQ3yY5f4GiKIOcNcZTAxqEalZ0rnULkORie0g3XHpCUj3CSO-w; expires=Sun, 06-Apr-2025 08:22:01 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: Transfer-Encoding: chunked
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server:
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: 00000001
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server:
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: 01
06/03/2024 18:03:43 | |
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server: 12b3
06/03/2024 18:03:43 | | [http] [ID#0] Received header from server:
06/03/2024 18:03:43 | | [http] [ID#0] Info: Connection #1683 to host www.google.com left intact
06/03/2024 18:03:44 | | Internet access OK - project servers may be temporarily down.
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info: processing: http://upload11.cpdn.org/cgi-bin/file_upload_handler
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info: Hostname upload11.cpdn.org was found in DNS cache
06/03/2024 18:03:44 | climateprediction.net | [http] [ID#9812] Info: Trying 192.171.169.187:80...
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info: connect to 192.171.169.187 port 80 failed: Timed out
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info: Failed to connect to upload11.cpdn.org port 80 after 21073 ms: Couldn't connect to server
06/03/2024 18:04:05 | climateprediction.net | [http] [ID#9812] Info: Closing connection
06/03/2024 18:04:05 | climateprediction.net | [http] HTTP error: Timeout was reached
06/03/2024 18:04:05 | climateprediction.net | Backing off 03:20:14 on upload of wah2_nz25_n0a5_198805_25_1005_012254523_1_r1351571623_1.zip
06/03/2024 18:04:06 | | Project communication failed: attempting access to reference site
Any suggestions? Should I abort? I have 17 zips so far, and the WU is paused |
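The `<first_request_time>` values in the log above are Unix timestamps, so they show when each zip was first queued for upload. Below is a minimal Python sketch that converts them to UTC. The helper name `epoch_to_utc` is hypothetical; the three values are copied from the log.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: float) -> str:
    """Format a Unix timestamp as an ISO 8601 UTC string (whole seconds)."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


# Values copied from the <first_request_time> fields in the log above.
for stamp in (1707778812.154770, 1707820366.499971, 1707862198.360235):
    print(epoch_to_utc(stamp))
```

The first value works out to 12 February 2024, so by the time of this post (6 March) the oldest zip had been waiting a little over three weeks, which fits the retry counts of around 100.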
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Any suggestions? Should I abort? I have 17 zips so far, and the WU is paused. I will send another email Andy's way. Done. |
Joined: 22 Feb 06 Posts: 491 Credit: 31,038,916 RAC: 14,611 |
Mine seem to be going OK:-
06/03/2024 21:08:20 | climateprediction.net | Started upload of wah2_nz25_n316_201205_25_1005_012258088_0_r1313910418_19.zip
06/03/2024 21:08:20 | climateprediction.net | [file_xfer] URL: http://upload11.cpdn.org/cgi-bin/file_upload_handler
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] http op done; retval 0 (Success)
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] parsing upload response: <data_server_reply> <status>0</status> <file_size>0</file_size></data_server_reply>
06/03/2024 21:08:22 | climateprediction.net | [file_xfer] parsing status: 0
06/03/2024 21:08:22 | climateprediction.net | [fxd] starting upload, upload_offset 0
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] http op done; retval 0 (Success)
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] parsing upload response: <data_server_reply> <status>0</status></data_server_reply>
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] parsing status: 0
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] file transfer status 0 (Success)
06/03/2024 21:08:57 | climateprediction.net | Finished upload of wah2_nz25_n316_201205_25_1005_012258088_0_r1313910418_19.zip (90557393 bytes)
06/03/2024 21:08:57 | climateprediction.net | [file_xfer] Throughput 2577771 bytes/sec |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Mine seem to be going OK:- Not for me though. The same message today. I wonder if I can turn on any other log option to gather more info? |
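On the question of extra logging: the BOINC client reads its log flags from `cc_config.xml` in the BOINC data directory. Below is a minimal Python sketch that builds such a file with the transfer-related flags switched on. The flag names are the ones I understand the BOINC client configuration documentation to list; `http_debug` appears to be on already, given the `[http]` lines posted earlier. The helper name `build_cc_config` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build a cc_config.xml document that enables the given BOINC log flags."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each enabled flag is an element whose text is 1.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# Transfer-related flags; names assumed from the BOINC client configuration docs.
TRANSFER_FLAGS = [
    "http_debug",
    "http_xfer_debug",
    "file_xfer_debug",
    "network_status_debug",
]

print(build_cc_config(TRANSFER_FLAGS))
```

If a `cc_config.xml` already exists, the new flags should be merged into its `<log_flags>` section rather than overwriting the file. As far as I know, the client picks up the change after "Read config files" in the Manager's Options menu, or after a restart.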
Joined: 29 Oct 17 Posts: 1049 Credit: 16,490,541 RAC: 15,784 |
There are currently no incidents reported on the status page for JASMIN/CEDA, which is where upload11.cpdn.org really is (see https://www.ceda.ac.uk/status/). upload11.cpdn.org resolves OK for me: 192.171.169.187. It's either a local DNS issue or something BOINC-related. --- CPDN Visiting Scientist |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
There are currently no incidents reported on the JASMIN/CEDA status page, which is where upload11.cpdn.org really is:
Ping and tracert to upload11.cpdn.org, cpdn.org and climateprediction.net time out right after the first hop, which is the gateway's internal IP.
Tracing route to upload11.cpdn.org [192.171.169.187] over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.1.1
2 * * * Request timed out.
3 * * * Request timed out.
4 * * * Request timed out.
5 * * * Request timed out.
6 * * * Request timed out.
I can ping and tracert other sites, though some hops show "Request timed out" after the ISP's first external server, 82-137-110-2.ip.btc-net.bg [82.137.110.2]. Not sure what to do. Maybe I should call the ISP and ask why I can't see the server.
C:\Users\10>tracert google.com
Tracing route to google.com [216.58.212.14] over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.1.1
2 1 ms <1 ms 1 ms 82-137-110-2.ip.btc-net.bg [82.137.110.2]
3 * * * Request timed out.
4 * * * Request timed out.
5 1 ms 1 ms 1 ms 212-39-66-222.ip.btc-net.bg [212.39.66.222]
6 1 ms <1 ms <1 ms 142.251.243.215
7 1 ms <1 ms <1 ms 142.250.60.19
8 <1 ms <1 ms <1 ms sof04s01-in-f14.1e100.net [216.58.212.14]
Trace complete. |
Joined: 29 Oct 17 Posts: 1049 Credit: 16,490,541 RAC: 15,784 |
Don't rely on ping & traceroute to diagnose network problems. These packets are often silently dropped by firewalls around large hosts like JASMIN/CEDA. --- CPDN Visiting Scientist |
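Because ICMP can be filtered, a more direct check is to attempt the same TCP connection the BOINC client makes. Below is a minimal Python sketch, assuming Python 3 is available on the affected machine. The helper name `can_connect` is hypothetical; the host and port to test are the ones from the upload URL in the logs earlier in the thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

Calling `can_connect("upload11.cpdn.org", 80)` from the machine that cannot upload gives a yes/no answer that does not depend on ICMP. If it returns False there but True from another network, the block is probably somewhere between that machine and JASMIN rather than in BOINC itself.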
Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Don't rely on ping & traceroute to diagnose network problems. These packets are often silently dropped by firewalls around large hosts like JASMIN/CEDA. That would explain why it only works for me some of the time. |
©2024 cpdn.org