Connection and Download issues Oct24

Glenn Carver

Joined: 29 Oct 17
Posts: 1048
Credit: 16,390,249
RAC: 15,269
Message 71741 - Posted: 31 Oct 2024, 11:43:25 UTC - in response to Message 71738.  

New thread to clear the announcement thread.

Thanks Dave.

In summary, the 'http internal' error is a consequence of the URL changes that climateprediction.net have been tidying up on the Oxford side - a leftover from the 'rebranding' attempt. Andy is aware of the problem and is going to revert changes back to where they were before the problems started.

The download issues for the latest batch are a separate (maybe related) problem. I've reported it to Andy and await further news. For now I'm going to keep the batch open, but we are now seeing some hard failures (all 3 task attempts fail) due to download issues.
---
CPDN Visiting Scientist
ID: 71741
Glenn Carver
Joined: 29 Oct 17
Posts: 1048
Credit: 16,390,249
RAC: 15,269
Message 71743 - Posted: 31 Oct 2024, 11:48:09 UTC

I've just had an email from Andy saying that he has now pointed 'www.cpdn.org' back to the project server (as it was before last Friday). It will take time for the change to trickle down to hosts. If things are still going wrong tomorrow, we might have another issue.
---
CPDN Visiting Scientist
ID: 71743
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,690,033
RAC: 10,812
Message 71744 - Posted: 31 Oct 2024, 12:04:34 UTC - in response to Message 71743.  
Last modified: 31 Oct 2024, 12:15:48 UTC

Just discovered that if I stop and restart BOINC v7.24.1, I get

31/10/2024 11:53:07 | climateprediction.net | Resetting file projects/climateprediction.net/wah2_8.24_windows_intelx86.exe: permanent HTTP error
31/10/2024 11:53:07 | climateprediction.net | Resetting file projects/climateprediction.net/wah2_data_8.24_windows_intelx86.zip: permanent HTTP error
31/10/2024 11:53:07 | climateprediction.net | Resetting file projects/climateprediction.net/wah2_se_8.24_windows_intelx86.zip: permanent HTTP error
31/10/2024 11:53:07 | climateprediction.net | Resetting file projects/climateprediction.net/wah2am3m2_um_8.24_windows_intelx86.zip: permanent HTTP error
31/10/2024 11:53:07 | climateprediction.net | Resetting file projects/climateprediction.net/wah2rm3m2t_um_8.24_windows_intelx86.zip: permanent HTTP error
and it retries the downloads again! That sounds like a bug, but a helpful one in this case - I don't have to wait an hour, and I don't waste another task on testing.

Poring through a log now...

Edit: The key lines seem to be:

31/10/2024 11:53:08 | climateprediction.net | [http] HTTP_OP::init_get(): https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe
31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Sent header to server: GET /cpdnboinc/applications/wah2_8.24_windows_intelx86.exe HTTP/1.1
31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Sent header to server: Referer: https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe
31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Received header from server: HTTP/1.1 404 Not Found
I'll keep trying.
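
For anyone wanting to pull the request/response pairs out of a log like the one above, here is a minimal Python sketch. It assumes the `[http] [ID#n]` line layout shown in this post; the function name `http_outcomes` and the regular expressions are illustrative and not part of BOINC.

```python
import re

# Patterns based on the [http] lines quoted in this post (an assumption:
# other client versions may word these lines differently).
REQUEST_RE = re.compile(r"\[ID#(\d+)\] Sent header to server: GET (\S+)")
STATUS_RE = re.compile(
    r"\[ID#(\d+)\] Received header from server: HTTP/\d(?:\.\d)? (\d{3})"
)


def http_outcomes(log_lines):
    """Return (request_path, status_code) pairs, matched on the [ID#n] tag."""
    paths = {}      # connection id -> last GET path seen on that connection
    outcomes = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            paths[match.group(1)] = match.group(2)
            continue
        match = STATUS_RE.search(line)
        if match:
            outcomes.append((paths.get(match.group(1), "?"), int(match.group(2))))
    return outcomes


sample = [
    "31/10/2024 11:53:08 | climateprediction.net | [http] HTTP_OP::init_get(): https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe",
    "31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Sent header to server: GET /cpdnboinc/applications/wah2_8.24_windows_intelx86.exe HTTP/1.1",
    "31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Sent header to server: Referer: https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe",
    "31/10/2024 11:53:08 | climateprediction.net | [http] [ID#5] Received header from server: HTTP/1.1 404 Not Found",
]

for path, status in http_outcomes(sample):
    print(status, path)
```

Note that the GET line carries a single slash before the file name even though the logged URL has a double slash, so the double slash on its own does not look like the cause of the 404.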
ID: 71744
Glenn Carver
Joined: 29 Oct 17
Posts: 1048
Credit: 16,390,249
RAC: 15,269
Message 71745 - Posted: 31 Oct 2024, 12:21:54 UTC - in response to Message 71744.  

If I cut'n'paste the failing URL: https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe into a new tab, I get:

This server could not prove that it is www.cpdn.org; its security certificate is from main.cpdn.org. This may be caused by a misconfiguration or an attacker intercepting your connection.

Sigh.. another email to Andy.
---
CPDN Visiting Scientist
ID: 71745
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,690,033
RAC: 10,812
Message 71746 - Posted: 31 Oct 2024, 12:22:22 UTC - in response to Message 71744.  

Now it's changed to:

31/10/2024 12:16:46 | climateprediction.net | [http] [ID#5] Info:  Connected to www.cpdn.org (129.67.193.106) port 443
31/10/2024 12:16:46 | climateprediction.net | [http] [ID#5] Info:  schannel: SNI or certificate check failed: SEC_E_WRONG_PRINCIPAL (0x80090322) - The target principal name is incorrect.
31/10/2024 12:16:46 | climateprediction.net | [http] [ID#5] Info:  schannel: shutting down SSL/TLS connection with www.cpdn.org port 443
31/10/2024 12:16:46 | climateprediction.net | [http] HTTP error: SSL peer certificate or SSH remote key was not OK
31/10/2024 12:16:46 | climateprediction.net | Temporarily failed download of wah2_8.24_windows_intelx86.exe: transient HTTP error
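
SEC_E_WRONG_PRINCIPAL is consistent with the browser message quoted earlier in the thread: the client asked for www.cpdn.org but the certificate names main.cpdn.org. The Python sketch below illustrates the kind of name check involved. It is a simplified stand-in for what a TLS library does, not the schannel implementation, and the list of names on the real certificate is an assumption (the thread only says the certificate "is from main.cpdn.org").

```python
def name_matches(pattern, hostname):
    """Compare one certificate DNS name against a hostname, case-insensitively.

    A leading "*." wildcard is taken to cover exactly one left-most label.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        _, _, rest = hostname.partition(".")
        return rest != "" and rest == pattern[2:]
    return pattern == hostname


def certificate_covers(cert_names, hostname):
    """True if any name on the certificate matches the requested hostname."""
    return any(name_matches(name, hostname) for name in cert_names)


# Hypothetical certificate carrying only the name mentioned in the thread.
cert_names = ["main.cpdn.org"]
print(certificate_covers(cert_names, "www.cpdn.org"))
print(certificate_covers(cert_names, "main.cpdn.org"))
```

Under that assumption the first check fails and the second succeeds, which would explain why a request to www.cpdn.org is rejected while the same server is accepted under its main.cpdn.org name.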
ID: 71746
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,962,600
RAC: 21,639
Message 71747 - Posted: 31 Oct 2024, 12:47:16 UTC
Last modified: 31 Oct 2024, 14:20:36 UTC

If I cut'n'paste the failing URL: https://www.cpdn.org/cpdnboinc/applications//wah2_8.24_windows_intelx86.exe into a new tab, I get:

This server could not prove that it is www.cpdn.org; its security certificate is from main.cpdn.org. This may be caused by a misconfiguration or an attacker intercepting your connection.


Having accepted the risk and continued, I have downloaded the files and placed them in the relevant directory. I will try getting tasks again when the back-off period is complete.
Edit: Aargh! Now I am up against having reached my daily quota! I guess this means all the tasks will be gone by the time I can try again.
ID: 71747
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,690,033
RAC: 10,812
Message 71748 - Posted: 31 Oct 2024, 14:25:21 UTC

Still getting

31/10/2024 14:16:52 | climateprediction.net | [http] [ID#7] Info:  schannel: SNI or certificate check failed: SEC_E_WRONG_PRINCIPAL (0x80090322) - The target principal name is incorrect.
on those non-permanent app downloads. I'll let it keep trying, and then try a fetch once I have the apps.
ID: 71748
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,962,600
RAC: 21,639
Message 71749 - Posted: 31 Oct 2024, 14:26:32 UTC
Last modified: 31 Oct 2024, 14:35:11 UTC

Now getting, "Internet servers may temporarily be down." (8.0.4)
ID: 71749
PDW
Joined: 29 Nov 17
Posts: 82
Credit: 14,292,064
RAC: 92,259
Message 71750 - Posted: 31 Oct 2024, 14:28:41 UTC

On an older client I'm now getting:

31/10/2024 13:49:56 | climateprediction.net | Scheduler request failed: SSL peer certificate or SSH remote key was not OK
ID: 71750
[AF] Kalianthys
Joined: 20 Dec 20
Posts: 13
Credit: 40,040,065
RAC: 10,306
Message 71751 - Posted: 31 Oct 2024, 14:31:56 UTC - in response to Message 71750.  

On an older client I'm now getting:

31/10/2024 13:49:56 | climateprediction.net | Scheduler request failed: SSL peer certificate or SSH remote key was not OK


Same message for me.


Kali.
ID: 71751
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,962,600
RAC: 21,639
Message 71752 - Posted: 31 Oct 2024, 15:00:35 UTC
Last modified: 31 Oct 2024, 15:08:05 UTC

Now have one task running, with the 8.0.4 client and manager running under WINE. The same client version in Win10 in a VM still says, "Internet servers may be temporarily down."
Edit: I don't know whether downloading the executables manually did the trick or whether Andy has it sorted enough that I didn't need to! And I still don't understand why the WINE client should work and the Win10/VM one doesn't!
ID: 71752
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,690,033
RAC: 10,812
Message 71753 - Posted: 31 Oct 2024, 15:09:46 UTC - in response to Message 71752.  

The same client iteration in Win10 in a VM still says, "Internet servers may be temporarily down."
That sounds more like the message you get from the 'reference site' (google.com) when the BOINC client wants to check if a problem is project-specific or global.

That can be kicked out of the way with

<dont_contact_ref_site>1</dont_contact_ref_site>
in cc_config.xml
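
A minimal Python sketch of adding that setting to an existing cc_config.xml is below. It assumes the element sits inside the `<options>` section of `<cc_config>`, which is where I understand this option lives; the helper name `set_option` is made up for illustration, and the script only transforms text, leaving it to you to write the result to the file in your BOINC data directory.

```python
import xml.etree.ElementTree as ET


def set_option(xml_text, name, value):
    """Return cc_config.xml text with <options><name>value</name></options> set.

    Existing sections and options are preserved; missing ones are created.
    """
    root = ET.fromstring(xml_text) if xml_text.strip() else ET.Element("cc_config")
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    node = options.find(name)
    if node is None:
        node = ET.SubElement(options, name)
    node.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Start from an empty config and switch off the reference-site check.
print(set_option("<cc_config></cc_config>", "dont_contact_ref_site", 1))
```

The client will probably need to re-read its config files (or be restarted) before the change takes effect.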
ID: 71753
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,962,600
RAC: 21,639
Message 71754 - Posted: 31 Oct 2024, 15:19:01 UTC

Thanks Richard, I hadn't discovered that little gem. Still doesn't explain the difference between using WINE and a VM when the client and manager versions are the same.

I fooled the daily task limit by increasing the number of cores BOINC could use.
ID: 71754
rob
Joined: 5 Jun 09
Posts: 97
Credit: 3,731,885
RAC: 4,631
Message 71756 - Posted: 31 Oct 2024, 15:59:59 UTC

It would appear that there are two problems:
First, and most obvious to many, is the failure of downloads and uploads.
Second is trickles not being logged correctly. For example, one of my running tasks, https://main.cpdn.org/result.php?resultid=22520635, did a trickle upload at 14:08 today:
31/10/2024 14:50:42 | climateprediction.net | Finished upload of wah2_eas25_20vn_209112_24_1026_012333978_2_r901766137_16.zip (99883726 bytes)

But this trickle isn't showing up on the task's list of trickles - the 14:08 trickle should be the only one.
The only trickles that appear to have been logged properly are for tasks that already had recorded and logged trickles.
ID: 71756
TLD
Joined: 11 Dec 05
Posts: 14
Credit: 2,170,459
RAC: 7,099
Message 71758 - Posted: 31 Oct 2024, 16:56:34 UTC

I don't appear to be able to connect to the project at all today.

17237 climateprediction.net 10/31/2024 9:26:49 AM Sending scheduler request: Requested by user.
17238 climateprediction.net 10/31/2024 9:26:49 AM Requesting new tasks for CPU
17239 10/31/2024 9:26:50 AM Project communication failed: attempting access to reference site
17240 climateprediction.net 10/31/2024 9:26:50 AM Scheduler request to https://www.cpdn.org/cpdnboinc_cgi/cgi failed: SSL peer certificate or SSH remote key was not OK
17241 10/31/2024 9:26:51 AM Internet access OK - project servers may be temporarily down.
ID: 71758
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,962,600
RAC: 21,639
Message 71759 - Posted: 31 Oct 2024, 17:15:32 UTC - in response to Message 71758.  

Getting that with my VM installation. But using Wine and the same client, 8.0.4, things now seem to be working.
ID: 71759
TLD
Joined: 11 Dec 05
Posts: 14
Credit: 2,170,459
RAC: 7,099
Message 71760 - Posted: 31 Oct 2024, 17:23:05 UTC - in response to Message 71759.  

I've got Windows 11 with BOINC 8.0.2 on bare metal. Should I upgrade to BOINC 8.0.4?
ID: 71760
Glenn Carver
Joined: 29 Oct 17
Posts: 1048
Credit: 16,390,249
RAC: 15,269
Message 71761 - Posted: 31 Oct 2024, 17:32:39 UTC - in response to Message 71760.  
Last modified: 31 Oct 2024, 17:32:55 UTC

I've asked Andy about this and he says this will settle down once the DNS changes made today to fix the problem take effect and trickle down to hosts.
---
CPDN Visiting Scientist
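
To see whether a DNS change has reached a particular machine, one option is to ask the local resolver what it currently returns for each host name. The Python sketch below does that. The helper names are illustrative, and which address counts as the "right" one is not stated in this thread; the only address logged here is 129.67.193.106 for www.cpdn.org at 12:16.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted IPv4 addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(
        host, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def dns_report(hosts, resolver=resolve_ipv4):
    """Map each host to its resolved addresses, or to an error string."""
    report = {}
    for host in hosts:
        try:
            report[host] = resolver(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            report[host] = f"lookup failed: {exc}"
    return report
```

Calling `dns_report(["www.cpdn.org", "main.cpdn.org"])` on an affected host and again some hours later should show whether the answer for www.cpdn.org has changed. The `resolver` argument is only there so the function can be exercised without network access.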
ID: 71761
TLD
Joined: 11 Dec 05
Posts: 14
Credit: 2,170,459
RAC: 7,099
Message 71762 - Posted: 31 Oct 2024, 17:43:38 UTC - in response to Message 71761.  

I've upgraded 1 machine to BOINC 8.0.4 and it does connect, but:

57 climateprediction.net 10/31/2024 10:30:46 AM Resetting file projects/climateprediction.net/wah2_8.24_windows_intelx86.exe: permanent HTTP error
58 climateprediction.net 10/31/2024 10:30:46 AM Resetting file projects/climateprediction.net/wah2_data_8.24_windows_intelx86.zip: permanent HTTP error
59 climateprediction.net 10/31/2024 10:30:46 AM Resetting file projects/climateprediction.net/wah2_se_8.24_windows_intelx86.zip: permanent HTTP error
60 climateprediction.net 10/31/2024 10:30:46 AM Resetting file projects/climateprediction.net/wah2am3m2_um_8.24_windows_intelx86.zip: permanent HTTP error
61 climateprediction.net 10/31/2024 10:30:46 AM Resetting file projects/climateprediction.net/wah2rm3m2t_um_8.24_windows_intelx86.zip: permanent HTTP error
62 10/31/2024 10:30:46 AM Running CPU benchmarks
63 10/31/2024 10:30:46 AM Suspending computation - CPU benchmarks in progress
64 10/31/2024 10:31:17 AM Benchmark results:
65 10/31/2024 10:31:17 AM Number of CPUs: 16
66 10/31/2024 10:31:17 AM 5703 floating point MIPS (Whetstone) per CPU
67 10/31/2024 10:31:17 AM 21398 integer MIPS (Dhrystone) per CPU
68 World Community Grid 10/31/2024 10:31:18 AM Started upload of MCM1_0227343_0182_0_r1176294313_0
69 climateprediction.net 10/31/2024 10:31:18 AM Started download of wah2_8.24_windows_intelx86.exe
70 climateprediction.net 10/31/2024 10:31:18 AM Started download of wah2_data_8.24_windows_intelx86.zip
71 World Community Grid 10/31/2024 10:31:19 AM Finished upload of MCM1_0227343_0182_0_r1176294313_0 (895 bytes)
72 climateprediction.net 10/31/2024 10:31:19 AM Temporarily failed download of wah2_8.24_windows_intelx86.exe: transient HTTP error
73 climateprediction.net 10/31/2024 10:31:19 AM Backing off 00:02:25 on download of wah2_8.24_windows_intelx86.exe
74 climateprediction.net 10/31/2024 10:31:19 AM Temporarily failed download of wah2_data_8.24_windows_intelx86.zip: transient HTTP error
75 climateprediction.net 10/31/2024 10:31:19 AM Backing off 00:02:54 on download of wah2_data_8.24_windows_intelx86.zip
76 climateprediction.net 10/31/2024 10:31:19 AM Started download of wah2_se_8.24_windows_intelx86.zip
77 climateprediction.net 10/31/2024 10:31:19 AM Started download of wah2am3m2_um_8.24_windows_intelx86.zip
78 climateprediction.net 10/31/2024 10:31:20 AM Temporarily failed download of wah2_se_8.24_windows_intelx86.zip: transient HTTP error
79 climateprediction.net 10/31/2024 10:31:20 AM Backing off 00:03:17 on download of wah2_se_8.24_windows_intelx86.zip
80 climateprediction.net 10/31/2024 10:31:20 AM Temporarily failed download of wah2am3m2_um_8.24_windows_intelx86.zip: transient HTTP error
81 climateprediction.net 10/31/2024 10:31:20 AM Backing off 00:02:49 on download of wah2am3m2_um_8.24_windows_intelx86.zip
82 climateprediction.net 10/31/2024 10:31:20 AM Started download of wah2rm3m2t_um_8.24_windows_intelx86.zip
83 10/31/2024 10:31:21 AM Project communication failed: attempting access to reference site
84 climateprediction.net 10/31/2024 10:31:21 AM Temporarily failed download of wah2rm3m2t_um_8.24_windows_intelx86.zip: transient HTTP error
85 climateprediction.net 10/31/2024 10:31:21 AM Backing off 00:03:27 on download of wah2rm3m2t_um_8.24_windows_intelx86.zip
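
With this many retries in a log, it can help to summarise the back-off interval per file. A minimal Python sketch follows, assuming the "Backing off HH:MM:SS on download of <file>" wording shown above; the function name `backoffs` is illustrative.

```python
import re
from datetime import timedelta

# Wording taken from the log lines in this post (an assumption for other versions).
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def backoffs(log_lines):
    """Map each file name to the most recent back-off interval logged for it."""
    result = {}
    for line in log_lines:
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
            result[match.group(4)] = timedelta(
                hours=hours, minutes=minutes, seconds=seconds
            )
    return result


sample = [
    "73 climateprediction.net 10/31/2024 10:31:19 AM Backing off 00:02:25 on download of wah2_8.24_windows_intelx86.exe",
    "75 climateprediction.net 10/31/2024 10:31:19 AM Backing off 00:02:54 on download of wah2_data_8.24_windows_intelx86.zip",
    "85 climateprediction.net 10/31/2024 10:31:21 AM Backing off 00:03:27 on download of wah2rm3m2t_um_8.24_windows_intelx86.zip",
]

waits = backoffs(sample)
for name, wait in sorted(waits.items(), key=lambda item: item[1]):
    print(wait, name)
```

The longest interval in the sample gives a rough idea of when all of the listed downloads will have been retried at least once.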
ID: 71762
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,690,033
RAC: 10,812
Message 71763 - Posted: 31 Oct 2024, 18:47:22 UTC - in response to Message 71756.  

It would appear that there are two problems:
First, and most obvious to many, is the failure of downloads and uploads.
Second is trickles not being logged correctly. For example, one of my running tasks, https://main.cpdn.org/result.php?resultid=22520635, did a trickle upload at 14:08 today:
31/10/2024 14:50:42 | climateprediction.net | Finished upload of wah2_eas25_20vn_209112_24_1026_012333978_2_r901766137_16.zip (99883726 bytes)
But this trickle isn't showing up on the task's list of trickles - the 14:08 trickle should be the only one.
The only trickles that appear to have been logged properly are for tasks that already had recorded and logged trickles.
You're right, but you're looking at the wrong part of the log. The trickle error is:

31/10/2024 18:31:53 | climateprediction.net | [sched_op] Starting scheduler request
31/10/2024 18:31:54 | climateprediction.net | Sending scheduler request: To send trickle-up message.
31/10/2024 18:31:54 | climateprediction.net | Not requesting tasks: don't need (CPU: max concurrent job limit; NVIDIA GPU: no applications; Intel GPU: no applications)
31/10/2024 18:31:54 | climateprediction.net | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
31/10/2024 18:31:54 | climateprediction.net | [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices
31/10/2024 18:31:54 | climateprediction.net | [sched_op] Intel GPU work request: 0.00 seconds; 0.00 devices
31/10/2024 18:31:55 | climateprediction.net | Scheduler request to https://www.cpdn.org/cpdnboinc_cgi/cgi failed: SSL peer certificate or SSH remote key was not OK
31/10/2024 18:31:55 | climateprediction.net | [sched_op] Deferring communication for 01:10:58
31/10/2024 18:31:55 | climateprediction.net | [sched_op] Reason: Scheduler request failed
The uploads are going through OK:

31/10/2024 17:07:01 | climateprediction.net | Started upload of wah2_nz25_11y8_209705_25_1028_012346455_0_r970510034_3.zip
31/10/2024 17:08:23 | climateprediction.net | Finished upload of wah2_nz25_11y8_209705_25_1028_012346455_0_r970510034_3.zip (90515142 bytes)
The timings aren't comparable, because the trickles are given the scheduler backoff of around an hour, and then retried.

Those were both from task 22523488: one of the new batch, picked up just before midnight last night. It's showing a reported trickle at 10:52 UTC today - there should have been another one around tea-time, but it got stuck.
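
To work out when a deferred trickle will next be attempted, add the deferral from the "[sched_op] Deferring communication for ..." line to that line's timestamp. A small Python sketch, assuming the DD/MM/YYYY timestamp layout used in the log above (the function name `next_contact` is illustrative):

```python
from datetime import datetime, timedelta


def next_contact(logged_at, deferral):
    """Add an HH:MM:SS deferral to a 'DD/MM/YYYY HH:MM:SS' log timestamp."""
    start = datetime.strptime(logged_at, "%d/%m/%Y %H:%M:%S")
    hours, minutes, seconds = (int(part) for part in deferral.split(":"))
    return start + timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Values from the "Deferring communication for 01:10:58" line at 18:31:55.
print(next_contact("31/10/2024 18:31:55", "01:10:58"))  # 2024-10-31 19:42:53
```

The result is in the same clock as the log, so it should be converted before comparing it with the UTC times shown on the task page.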
ID: 71763

©2024 cpdn.org