Thread 'Unable to connect to server to get tasks'

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71335 - Posted: 21 Aug 2024, 9:17:10 UTC
Last modified: 21 Aug 2024, 11:33:39 UTC

Skillz has seen a jump in RAC from 0 to 1,824.72. I presume this means something has worked.

Edit: And has increased further since I spotted it. Confirms I was not imagining it!
ID: 71335
ChelseaOilman
Joined: 24 Dec 19
Posts: 32
Credit: 40,970,218
RAC: 78,855
Message 71338 - Posted: 21 Aug 2024, 15:46:36 UTC

I'm not sure if that's actually Skillz's RAC from his own computers. He had a team member try his weak key to see if it worked, and it did. That other person using the weak key may still be running the tasks. I'm pretty sure Skillz still can't connect to the server. I'll see if I can find out.
ID: 71338
Skillz
Joined: 4 Jun 17
Posts: 15
Credit: 2,654,269
RAC: 1,635
Message 71339 - Posted: 21 Aug 2024, 16:00:20 UTC

Someone else on my team attached to the project using my weak key.

I can only attach the project with boinccmd, like I said already. BOINC manager just reports failure.

I've also tried tethering my computer to my phone to connect, with the same outcome.
ID: 71339
AndreyOR
Joined: 12 Apr 21
Posts: 317
Credit: 14,816,935
RAC: 19,934
Message 71341 - Posted: 21 Aug 2024, 20:33:39 UTC - in response to Message 71339.  

> Someone else on my team attached to the project using my weak key.

I'm assuming that means you tried attaching the project using the key yourself but it didn't work. What were the errors doing it that way?

> I can only attach the project with boinccmd, like I said already. BOINC manager just reports failure.

I thought you said that command line didn't work either? Otherwise it'd seem like you should be able to run the project.
ID: 71341
Skillz
Joined: 4 Jun 17
Posts: 15
Credit: 2,654,269
RAC: 1,635
Message 71343 - Posted: 22 Aug 2024, 0:06:44 UTC - in response to Message 71341.  

The error is just "failed to connect" when using BOINC Manager or BOINCTasks, so the project is never shown in the "project list" on either account.

If I use boinccmd to attach the project, it will show in the "project list" on the host, but it won't get any work. The errors when asking the server for work are the ones I posted above.
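
For anyone wanting to reproduce the command-line route described here, a minimal sketch in Python follows. It only wraps `boinccmd --project_attach URL auth` and `boinccmd --project URL update`; the project URL and account key shown in the usage note are placeholders, and the helper names are hypothetical, not part of BOINC.

```python
import subprocess


def attach_command(project_url: str, account_key: str) -> list[str]:
    """Build the boinccmd call that attaches a project with an account key
    (a weak key is passed in the same position)."""
    return ["boinccmd", "--project_attach", project_url, account_key]


def update_command(project_url: str) -> list[str]:
    """Build the boinccmd call that asks the client to contact the project,
    which is the step where the scheduler request is attempted."""
    return ["boinccmd", "--project", project_url, "update"]


def run(cmd: list[str]) -> tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.returncode, done.stdout + done.stderr
```

Usage would be along the lines of `run(attach_command("https://project.invalid/", "YOUR_KEY"))` followed by `run(update_command("https://project.invalid/"))`, then reading the client's event log for the scheduler request result.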
ID: 71343
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71344 - Posted: 22 Aug 2024, 9:18:10 UTC

I remember a year or two back, George had problems attaching a Linux box to the project but at some point with nothing changed, it worked. I couldn't find any posts about it though so it may have just been discussed on the moderators' list.
ID: 71344
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 71345 - Posted: 22 Aug 2024, 10:23:58 UTC - in response to Message 71344.  

We've also seen it and discussed it on this board before. Try:

Sign Up broken (last year)
'Attach to project' process for new users broken. (2021 - but some messages seem to be missing)
ID: 71345
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71346 - Posted: 22 Aug 2024, 10:49:58 UTC - in response to Message 71345.  
Last modified: 22 Aug 2024, 10:50:34 UTC

> We've also seen it and discussed it on this board before. Try:
>
> Sign Up broken (last year)
> 'Attach to project' process for new users broken. (2021 - but some messages seem to be missing)

Just trawled through the two threads you linked to, Richard, without finding anything that sheds light on what is causing the current issue.

I am really struggling to understand what could be different about Skillz' computers that is causing this.
ID: 71346
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 71347 - Posted: 22 Aug 2024, 10:58:30 UTC - in response to Message 71346.  

Sure. But they perhaps explain why I was wanting to consider the possibility of a server-side explanation. When you've eliminated the impossible, anything else ...
ID: 71347
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71348 - Posted: 22 Aug 2024, 13:55:32 UTC - in response to Message 71347.  

> Sure. But they perhaps explain why I was wanting to consider the possibility of a server-side explanation. When you've eliminated the impossible, anything else ...

I get that, but assuming it is a server issue, there is still the question of why some computers and not others. I assume it is not some hidden code that picks a computer at random and decides it doesn't like it! I think I need to follow the example of the red queen and believe six impossible things before breakfast.
ID: 71348
alanb1951
Joined: 31 Aug 04
Posts: 37
Credit: 9,581,380
RAC: 3,853
Message 71349 - Posted: 22 Aug 2024, 19:33:29 UTC

Just a thought from some issues over at MilkyWay a while back... Could it be that Skillz has an out-of-date or damaged certificates file on the offending system?

If someone else has already suggested this and I missed it, my apologies; it may be a red herring anyway :-)

Cheers - Al.
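
One way to test the stale-certificates idea is to list any expired roots in a CA bundle file. The sketch below uses only Python's standard `ssl` module. The assumption is that the BOINC client reads a PEM bundle such as `ca-bundle.crt` from its program directory; the path must be supplied by whoever runs it, and the function names are hypothetical.

```python
import ssl
import time
from typing import Optional


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if a certificate 'notAfter' string (for example
    'Jan  5 09:34:43 2018 GMT') lies before `now` (seconds since the epoch)."""
    expiry = ssl.cert_time_to_seconds(not_after)
    return expiry < (time.time() if now is None else now)


def expired_roots(ca_bundle_path: str) -> list[str]:
    """Load a PEM CA bundle and return the common names of expired CA certs."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.load_verify_locations(cafile=ca_bundle_path)
    expired = []
    for cert in ctx.get_ca_certs():
        if is_expired(cert["notAfter"]):
            # 'subject' is a tuple of RDNs, each holding (key, value) pairs.
            subject = dict(rdn[0] for rdn in cert["subject"])
            expired.append(subject.get("commonName", "<no common name>"))
    return expired
```

A few expired roots are normal in any bundle, so a non-empty result is only a hint. A file that fails to load at all, or that is years older than the one on a working host, would be more telling.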
ID: 71349
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 71350 - Posted: 22 Aug 2024, 20:02:18 UTC - in response to Message 71349.  
Last modified: 22 Aug 2024, 20:02:31 UTC

Yes, I wondered that and suggested looking at Windows credentials some messages back. Though I would think the route via boinccmd would use the same.

Since it's the GUI side of the boinc client that's not working, maybe there's an issue with the client config?
ID: 71350
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 71351 - Posted: 22 Aug 2024, 21:09:18 UTC - in response to Message 71350.  
Last modified: 22 Aug 2024, 21:18:50 UTC

The problem isn't with the initial contact ('get project configuration'): it's with the subsequent fetch of the scheduler list - message 71343. That's done by the client, whether triggered by the Manager (GUI), or by the command line (boinccmd).

Edit: I saw three examples (out of 50 tasks) of that during my look at recent attachments. All were ARM processors: two were Linux (1552459 and 1552468), and one Android (1552499). A host record had been created, but no subsequent scheduler contact had taken place.
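
To illustrate the second step, here is a rough sketch of pulling scheduler URLs out of a project's master page. It assumes the page advertises them in `<scheduler>...</scheduler>` tags (often inside an HTML comment) or in a `<link rel="boinc_scheduler" href="...">` element. That matches my recollection of the convention rather than anything confirmed in this thread, and the function name is hypothetical.

```python
import re

# Both patterns are assumptions about how a master page lists its schedulers.
SCHEDULER_TAG = re.compile(
    r"<scheduler>\s*(.*?)\s*</scheduler>", re.IGNORECASE | re.DOTALL
)
SCHEDULER_LINK = re.compile(
    r"<link\s+rel=[\"']?boinc_scheduler[\"']?\s+href=[\"']?([^\"'\s>]+)",
    re.IGNORECASE,
)


def find_scheduler_urls(master_page_html: str) -> list[str]:
    """Return the scheduler URLs found in the page, in order, without repeats."""
    found = SCHEDULER_TAG.findall(master_page_html)
    found += SCHEDULER_LINK.findall(master_page_html)
    unique = []
    for url in found:
        if url not in unique:
            unique.append(url)
    return unique
```

If the master page can be downloaded on an affected host but the URL it names cannot be reached, that would narrow the fault to the scheduler connection rather than the initial contact.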
ID: 71351
Skillz
Joined: 4 Jun 17
Posts: 15
Credit: 2,654,269
RAC: 1,635
Message 71352 - Posted: 22 Aug 2024, 21:56:57 UTC - in response to Message 71350.  

> Yes, I wondered that and suggested looking at Windows credentials some messages back. Though I would think the route via boinccmd would use the same.
>
> Since it's the GUI side of the boinc client that's not working, maybe there's an issue with the client config?


I think the difference is with how the GUI attaches hosts.

Using the GUI (BOINC Manager or BOINCTasks) you use your email and password so it has to communicate with the project servers to ensure those credentials are correct. Since it can't connect it just reports failed to communicate and nothing happens.

When using boinccmd and your weak key (or account key), no communication is needed to create the host's account file. It's generated for you from the account key (weak or otherwise), so the project shows up in your "project list" without any project communication.

Both results end up with the host not able to communicate with the project and thus unable to retrieve any work. Why it's not communicating is the issue and I don't know why.

As I've mentioned before, I've tried multiple Windows hosts (three of them, to be exact, and all three run other projects just fine).
I've even tried a different network:
1. My home network with my home internet connection.
2. My cell phone's connection through the carrier's cell towers, tethered via Wi-Fi to one of the hosts, with the host completely disconnected from the home network.
ID: 71352
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 71353 - Posted: 24 Aug 2024, 13:58:50 UTC - in response to Message 71352.  

OK. I have access to the CPDN databases and logs. If someone who knows where (and what) to look at can advise, I can do more investigating. I'm assuming there must be some error record on the CPDN side somewhere. But I'm not familiar with the BOINC server side, so I'd appreciate some pointers.
---
CPDN Visiting Scientist
ID: 71353
Skillz
Joined: 4 Jun 17
Posts: 15
Credit: 2,654,269
RAC: 1,635
Message 71361 - Posted: 27 Aug 2024, 0:50:52 UTC

I never replied because I don't know, but I figured I should say so to let you, or whoever else is looking, know. I've never handled the server side of BOINC, so I don't know much about it.
ID: 71361
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71362 - Posted: 27 Aug 2024, 7:13:20 UTC

Might be worth asking over on the BOINC fora. At the least they might be able to point you in the direction of which other projects have input on their boards from those with such knowledge.
ID: 71362
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 71363 - Posted: 27 Aug 2024, 7:26:47 UTC

Given that the error message we noted applied to establishing an SSL / HTTPS connection on port 443, any clues are likely to be found in the Apache logs, rather than BOINC. I don't think any part of BOINC has been reached yet.
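
To see which stage fails from an affected host, independent of BOINC, something like the following sketch could be run there. It separates DNS lookup, the TCP connection to port 443, and the TLS handshake, using only the Python standard library. Note that it verifies against Python's default trust store, not BOINC's own certificate bundle, so a pass here does not fully clear the client. The function name is hypothetical.

```python
import socket
import ssl


def probe(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Report the first stage (dns, tcp, tls) that fails, or the TLS version."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns failed: {exc}"

    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"tcp failed: {exc}"

    with raw:
        ctx = ssl.create_default_context()
        try:
            # The handshake, including certificate verification, happens here.
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return f"tls ok: {tls.version()}"
        except (ssl.SSLError, OSError) as exc:
            return f"tls failed: {exc}"
```

Calling `probe(...)` with the host name taken from the scheduler URL in the client's error message, and comparing the result with the same call on a host that works, would show whether the handshake itself is being refused.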
ID: 71363
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4537
Credit: 19,001,532
RAC: 21,726
Message 71364 - Posted: 27 Aug 2024, 7:46:13 UTC - in response to Message 71363.  

I wonder how long Apache logs are kept for. With the number of machines contacting CPDN, it must generate a lot of data. If the data from the original attempts has been overwritten or deleted, we might need to get Skillz to try again and give you the time to look for.
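
Once a fresh attempt time is known, narrowing a large access log to a few minutes around it is straightforward. The sketch below assumes the standard Apache access log timestamp, `[22/Aug/2024:00:06:44 +0000]`. A failed TLS handshake may only show up in the error log, which uses a different timestamp layout, so this covers the access log only. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, Iterator

# Timestamp as written in Apache common/combined access logs.
STAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def lines_near(
    lines: Iterable[str],
    when_utc: datetime,
    window: timedelta = timedelta(minutes=5),
) -> Iterator[str]:
    """Yield log lines stamped within `window` of `when_utc`.

    `when_utc` must be timezone-aware; lines without a timestamp are skipped.
    """
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if abs(stamp - when_utc) <= window:
            yield line
```

It works on any iterable of lines, so an open log file can be passed directly without reading it all into memory.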
ID: 71364
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 71365 - Posted: 27 Aug 2024, 8:07:12 UTC - in response to Message 71364.  

And Skillz says he's in the US. It would be helpful if he could say which time zone he's in, to aid translation from BOINC (local) time to UTC.
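
The translation itself is simple once the offset is known. A small sketch, assuming the event log writes local time in a form like `21-Aug-2024 11:00:20` (the format string is a parameter in case it differs) and that the UTC offset in force on that date is supplied by hand:

```python
from datetime import datetime, timedelta, timezone


def boinc_local_to_utc(
    stamp: str,
    utc_offset_hours: float,
    fmt: str = "%d-%b-%Y %H:%M:%S",
) -> datetime:
    """Convert a local-time log stamp to UTC, given the local UTC offset.

    The offset must already account for daylight saving on that date,
    for example -5 for US Central time in August.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(stamp, fmt).replace(tzinfo=local_zone)
    return local.astimezone(timezone.utc)
```

With an offset of -5, a local stamp of 11:00:20 comes out as 16:00:20 UTC, which is the form needed to match against server logs.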
ID: 71365

©2024 cpdn.org