Thread 'Site problems'

Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64570 - Posted: 3 Oct 2021, 6:50:08 UTC - in response to Message 64568.  

Looks like file names may be slightly different in Red Hat. For the record, the ca-bundle.crt on Ubuntu is by default in /var/lib/boinc-client/


For what it's worth, on Ubuntu that entry in boinc-client is a link to /etc/ssl/certs/ca-certificates.crt, which gets updated as necessary - I would expect most Linux distributions that have a properly curated BOINC client package to do something similar, but I could be wrong :-)

And I note that we still have the (internal to CPDN) PHP warnings at the top of every page, so it looks like Oxford have some sorting out to do anyway...

Cheers - Al.

[Edit...] P.S. The certificate bundle was most recently updated on 28th September, 2021, round about when this root certificate expired.


This is very interesting. My RHEL 8.4 release of Linux seems somewhat different from the one on Ubuntu. They agree on where the BOINC client stuff goes (/var/lib/boinc, except Ubuntu calls it boinc-client). In RHEL 8.4, the actual programs are in /usr/bin:

[/usr/bin]$ ls -l boinc*
-rwxr-xr-x. 2 root root 1112736 Nov 11  2020 boinc
-rwxr-xr-x. 2 root root 1112736 Nov 11  2020 boinc_client
-rwxr-xr-x. 1 root root  359344 Nov 11  2020 boinccmd
-rwxr-xr-x. 1 root root 4078376 Nov 11  2020 boincmgr
-rwxr-xr-x. 1 root root  351128 Nov 11  2020 boincscr


Note that for some reason, both boinc and boinc_client are the same:
[/usr/bin]$ cmp boinc boinc_client
[/usr/bin]$
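
A side note on the listing above: both boinc and boinc_client show a link count of 2 in the `ls -l` output, which is what hard links look like, so the two names may well be one file on disk rather than two copies. Here is a minimal Python sketch to check that, assuming a POSIX filesystem. `same_file_info` is a hypothetical helper name, and the demo runs on throwaway temporary files rather than the real /usr/bin entries.

```python
import os
import tempfile


def same_file_info(path_a, path_b):
    """Return (is_same_file, link_count) for two paths.

    os.path.samefile compares device and inode numbers, so it is True
    for hard links (and also for a symlink and its target).
    """
    return os.path.samefile(path_a, path_b), os.stat(path_a).st_nlink


if __name__ == "__main__":
    # Demo on temporary files; substitute /usr/bin/boinc and
    # /usr/bin/boinc_client to check a real installation.
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "boinc")
        second = os.path.join(tmp, "boinc_client")
        with open(first, "wb") as handle:
            handle.write(b"demo")
        os.link(first, second)  # create a second name for the same inode
        print(same_file_info(first, second))
```

If the first value is True for the real files, `cmp` finding no difference is exactly what one would expect.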

The following are the main differences between RHEL8.4 linux and alanb1951's version of Ubuntu. I do not think these differences are of much significance.

On RHEL8.4, $ locate ca-bundle.crt
(appears only in)
/etc/pki/tls/certs/ca-bundle.crt
[/usr/bin]$

There is no link in /var/lib/boinc (boinc-client) to /etc/ssl/certs/ca-certificates.crt. And the file itself is in a slightly different directory:
/etc/pki/tls/certs instead of /etc/ssl/certs, although there is a link from there to the directory where the file actually is:

[/etc/ssl]$ ls -l
total 0
lrwxrwxrwx. 1 root root 16 Jun 16 16:28 certs -> ../pki/tls/certs

The fact that there is no link in /var/lib/boinc to the certificate might be important, but if so, why is there no problem with the main CPDN page and no problem with WCG, Rosetta, and Universe?
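
To compare the two layouts directly, a small Python sketch can report whether a BOINC data directory holds its own ca-bundle.crt, a link to the system bundle, or nothing at all. The directory names are the ones mentioned in this thread; `bundle_status` is a hypothetical helper, not part of BOINC.

```python
import os


def bundle_status(data_dir, name="ca-bundle.crt"):
    """Describe the certificate bundle entry inside a BOINC data directory."""
    path = os.path.join(data_dir, name)
    if not os.path.lexists(path):
        return {"path": path, "kind": "missing", "target": None}
    if os.path.islink(path):
        # realpath follows the whole chain of links to the final file.
        return {"path": path, "kind": "symlink", "target": os.path.realpath(path)}
    return {"path": path, "kind": "regular file", "target": None}


if __name__ == "__main__":
    # /var/lib/boinc-client is the Ubuntu location, /var/lib/boinc the RHEL one.
    for data_dir in ("/var/lib/boinc-client", "/var/lib/boinc"):
        print(bundle_status(data_dir))
```

A "missing" result only says there is no such entry in that directory. It does not say which bundle the client actually reads.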

Also, another difference is where you say, "And I note that we still have the (internal to CPDN) PHP warnings at the top of every page, so it looks like Oxford have some sorting out to do anyway..."

I do not get any warnings on the top CPDN page: https://www.climateprediction.net/, though I do on all other pages.

"P.S. The certificate bundle was most recently updated on 28th September, 2021, round about when this root certificate expired."
Mine is a little later, but I do not accept updates every day.

Sep 30 11:13 /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
ID: 64570 · Report as offensive     Reply Quote
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64572 - Posted: 3 Oct 2021, 7:36:40 UTC - in response to Message 64570.  

Problems at Oxford uni are most likely because they have been having some re-cabling done, and probably other work as well, and things haven't gone well.
ID: 64572 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 64574 - Posted: 3 Oct 2021, 8:45:17 UTC - in response to Message 64570.  

... why is there no problem with the main CPDN page ...
Because your browser will be handling the security for that page, not the BOINC client.
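
The point that each program does its own certificate checking can be seen from Python too, since it brings its own defaults. This is a minimal sketch using the standard ssl module. The paths it prints depend on how the local Python and OpenSSL were built, and say nothing about what a browser or the BOINC client uses.

```python
import ssl


def default_trust_paths():
    """Return the CA file and directory this Python's OpenSSL uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                   # None if the default file does not exist
        "capath": paths.capath,                   # None if the default directory does not exist
        "compiled_cafile": paths.openssl_cafile,  # location compiled into OpenSSL
        "compiled_capath": paths.openssl_capath,
    }


if __name__ == "__main__":
    for key, value in default_trust_paths().items():
        print(f"{key}: {value}")
```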
ID: 64574 · Report as offensive     Reply Quote
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,002,360
RAC: 21,497
Message 64575 - Posted: 3 Oct 2021, 9:00:46 UTC

For what it's worth, on Ubuntu that entry in boinc-client is a link to /etc/ssl/certs/ca-certificates.crt, which gets updated as necessary - I would expect most Linux distributions that have a properly curated BOINC client package to do something similar, but I could be wrong :-)


Should have noticed that! So if I create a similar link in my WINE installation, that should also update automagically?
ID: 64575 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 64576 - Posted: 3 Oct 2021, 9:11:38 UTC - in response to Message 64575.  

For what it's worth, on Ubuntu that entry in boinc-client is a link to /etc/ssl/certs/ca-certificates.crt, which gets updated as necessary - I would expect most Linux distributions that have a properly curated BOINC client package to do something similar, but I could be wrong :-)
Should have noticed that! So if I create a similar link in my WINE installation, that should also update automagically?
My Linux Mint machine is still using a certificate bundle dated 6 February 2021 - and it works just fine.

There's another reason why Windows machines are particularly prone to this. The BOINC client for Windows is still using outdated, buggy, security software (*). When it encounters the expired certificate, it should carry on checking the rest of the bundle. And if it did that, it would find a still-valid certificate further down the pile. But instead, it barfs on the expired certificate and gives up at that point.

(*) OpenSSL v1.0.2s
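
The difference between "give up at the first expired certificate" and "carry on checking" can be shown with a toy model in Python. This only illustrates the behaviour as described in this post. It is not OpenSSL's actual chain-building code, and the candidate names and dates are invented.

```python
from datetime import date


def first_match_only(candidates, today):
    """Toy 'give up' strategy: only the first candidate is ever considered."""
    if not candidates:
        return None
    name, expires = candidates[0]
    return name if expires >= today else None


def try_all(candidates, today):
    """Toy 'keep looking' strategy: return the first candidate still in date."""
    for name, expires in candidates:
        if expires >= today:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical candidates: (label, expiry date), invented for illustration.
    candidates = [
        ("old root (hypothetical)", date(2021, 9, 30)),
        ("newer root (hypothetical)", date(2035, 1, 1)),
    ]
    today = date(2021, 10, 3)
    print(first_match_only(candidates, today))
    print(try_all(candidates, today))
```

With the same list of candidates, the first strategy fails while the second finds a usable entry, which matches the way only some clients were affected.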
ID: 64576 · Report as offensive     Reply Quote
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64580 - Posted: 3 Oct 2021, 14:08:06 UTC - in response to Message 64576.  

(*) OpenSSL v1.0.2s


I have openssl-1.1.1g-15.el8_3.x86_64 on my RHEL 8.4 Linux machine.
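
For anyone wanting a quick cross-check without the package manager, Python can report the OpenSSL library it was linked against. This is a minimal sketch. It reports the library Python itself uses, which is not necessarily the one a BOINC client was built with, and `openssl_series` is a hypothetical helper name.

```python
import ssl


def openssl_series():
    """Return (major, minor) of the OpenSSL library this Python is linked against."""
    major, minor = ssl.OPENSSL_VERSION_INFO[:2]
    return major, minor


if __name__ == "__main__":
    print(ssl.OPENSSL_VERSION)  # full version string
    print(openssl_series())
```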
ID: 64580 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 64582 - Posted: 3 Oct 2021, 14:52:49 UTC - in response to Message 64580.  

Exactly. The Windows client ships with v1.0, which is known to be well behind the times and faulty.
ID: 64582 · Report as offensive     Reply Quote
Eirik Redd

Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 64583 - Posted: 4 Oct 2021, 6:22:19 UTC

Site been Pownd?
Sorry to think so, maybe?
Can't attach to project.
All on this thread speculate ssl problems.
Tried those fixes
No help.
WTF?
Also -- checking local wu records vs website records -- big mismatch for wu's last few months
Or have I just lost my mind?
HELP!
ID: 64583 · Report as offensive     Reply Quote
Eirik Redd

Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 64584 - Posted: 4 Oct 2021, 6:30:10 UTC - in response to Message 64583.  

Site been Pownd?
Sorry to think so, maybe?
Can't attach to project.
All on this thread speculate ssl problems.
Tried those fixes
No help.
WTF?
Also -- checking local wu records vs website records -- big mismatch for wu's last few months
Or have I just lost my mind?
HELP!


Or maybe it's just the website blundering, changing what I thought were the work-unit stats that I've been recording all year into different stats now?

Yo no se.

Shirimasen
ID: 64584 · Report as offensive     Reply Quote
KAMasud

Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 64585 - Posted: 4 Oct 2021, 11:49:12 UTC

We are stuck until Berkeley wakes up? I do not believe in this statement. How has World Community Grid worked around this problem?
ID: 64585 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 64586 - Posted: 4 Oct 2021, 13:25:21 UTC - in response to Message 64585.  

WCG - still supported by IBM for the time being - will be using a different security certificate. There are something like 130 different certificates in the bundle used by BOINC, and only one of them has expired so far.
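
The size of a bundle is easy to check. This is a minimal Python sketch that counts the PEM certificate blocks in a bundle file, assuming the bundle is plain PEM text. It only counts entries. Finding which one has expired would need real certificate parsing, which this sketch does not attempt. The path in the demo is the Ubuntu one mentioned earlier in the thread.

```python
import os
import re

# One PEM certificate block, from its BEGIN marker to its END marker.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def count_certificates(pem_text):
    """Count PEM certificate blocks in a string."""
    return len(PEM_BLOCK.findall(pem_text))


def count_in_file(path):
    """Count PEM certificate blocks in a bundle file."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return count_certificates(handle.read())


if __name__ == "__main__":
    bundle = "/etc/ssl/certs/ca-certificates.crt"
    if os.path.exists(bundle):
        print(bundle, count_in_file(bundle))
    else:
        print("no bundle at", bundle)
```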
ID: 64586 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,700,823
RAC: 9,977
Message 64587 - Posted: 4 Oct 2021, 13:28:29 UTC - in response to Message 64585.  

A replacement version of BOINC requires the use of tools only authorised for use at the University of California, Berkeley. We'll have to wait for somebody to go to the lab - yes, you have to believe that - and there are signs that it may not take too long after that.
ID: 64587 · Report as offensive     Reply Quote
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64596 - Posted: 8 Oct 2021, 20:30:43 UTC - in response to Message 64585.  

We are stuck until Berkeley wakes up?


It is now Friday so they have had all week.
Since I have no problems like this from WCG, Rosetta@Home, or Universe@Home, why is it that CPDN has this problem? Do none of the other projects use certificates?
ID: 64596 · Report as offensive     Reply Quote
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64598 - Posted: 9 Oct 2021, 1:29:22 UTC - in response to Message 64596.  
Last modified: 9 Oct 2021, 1:29:41 UTC

There are some problems at Oxford that aren't affecting what little crunching is happening.
No need to worry, it'll be dealt with in time.
ID: 64598 · Report as offensive     Reply Quote
KAMasud

Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 64600 - Posted: 9 Oct 2021, 5:35:45 UTC

WU N216 only forty are in contact with the server. N144, only two are in contact with the server. Why cannot it be possible to clear up all these WU's out there in the wide blue yonder during this breakdown or whatever of CPDN? 25000 WU's of Wah2 are silent. So are others.
ID: 64600 · Report as offensive     Reply Quote
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,002,360
RAC: 21,497
Message 64601 - Posted: 9 Oct 2021, 7:40:14 UTC - in response to Message 64600.  

WU N216 only forty are in contact with the server. N144, only two are in contact with the server. Why cannot it be possible to clear up all these WU's out there in the wide blue yonder during this breakdown or whatever of CPDN? 25000 WU's of Wah2 are silent. So are others.


That is 40 or 33 in last 24 hours actually completing tasks, rather than in contact with the server. Also doing what you suggest needs someone to actually be physically present at the computer due to the way Oxford configure their firewall. A lot of those WU's will be quietly trundling away, occasionally gaining credit.

Besides what you suggest has been put to those in Oxford many, many times in the past. I don't see policy on this changing unless there is a major change of personnel in charge of the project and even then it isn't guaranteed that they will take a different view.
ID: 64601 · Report as offensive     Reply Quote
Aurum
Joined: 15 Jul 17
Posts: 99
Credit: 18,701,746
RAC: 318
Message 64602 - Posted: 9 Oct 2021, 14:19:59 UTC - in response to Message 64600.  
Last modified: 9 Oct 2021, 14:26:34 UTC

WU N216 only forty are in contact with the server. N144, only two are in contact with the server. Why cannot it be possible to clear up all these WU's out there in the wide blue yonder during this breakdown or whatever of CPDN? 25000 WU's of Wah2 are silent. So are others.
What does that mean, "in contact with the server?" I have 17 N216 WUs running and 63 Waiting to Run. Why aren't over 80 N216 WUs in contact with the server???

Oops, DJ explained in the following post I'd yet to read.

It's completely irrational to issue WUs with a 1 year deadline. They should have a one month deadline and have started or get a server abort. How many WUs go out and never come back??? Think about the aggregators like Charity Engine and Science United.
ID: 64602 · Report as offensive     Reply Quote
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,002,360
RAC: 21,497
Message 64603 - Posted: 9 Oct 2021, 15:49:24 UTC - in response to Message 64602.  

It's completely irrational to issue WUs with a 1 year deadline. They should have a one month deadline and have started or get a server abort. How many WUs go out and never come back??? Think about the aggregators like Charity Engine and Science United.


Personally I agree the deadline should be much shorter but as moderators we can do no more than let those involved at Oxford know what is being said on the boards and our own opinion.
ID: 64603 · Report as offensive     Reply Quote
Bryn Mawr

Joined: 28 Jul 19
Posts: 149
Credit: 12,830,559
RAC: 228
Message 64604 - Posted: 9 Oct 2021, 17:36:05 UTC - in response to Message 64602.  

WU N216 only forty are in contact with the server. N144, only two are in contact with the server. Why cannot it be possible to clear up all these WU's out there in the wide blue yonder during this breakdown or whatever of CPDN? 25000 WU's of Wah2 are silent. So are others.
What does that mean, "in contact with the server?" I have 17 N216 WUs running and 63 Waiting to Run. Why aren't over 80 N216 WUs in contact with the server???

Oops, DJ explained in the following post I'd yet to read.

It's completely irrational to issue WUs with a 1 year deadline. They should have a one month deadline and have started or get a server abort. How many WUs go out and never come back??? Think about the aggregators like Charity Engine and Science United.


As a side issue, if you’re running 17 CPDN WUs at a time, 63 WUs reserve is over a month’s worth. Any particular reason for holding that many?
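
The arithmetic behind "over a month's worth" can be sketched in a few lines of Python. The run time per task is not given in this thread, so `days_per_task` is an assumption the reader has to supply, and both function names are hypothetical.

```python
def days_of_reserve(queued, concurrent, days_per_task):
    """Estimate how long a queue lasts when `concurrent` tasks run at once."""
    return queued / concurrent * days_per_task


def break_even_days_per_task(queued, concurrent, horizon_days=30):
    """Run time per task at which the queue lasts exactly `horizon_days`."""
    return horizon_days * concurrent / queued


if __name__ == "__main__":
    # 63 waiting and 17 running are the figures quoted above.
    print(round(break_even_days_per_task(63, 17), 1))
    # Assumed run time of 10 days per task, purely for illustration.
    print(round(days_of_reserve(63, 17, 10), 1))
```

With 63 tasks waiting and 17 running, the queue is about 3.7 rounds deep, so it exceeds a month as soon as a task takes more than roughly 8 days.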
ID: 64604 · Report as offensive     Reply Quote
klepel

Joined: 9 Oct 04
Posts: 82
Credit: 69,922,704
RAC: 8,182
Message 64605 - Posted: 9 Oct 2021, 17:42:30 UTC

Let's try with 4 months... I produced two ghost WUs by re-attaching to the project on Thursday (https://www.cpdn.org/results.php?hostid=1522605) after I disconnected the monitor to try it on another computer. The 2 WUs will never be processed on this computer at all, as all files are wiped out!
ID: 64605 · Report as offensive     Reply Quote