Thread 'Nearly there'

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58286 - Posted: 23 Jun 2018, 19:59:06 UTC

Message from Andy just under an hour ago:

Hi All,


A brief update for you:


Services are gradually being restored. You will see that the www.cpdn.org/cpdnboinc site is now back, and the message boards are also back. However, the project is still offline. I am currently working on a non-trivial issue with the Apache configuration of the CGI link; without this link working, the project cannot be brought back online.


Best regards,

Andy
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58287 - Posted: 25 Jun 2018, 15:09:18 UTC

As an academic exercise (the more so because it was done from Linux), I tried attaching to the old URL today and succeeded. However, attempting to update gives:
Mon 25 Jun 2018 15:52:07 BST | climateprediction.net | Scheduler request failed: Couldn't connect to server


No great surprise, but in anticipation I have set my backup projects to no new tasks. When the server response is more positive, I will try switching back to WINE to get some work.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58289 - Posted: 26 Jun 2018, 11:12:45 UTC

More from Andy:

Hi All,


Just to let you all know: the main project has now been restored. I have now rebuilt all the main project servers and services, and re-enabled the project. The project is now running from a new server: 'caerus.oerc'.


The project is currently running from a master database only; there is no slave server. The project has ordered a new machine to replace the slave database. Until the slave database machine arrives, the project will be taken down at regular intervals to take a dump of the database.


Please let me know if you spot any issues.

Best regards,

Andy


I am still getting the "Internet access OK - project servers may be temporarily down" message, despite the servers being shown as running on their status page. I have let Andy know about this. He suggested reconnecting, but that hasn't made any difference (yet).
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58290 - Posted: 26 Jun 2018, 11:31:53 UTC

And there are now tasks showing as ready to send!
PDW

Joined: 29 Nov 17
Posts: 82
Credit: 14,461,108
RAC: 90,797
Message 58291 - Posted: 26 Jun 2018, 12:31:51 UTC - in response to Message 58290.  

When is the results database going to be updated?

I have a load of tasks showing as in progress in my website account that I have already run, completed and returned. Also, my credit is down by about 1.7 million from what it had been.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58292 - Posted: 26 Jun 2018, 12:55:01 UTC - in response to Message 58291.  

I expect there will be a few teething problems before this is all sorted out. I don't know what, if anything, was lost completely through the corruption of the database caused by the problems with the VM.
JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 58293 - Posted: 26 Jun 2018, 14:09:39 UTC - in response to Message 58292.  
Last modified: 26 Jun 2018, 14:12:52 UTC

Tried adding CP back into my projects list. The scheduler request failed: Peer certificate cannot be authenticated with given CA certificates.

6/26/2018 10:01:19 AM | climateprediction.net | Master file download succeeded
6/26/2018 10:01:24 AM | climateprediction.net | Sending scheduler request: Project initialization.
6/26/2018 10:01:24 AM | climateprediction.net | Requesting new tasks for CPU and AMD/ATI GPU
6/26/2018 10:01:26 AM | | Project communication failed: attempting access to reference site
6/26/2018 10:01:26 AM | climateprediction.net | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates
6/26/2018 10:01:28 AM | | Internet access OK - project servers may be temporarily down.
6/26/2018 10:03:13 AM | climateprediction.net | Sending scheduler request: Project initialization.
6/26/2018 10:03:13 AM | climateprediction.net | Requesting new tasks for CPU and AMD/ATI GPU
6/26/2018 10:03:15 AM | | Project communication failed: attempting access to reference site
6/26/2018 10:03:15 AM | climateprediction.net | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates
6/26/2018 10:03:16 AM | | Internet access OK - project servers may be temporarily down.

Hope new WUs will be available soon and that they won’t be zombies from 2 years ago.
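For anyone comparing event logs across machines, the lines above follow a simple "timestamp | project | message" layout, so the failure reasons can be tallied mechanically. What follows is only a minimal sketch in Python, assuming that three-field layout holds for every line; the names `parse_line` and `count_scheduler_failures` are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches event-log lines of the form seen in this thread:
#   <timestamp> | <project> | <message>
# The project field may be empty for client-wide messages.
LINE_RE = re.compile(
    r"^(?P<when>[^|]+?)\s*\|\s*(?P<project>[^|]*?)\s*\|\s*(?P<message>.*)$"
)
FAIL_PREFIX = "Scheduler request failed:"


def parse_line(line):
    """Split one event-log line into (timestamp, project, message), or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("when"), match.group("project"), match.group("message")


def count_scheduler_failures(lines):
    """Count scheduler failures per (project, reason)."""
    failures = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # not an event-log line; skip it
        _, project, message = parsed
        if message.startswith(FAIL_PREFIX):
            reason = message[len(FAIL_PREFIX):].strip()
            failures[(project, reason)] += 1
    return failures


if __name__ == "__main__":
    sample = [
        "6/26/2018 10:01:24 AM | climateprediction.net | Sending scheduler request: Project initialization.",
        "6/26/2018 10:01:26 AM | | Project communication failed: attempting access to reference site",
        "6/26/2018 10:01:26 AM | climateprediction.net | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates",
        "6/26/2018 10:03:15 AM | climateprediction.net | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates",
    ]
    for (project, reason), n in count_scheduler_failures(sample).items():
        print(f"{project}: {reason} (x{n})")
```

Grouping by reason makes it easy to tell a certificate failure apart from a plain "Couldn't connect to server", which are different problems even though both end in the same "project servers may be temporarily down" line.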
Andreas38871

Joined: 10 Aug 05
Posts: 4
Credit: 2,859,877
RAC: 467
Message 58294 - Posted: 26 Jun 2018, 14:44:30 UTC - in response to Message 58293.  
Last modified: 26 Jun 2018, 14:45:18 UTC

Same here.



Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58296 - Posted: 26 Jun 2018, 15:15:32 UTC

I am now getting this error as well, instead of the "project servers may be temporarily down" message. I have let Andy know. About half an hour ago, he re-implemented HTTPS, and something is obviously wrong with it.

I am reminded of how difficult it was to get CPDN running on Linux in the early days of 64-bit Linux.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58297 - Posted: 26 Jun 2018, 15:39:49 UTC
Last modified: 26 Jun 2018, 16:03:20 UTC

Andy has made some changes and I have let him have the new output from the event log saying what isn't working! I guess there will shortly be another change based on my own or others' feedback.

And the tasks ready to download may be a red herring as 300+ of them are hadam3cs tasks which are probably very old re-issues.
mmonnin

Joined: 28 May 17
Posts: 49
Credit: 17,297,301
RAC: 6,138
Message 58298 - Posted: 26 Jun 2018, 23:42:30 UTC

I'm guessing this is an old backup as half the credit is missing. Free-DC has me at 1,170,432 but here I'm at 570k. There's a bunch of other work w/o credit on top of that as well. Hopefully the work returned while the server was accepting work is still useful data.
Panthersprung

Joined: 14 Jan 07
Posts: 1
Credit: 3,536,199
RAC: 0
Message 58299 - Posted: 27 Jun 2018, 4:33:08 UTC

I do not get new tasks. Also, the finished parts are not being uploaded. Is this a general problem or a local one?
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58300 - Posted: 27 Jun 2018, 5:35:26 UTC - in response to Message 58299.  

I do not get new tasks. Also, the finished parts are not being uploaded. Is this a general problem or a local one?


This is a general problem. There is a problem with the HTTPS authentication. Andy believes he has the answer but has gone over the number of authentication certificates that can be issued in a given time period. I am not sure what the limit is or when the time period expires, allowing him to try again. With luck, it will be sometime today.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58301 - Posted: 27 Jun 2018, 8:02:17 UTC

Some progress. Able to attach what is effectively a new computer to the project now. Still getting authentication issues on trying to get work/update. I have sent Andy the bits from the event log that are relevant.
zaphod80013

Joined: 25 May 12
Posts: 8
Credit: 7,628,145
RAC: 3,979
Message 58302 - Posted: 29 Jun 2018, 0:40:47 UTC

Not sure if this is related to the ongoing issues, but my account on bamstats shows a climate prediction credit for today of -995,953. That's almost 75% of my credits since joining the project in 2012 wiped out. (The project was dormant for a while but reactivated earlier this year, Jan/Feb.) What's going on?
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58303 - Posted: 29 Jun 2018, 6:44:53 UTC - in response to Message 58302.  

About 6 weeks ago, it was discovered that a problem with the virtual machine that ran the database in Oxford was corrupting said database. I don't know how far back Andy has had to go to ensure a clean backup.

I don't know what jiggery-pokery (technical term) is possible with regard to restoring information about tasks processed since the backup and the associated credit. I would guess that most of this is recoverable, but that any work on sorting that part of the project out will not happen until after Andy has got the project up and running again.
Art Masson

Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 0
Message 58304 - Posted: 29 Jun 2018, 13:13:59 UTC

Getting this now... looks like some progress, but not there yet:

6/29/2018 8:11:39 AM | climateprediction.net | update requested by user
6/29/2018 8:11:43 AM | climateprediction.net | Fetching scheduler list
6/29/2018 8:11:48 AM | climateprediction.net | Master file download succeeded
6/29/2018 8:11:53 AM | climateprediction.net | Sending scheduler request: Requested by user.
6/29/2018 8:11:53 AM | climateprediction.net | Not requesting tasks: don't need (CPU: job cache full; NVIDIA GPU: not highest priority project)
6/29/2018 8:11:54 AM | | Project communication failed: attempting access to reference site
6/29/2018 8:11:54 AM | climateprediction.net | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates
6/29/2018 8:11:56 AM | | Internet access OK - project servers may be temporarily down.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4538
Credit: 19,005,674
RAC: 21,647
Message 58305 - Posted: 29 Jun 2018, 14:55:32 UTC - in response to Message 58304.  

Hi Art, Andy is aware of this. He needs to "get a new certificate from one of the listed providers" and presumably install it into the BOINC server software. Once that is done, fingers crossed, we might be ready to go. Then again, we thought things might be ready a few days ago, until some of us actually tried to attach to the project or to update. At least we are getting further up the chain before an error message appears.
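For the curious, the "Peer certificate cannot be authenticated with given CA certificates" text means the client could not build a chain from the server's certificate to a certificate authority it trusts. The same kind of check can be reproduced outside BOINC. Below is a minimal sketch in Python, not anything Andy or the BOINC client actually runs; the function name `check_certificate` and its status strings are invented for illustration. Passing `cafile` tests against one specific CA bundle rather than the system trust store, which matters if the client carries its own bundle (an assumption worth checking on your own install).

```python
import socket
import ssl


def check_certificate(host, port=443, cafile=None, timeout=10.0):
    """Try a TLS handshake and report whether the certificate chain verifies.

    Returns (status, detail), where status is one of
    "ok", "cert_verify_failed", "tls_error" or "unreachable".
    cafile=None uses the system trust store; pass a path to test
    against one specific CA bundle instead.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Verification (chain and hostname) happens during this handshake.
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return "ok", tls.version() or ""
    except ssl.SSLCertVerificationError as exc:
        # The chain could not be verified against the chosen CA set.
        return "cert_verify_failed", exc.verify_message
    except ssl.SSLError as exc:
        return "tls_error", str(exc)
    except OSError as exc:
        # Connection refused, timeout, DNS failure and similar.
        return "unreachable", str(exc)
```

Calling `check_certificate("www.cpdn.org")` once with the default trust store and once with a `cafile` pointing at another bundle shows whether the two disagree. A disagreement of that kind is one plausible way for a site to load in a browser while a client with a different CA list rejects it.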
Brummig

Joined: 3 Nov 05
Posts: 26
Credit: 687,388
RAC: 529
Message 58306 - Posted: 29 Jun 2018, 16:14:30 UTC - in response to Message 58302.  
Last modified: 29 Jun 2018, 16:32:22 UTC

I've just lost 43,376 credits from CPDN, a drop of 15%. That's small beer compared with zaphod80013, but it's significant enough to result in my BOINC world ranking dropping 699 places today, instead of its usual increase. Will credits be restored?

And I still can't report my last WU as complete, even though some people have been, according to Boinc Stats.
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58307 - Posted: 29 Jun 2018, 18:22:17 UTC

There's no information available about the fine details; however, the secure URL is working for browsers.
This is: Main page

The non-secure (HTTP) URL may work for some people and not for others. Andy has stopped the automatic forwarding of HTTP to HTTPS.
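Whether a server is still forwarding HTTP to HTTPS can be checked by sending one plain-HTTP request and looking at the reply without following it. This is only a rough sketch in Python under that assumption; `is_https_redirect` and `probe_http` are hypothetical helper names, and the host to probe is whatever address your client is attached with.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_https_redirect(status, location):
    """True if a reply (status code, Location header) forwards to HTTPS."""
    if status not in REDIRECT_CODES or not location:
        return False
    return urlsplit(location).scheme.lower() == "https"


def probe_http(host, path="/", timeout=10.0):
    """Send one plain-HTTP HEAD request; redirects are not followed."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

With forwarding switched off, `is_https_redirect(*probe_http(host))` should come back `False`, and a client attached over plain HTTP never reaches the certificate check at all.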

There's really only one way to find out what will happen to various items in the future, and that is to wait until the future arrives and then to look around.

So could everyone please remain calm until the boat docks and is securely moored.
:)