Thread 'What Happened ???'
Lockleys
Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 58172 - Posted: 4 May 2018, 15:17:25 UTC - in response to Message 58168.  


Climateprediction.net must hold the honour of probably being the worst maintained project in BOINC history.


I would suggest that it also has the honours of being both the most complicated BOINC project and also the most worthwhile for planetary survival.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 58173 - Posted: 4 May 2018, 17:53:31 UTC - in response to Message 58168.  

Climateprediction.net must hold the honour of probably being the worst maintained project in BOINC history.



You always have the option of running something else. Why don't you chase asteroids for a while? I understand WCG is starting a climate change project. In a few months you will have a choice.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58174 - Posted: 4 May 2018, 19:17:35 UTC - in response to Message 58168.  

Climateprediction.net must hold the honour of probably being the worst maintained project in BOINC history.

Not at all. It is just the most recent one with problems. They always get our attention the most.

As for WCG, I will be building a new machine for it, and will do both. It is not either/or, but they address different questions.

CJ Xuereb
Joined: 24 Oct 16
Posts: 6
Credit: 1,866,525
RAC: 1,022
Message 58176 - Posted: 4 May 2018, 22:55:57 UTC - in response to Message 58173.  

You always have the option of running something else. Why don't you chase asteroids for a while? I understand WCG is starting a climate change project. In a few months you will have a choice.


I have been chasing Asteroids for over 3 years already. :)

However, I prefer to run a project where the benefit to humanity is more immediate, as is currently the case with climate change.

Don't get me wrong. I think Climateprediction.net is a great project. It's just a pity that it cannot be run better.

CJ Xuereb
Joined: 24 Oct 16
Posts: 6
Credit: 1,866,525
RAC: 1,022
Message 58177 - Posted: 4 May 2018, 23:46:16 UTC

A project of this importance in this day and age needs to run better.

I am sure that Andy is doing the best he can to restore the project, but maybe there is only so much he can do with what he has available.

What will it take to make this project run better?

A funding drive to improve or replace the infrastructure, hire permanent staff, reprogram the applications?

The last thing I want is to see this project collapse because currently I do not think there is a better climate change project out there, but I fear this could become a reality if the current problems or lack of resources continue to plague the project.

Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 58179 - Posted: 9 May 2018, 16:07:56 UTC

test post.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58184 - Posted: 12 May 2018, 5:34:09 UTC - in response to Message 58177.  



A funding drive to improve or replace the infrastructure, hire permanent staff, reprogram the applications?



cpdn is just one research project in one department of the University of Oxford.
Most of the uni probably don't even know it exists.

So you'll have to lower your expectations a little.

(And the current instability could be anywhere in or between the many buildings that make up the uni - cpdn uses storage space anywhere it can get it.)
You can see a copy of Andy's post about it here.

crashtech
Joined: 1 Jun 17
Posts: 13
Credit: 30,434,235
RAC: 42,406
Message 58186 - Posted: 12 May 2018, 15:41:42 UTC

I have hundreds of tasks in various stages of completion which were suspended while the project got its act back together.

But it's clear that is not going to happen. Now that the latest "fix" has caused CPDN to vanish from the 2018 Formula BOINC Marathon, showing negative 4 billion points for the year, I am, with a mix of regret and disaffection, going to abort many hundreds of hours of ongoing CPU work done for this project.

I'm sure some will find my attitude appalling; it's fine if you want to judge my commitment to climate science and find it lacking. But do consider that there was an implicit agreement when CPDN set up a points system for those who like to play that way, and it's not me that broke that agreement.

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58189 - Posted: 12 May 2018, 16:55:46 UTC - in response to Message 58093.  

If the project is supposedly back on line, why do I keep getting this?

Unable to connect

Firefox can’t establish a connection to the server at www.cpdn.org.

The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer’s network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.

In the last year I have received only a few work units. Previously I could keep at least three cores busy 24/7.

16-May-2017 08:55:31 Scheduler request completed: got 2 new tasks
19-May-2017 10:29:40 Scheduler request completed: got 2 new tasks
19-May-2017 11:37:53 Scheduler request completed: got 2 new tasks
30-May-2017 19:07:29 Scheduler request completed: got 3 new tasks
05-Jun-2017 07:36:30 Scheduler request completed: got 2 new tasks
22-Jun-2017 05:04:36 Scheduler request completed: got 2 new tasks
22-Jun-2017 20:39:42 Scheduler request completed: got 1 new tasks
27-Jun-2017 21:16:39 Scheduler request completed: got 2 new tasks
02-Jul-2017 13:35:06 Scheduler request completed: got 1 new tasks
11-Jul-2017 23:25:16 Scheduler request completed: got 3 new tasks
17-Jul-2017 01:08:14 Scheduler request completed: got 1 new tasks
18-Jul-2017 04:31:22 Scheduler request completed: got 1 new tasks
18-Jul-2017 11:07:35 Scheduler request completed: got 2 new tasks
19-Jul-2017 02:39:24 Scheduler request completed: got 2 new tasks
28-Jul-2017 12:02:20 Scheduler request completed: got 2 new tasks
02-Aug-2017 13:46:50 Scheduler request completed: got 2 new tasks
11-Aug-2017 03:22:21 Scheduler request completed: got 1 new tasks
18-Aug-2017 03:25:50 Scheduler request completed: got 1 new tasks
24-Aug-2017 21:00:01 Scheduler request completed: got 1 new tasks
25-Aug-2017 22:00:04 Scheduler request completed: got 1 new tasks
02-Sep-2017 02:56:56 Scheduler request completed: got 1 new tasks
13-Sep-2017 06:31:30 Scheduler request completed: got 1 new tasks
14-Sep-2017 05:33:41 Scheduler request completed: got 1 new tasks
20-Sep-2017 17:49:19 Scheduler request completed: got 2 new tasks
11-Oct-2017 17:05:29 Scheduler request completed: got 2 new tasks
14-Feb-2018 06:43:34 Scheduler request completed: got 2 new tasks
28-Mar-2018 19:39:29 Scheduler request completed: got 2 new tasks
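
For anyone who wants to total up a log like the one above, here is a minimal Python sketch that counts how many tasks arrived per month. It is only an illustration: the function name `tasks_per_month` and the `sample` text are made up for this example, the sample holds just a few of the lines listed above, and the month abbreviations are assumed to parse under an English (or default "C") locale.

```python
import re
from collections import Counter
from datetime import datetime

# Matches event-log lines of the form shown above.
# \s* tolerates missing or extra whitespace after the timestamp.
LINE_RE = re.compile(
    r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})\s*"
    r"Scheduler request completed: got (\d+) new tasks"
)


def tasks_per_month(log_text):
    """Return a Counter mapping 'YYYY-MM' to the number of tasks received."""
    totals = Counter()
    for match in LINE_RE.finditer(log_text):
        # %b parses the abbreviated month name (locale-dependent).
        stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
        totals[stamp.strftime("%Y-%m")] += int(match.group(2))
    return totals


# A few lines copied from the log above (hypothetical sample, not the full log).
sample = """\
19-May-2017 10:29:40 Scheduler request completed: got 2 new tasks
19-May-2017 11:37:53 Scheduler request completed: got 2 new tasks
30-May-2017 19:07:29 Scheduler request completed: got 3 new tasks
14-Feb-2018 06:43:34 Scheduler request completed: got 2 new tasks
28-Mar-2018 19:39:29 Scheduler request completed: got 2 new tasks
"""

for month, count in sorted(tasks_per_month(sample).items()):
    print(month, count)
```

Pasting the whole log into `sample` should make the gap between October 2017 and February 2018 easy to see, since months with no scheduler replies simply do not appear.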

bcavnaugh
Joined: 28 Jul 14
Posts: 14
Credit: 3,522,445
RAC: 0
Message 58190 - Posted: 12 May 2018, 17:15:35 UTC
Last modified: 12 May 2018, 17:16:15 UTC

New URL:
http://ithaqua.oerc.ox.ac.uk/cpdnboinc/

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58192 - Posted: 12 May 2018, 18:54:05 UTC

The project is not back on line. That's going to take weeks as has been said several times.

The project is running on the backup server, and Andy is having problems with some of the many scripts, which is why it's hard to log in at the moment.
As the moderators come into their day time, we're testing things, and emailing Andy, who won't get them until the world rotates into his day time.

And there is no work at present, so the only thing that people can do is return data, and complain on this board.

The url of the backup server is:
http://ithaqua.oerc.ox.ac.uk/cpdnboinc/


This can be seen thusly:
Go to the climateprediction.net main page
In the bottom right corner, click on BOINC User Pages
At the top left, under Join, 4th line down

-------------

And it's at this point where some people are having problems, due to the script errors.

Taking up an outdoor hobby away from computers is a good idea at this point. Marathon running perhaps. When you get back things may be better. :)

But we'll STILL be using the backup server for weeks yet, while Andy works on the original problem.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58194 - Posted: 14 May 2018, 15:40:41 UTC - in response to Message 58192.  

And there is no work at present, so the only thing that people can do is return data, and complain on this board.

Complaining is fun for a while, but it has its limits.

I don't need more work downloaded, or even to upload trickles. (They at least are working.) But if they could kick whatever needs to be kicked for the limited purpose of reporting the work already completed, I would consider it a reasonable compromise, and we could await a more complete cure later.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58195 - Posted: 14 May 2018, 22:10:02 UTC - in response to Message 58194.  

There is NO original database server at present. It was part of what became unstable.
Andy is currently slowly and methodically copying data off it and building a new "Main database server". Which takes time.

So there's nowhere for any data that people have from the old Main server to go.
Only data from models that were downloaded from the backup server can be returned to the backup server.

If all of the models that people reported downloading/having trouble getting parts of, were from the old main server, then people are stuck with any data that needs to be returned.

Address of the server currently being used (the old backup server):
http://ithaqua.oerc.ox.ac.uk/cpdnboinc/

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58196 - Posted: 14 May 2018, 23:40:06 UTC - in response to Message 58195.  

Let me ask this: If I have completed the work units and uploaded all trickles, has all the science been returned at that point? Or do all the science results get finally uploaded only when the work unit is reported?

In other words, is the reporting only for credit purposes?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58197 - Posted: 15 May 2018, 8:47:15 UTC - in response to Message 58196.  
Last modified: 15 May 2018, 8:52:04 UTC

I'm probably going to get this wrong, but I think it works like this:

The zips contain the science data, and (mostly) go to servers outside of Oxford. They never show up on any BOINC backend pages where they can be seen.

The trickles, while they can contain small amounts of science data, aren't used for this now as far as I know. The trickle_up "files" are just for counting to produce credits.

The "Reporting" message is sent to some backend BOINC process to tell it that this particular task has completed, and there's no need to keep monitoring it. The task status is then changed on the web page associated with that task. (The Over / Completed / Success part.)

If the part of BOINC that's monitoring the task doesn't get a "completed" message from the user's computer, then it will continue to monitor it until the "deadline", and then report it as abandoned, even though all of the science data, and all of the trickles, have been received.
And the task will then be re-issued, unless it's reached its quorum number.
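
The description above can be sketched roughly in Python. This is an illustration of the behaviour as described in this post only, not BOINC's actual server code: the `Task` class, the `server_status` function, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    deadline: float                  # time after which an unreported task is given up on
    reported: bool = False           # True once the client's "report" message arrives
    zips_received: bool = False      # science data; not consulted for task status
    trickles_received: bool = False  # credit counting; not consulted for task status


def server_status(task, now, completed_copies, quorum):
    """Return (status, reissue) following the description in the post above."""
    if task.reported:
        # The report tells the backend it can stop monitoring this task.
        return "completed", False
    if now < task.deadline:
        return "in progress", False
    # Deadline passed without a report: treated as abandoned even if all
    # zips and trickles arrived. Re-issued unless the quorum is already met.
    return "abandoned", completed_copies < quorum


# All data uploaded, but the report never got through.
stuck = Task(deadline=100.0, zips_received=True, trickles_received=True)
print(server_status(stuck, now=150.0, completed_copies=0, quorum=1))
```

The point of the sketch is that only the `reported` flag and the deadline decide the status; whether the zips and trickles arrived does not enter into it.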

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58204 - Posted: 20 May 2018, 20:52:16 UTC - in response to Message 58192.  

I find these posts confusing.
Hi All,

Just to let you know that the project is now back online, rather than running from the backup project server, we are running from the main project server. The infrastructure still remains at risk due to ongoing instabilities in the main VMware/GPFS infrastructure.

Best regards,

Andy
3 May 2018

The project is not back on line. That's going to take weeks as has been said several times.

The project is running on the backup server, and Andy is having problems with some of the many scripts, which is why it's hard to log in at the moment.
As the moderators come into their day time, we're testing things, and emailing Andy, who won't get them until the world rotates into his day time.
12 May 2018

I assume the second one supersedes the first. Since nothing at the usual URLs works at all:

Unable to connect

Firefox can’t establish a connection to the server at www.cpdn.org.

The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer’s network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.


This has been the case for weeks or months, but especially after 4 May 2018.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58205 - Posted: 20 May 2018, 21:20:35 UTC
Last modified: 20 May 2018, 21:28:25 UTC

There are 3 posts from Andy about this: 2 May, 3 May, 8 May.

To get the three in order, it may be easier to read them on the BOINC site, at the start of this thread, as I don't have email access on "this" computer, and don't have cpdn access on the computer where I do have access to email.

So copying data from email to message board gets messy at the moment.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 58206 - Posted: 21 May 2018, 5:59:21 UTC

The three posts are in order below if that is helpful to anyone.

The CPDN project is now offline. I will be taking a database dump of the database in order to resurrect the master-slave relationship on the two database servers in the project. In order to do this I need a database where no transactions are taking place. Once this is complete, it is likely we will start the project from the backup project server, rather than the main project server, due to ongoing instabilities in the main GPFS infrastructure.


Just to let you know that the project is now back online, rather than running from the backup project server, we are running from the main project server. The infrastructure still remains at risk due to ongoing instabilities in the main VMware/GPFS infrastructure.


As you might have noticed, the project went down again over the weekend. Again this was due to the main application machine on the underlying VMware/GPFS infrastructure. We have redirected the project services to the backup server (this is currently on HTTP only), please let me know if you have problems connecting.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58207 - Posted: 21 May 2018, 14:25:53 UTC - in response to Message 58206.  

We have redirected the project services to the backup server (this is currently on HTTP only), please let me know if you have problems connecting.

Apparently this redirection is not entirely transparent to the users. At least I have 10 completed tasks waiting to be reported, even though all the trickles and zips have gone through OK.

Is there anything I can do on this end to fix it? I am attached to:
Master URL http://climateprediction.net/

Albert H.
Joined: 18 Feb 06
Posts: 73
Credit: 61,775,405
RAC: 46,188
Message 58208 - Posted: 21 May 2018, 19:43:16 UTC - in response to Message 58207.  

Hello, this must be very serious. Since I began participating in CPDN, there have been lots of outages, errors, and so on, but nothing like this.

As new work is unavailable, I am thinking of stopping my computers until new work is available. As CPDN knows my email address, is there a possibility of informing me when new work is there, or must I go, like a beggar, looking every day to see if I can fetch something?
Sorry,