Message boards : Number crunching : News and Announcements
Author | Message |
---|---|
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Andy reports he was advised that the additional server is installed and being configured. It should be online tomorrow (Thursday) morning. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Jonathan says: Hi, I have to take the project database off-line for the first of (hopefully only) two upgrades at the end of this week. I will schedule the downtime from 12 Noon GMT on 10 Jan until 12 Noon on 13 Jan 2014. There will be no database access during this downtime. For those who are interested, this is the first step towards retiring our old database server, and moving her functions over to the virtualised infrastructure at the Oxford e-Research Centre. Jonathan Miller CPDN System-Administrator Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
You will see that things are up and running again apart from one upload server. Jonathan now needs to make an extra backup and will be closing down the databases again within an hour from now. He expects this extra backup to be completed within 24 hours. When this forum and consequently this News thread are down we always try to post updates on the boinc_dev forum in the Projects section where there's a thread for news of server outages from all projects here. Cpdn news |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,904,898 RAC: 2,026 |
There has been a recent unintended release of old HADCM3N models. If you have a model whose work unit has any result marked as "no resubmission" then please abort it. Edit: This should apply only to HADCM3N models whose names begin with '7' - such as this one of mine hadcm3n_7ch1_1980_40_008427448_1. |
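For anyone who wants to check a list of task names rather than eyeball them, here is a minimal sketch of the rule above. It is only an illustration: the function name `should_abort` is made up, and it assumes that "names begin with '7'" refers to the segment right after `hadcm3n_` (as in the example name in the post) and that you copy the work unit note by hand from the Work unit page.

```python
def should_abort(task_name: str, workunit_note: str) -> bool:
    """Return True if a task matches the abort rule described in the post.

    Assumptions (not confirmed by the project): the '7' is the first
    character of the name segment after 'hadcm3n_', and the note text is
    whatever the Work unit page shows, typed or pasted in by the user.
    """
    parts = task_name.lower().split("_")
    # Expect names shaped like hadcm3n_7ch1_1980_40_008427448_1
    if len(parts) < 2 or parts[0] != "hadcm3n":
        return False
    begins_with_7 = parts[1].startswith("7")
    # Accept "no resubmission", "No Resubmission" and "no re-submission"
    normalized = workunit_note.lower().replace("-", "").replace(" ", "")
    return begins_with_7 and "noresubmission" in normalized


if __name__ == "__main__":
    print(should_abort("hadcm3n_7ch1_1980_40_008427448_1", "no resubmission"))
```

This only tells you which tasks fit the description; the aborting itself is still done by hand in the BOINC Manager.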
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
More about unwanted models: The "no re-submission" only appears the first time it's applied to the data set. Subsequent reappearances caused by the BOINC problem don't have the message. You need to look in the Work unit column to see it. There is now a very long "deadline" to block BOINC's automatic and unwanted re-submissions for a VERY long time. Currently this is the year 2023. So, keep aborting them until they're all gone. And, as there's no real work at the moment, setting cpdn to No new tasks in the Projects tab until there is, will stop you from getting more of these unwanted tasks. Backups: Here |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
A batch of 5,500 HadAM3P PNW tasks was released this morning. They were initially getting download errors (due to an incorrect download URL), but that was fixed at 1240 UTC. The server status page is currently showing that 3,134 have still to be sent. The models are being run with a new version (7.22) of the application. This has a completely new graphics engine and changes to the post-processing. It was tested on CPDN Beta last week where a number of graphics issues were reported. These will be fixed but the timeline for the new batch of work was too tight for that to be done before releasing the work. The known problems are:
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
A batch of work for the RAPIT project (HadCM3N) has been released this afternoon (the workunit for the task I just picked up was created an hour ago). Unsurprisingly they are disappearing very quickly (according to the server status page there are 867 left). Edit: 5 minutes later and we're down to 561 left ... "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The data sets for the ANZ Weather@Home project are starting to appear. There's a page about it in the front section, here |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Norwegian member Tullus has designed an extra Server Status page which as well as showing the number of models of each type available has graphs to show model availability through time. Here it is: http://ob.cakebox.net/cpdn_status/server_status.html He's still working on it and discussing it in this forum thread. If Tullus's clever app ever isn't available, CPDN's official Server Status page always has a link in the menu on the left-hand-side of this page. (If uploads or downloads aren't working properly you should still look there to see whether any servers are down.) Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
CPDN Beta: Andy has said: Hi All, The Beta server has a serious issue at present that's preventing it being updated. I am afraid it is going to have to remain offline until we can look at it again tomorrow. Andy Cpdn news |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Warning: BOINC version 7.2.39 has a severe bug which causes file transfer errors. If you're still using this version, please upgrade IMMEDIATELY to 7.2.42. |
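If you look after several machines, a quick way to flag the affected ones is to compare version strings. This is just a sketch: the helper names are made up, and it assumes you have already collected each client's version string yourself (for example from the BOINC Manager's About box). It flags only 7.2.39, the one version named in the warning.

```python
AFFECTED = (7, 2, 39)   # version named in the warning
FIXED = (7, 2, 42)      # version the warning says to upgrade to


def parse_version(text: str) -> tuple:
    """Turn a string like '7.2.39' into a tuple of ints, e.g. (7, 2, 39)."""
    return tuple(int(piece) for piece in text.strip().split("."))


def needs_upgrade(version_text: str) -> bool:
    """True only for the exact version the warning is about."""
    return parse_version(version_text) == AFFECTED


if __name__ == "__main__":
    for v in ["7.2.33", "7.2.39", "7.2.42"]:
        status = "upgrade to 7.2.42" if needs_upgrade(v) else "not the flagged version"
        print(v, "->", status)
```

Tuples are used so that the comparison is numeric per component rather than a plain string comparison.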
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
NO RESUBMISSION tasks are in the mix again. See the second paragraph of mo.v's post here: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7739&nowrap=true#48049 If the HadCM3N task is named 7??? and the Work Unit is marked 'No Resubmission' it should be aborted. Sorry about that. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
A large batch of data sets has become available for a new model type, hadam3pm2. These models use the new "land surface scheme", MOSES II (Surface Exchange Scheme), and are for a project called HYDRA. They will create very large amounts of data to be uploaded, so be warned: zips 1-9 are about 64.5 megabytes each, and zip 10 is just under 95 megabytes. Also, the BOINC-listed values for "Remaining (estimated)" and "Percentage done" are completely incorrect. On my Haswell machine with a 4770K processor, running no other projects, they took 218 hours. |
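To put the zip sizes above in perspective, here is a rough back-of-envelope sketch of the total upload per model. It assumes each of zips 1-9 is about 64.5 MB and treats zip 10 as a full 95 MB, so the total is an upper estimate. The upload speed is a purely hypothetical parameter; plug in your own.

```python
ZIPS_1_TO_9_MB = 64.5   # approximate size of each of zips 1-9 (from the post)
ZIP_10_MB = 95.0        # "just under 95", so this is an upper bound


def total_upload_mb() -> float:
    """Approximate total upload per model, in megabytes."""
    return 9 * ZIPS_1_TO_9_MB + ZIP_10_MB


def upload_hours(total_mb: float, upload_mbit_per_s: float) -> float:
    """Rough transfer time, assuming decimal megabytes and a steady link."""
    seconds = total_mb * 8 / upload_mbit_per_s
    return seconds / 3600


if __name__ == "__main__":
    total = total_upload_mb()
    print(f"Total per model: about {total} MB")   # 9 * 64.5 + 95 = 675.5
    # 1 Mbit/s is an illustrative figure, not a measured one
    print(f"At 1 Mbit/s: about {upload_hours(total, 1.0):.1f} hours")
```

So each completed model means roughly two thirds of a gigabyte going up the line, which is worth knowing if you are on a capped or slow connection.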
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Following up on Les's post on the new MOSES II model, I'd also like to add: 1. These are for Linux and Mac only at this time. 2. The graphics do not work. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
From Jonathan: Hi, |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Message from Jonathan: We are undertaking maintenance work on the Virtual Machine server infrastructure at Oxford e-Research Centre. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
During uploads of EU zips, I've started getting the following messages for cpdn-upload2.oerc: Server is out of disk space, and No space left on server. This started about 20 minutes ago. I've emailed the project people. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
More upload server failures :( The project has been notified. Edit: Just to be clear, they also know about the Feeder failure. |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Staff has been advised about Trickle failures and outages. Andy is looking into it. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
As the Number Crunching threads about credits have attracted a lot of "chatter", I've decided to post here. The credit bloat was due to the script somehow getting run twice, using the same temp file without a "clear" in between. It was said last week that it was going to be re-done, but nothing has happened. There has been no contact with the project people for a while now. The University of Oxford is still on Long Vacation, with, I would guess, minimal staff everywhere. And this would be the time that serious building work would be done. If anything goes wrong with the network of computers, "our" two people may well be required/requested to help out elsewhere for a while. If anyone has an urgent need for constant, accurate, credits to keep track of something, then cpdn is, currently, not the place to be. Hopefully things will get better sometime this decade. :) |
©2025 cpdn.org