Posts by David Wallom

1) Message boards : Number crunching : Changes to website.
Message 71448
Posted 17 Sep 2024 by David Wallom
Hi Richard,

Yes, this has been returned to its pre-issue status.

KR

David
2) Message boards : Number crunching : Changes to website.
Message 71444
Posted 17 Sep 2024 by David Wallom
Hello,

In answer to the question of whether this was foreseen: no, it was not. We were only changing the name of the server here in Oxford, having done so twice before with none of these issues. It appears that the overall project name was changed by mistake, not just the server name, which has caused these issues. Andy is currently working to roll back these changes so that the project can be restarted with minimal impact.

Kind regards

David
3) Message boards : Number crunching : Changes to website.
Message 71434
Posted 17 Sep 2024 by David Wallom
Hello All,

We have closed the project down completely, though I am uncertain whether this will stop BOINC clients detaching from the project, which is what they are currently doing. We are conducting further investigation.

Apologies for these unnecessary problems.

Kind regards

David
4) Message boards : Number crunching : Upload server is out of disk space
Message 67808
Posted 17 Jan 2023 by David Wallom
Hello Everyone,

We increased the number of concurrent uploads allowed from 50 to 150, and the server did indeed run out of space. This is with 5 parallel transfers, and deletions of successful WUs, from jasmin-upload to the analysis space. We have temporarily restricted the limit back to 100 and are seeing free space increasing, currently 1.5TB out of 24TB. Each of the OpenIFS@Home batches has up to 800GB of successful workunits that we are transferring off, and there are 44 batches.

Thanks for your contributions

David
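
The figures in the post above give a rough upper bound on how much OpenIFS data may still have to pass through the upload server. The following is a minimal sketch of that arithmetic only. It assumes decimal units (1 TB = 1000 GB), that every batch reaches the 800GB maximum (the post gives this only as an upper limit), and that "1.5TB out of 24TB" means free space out of total capacity. The function and variable names are illustrative, not part of any CPDN tooling.

```python
# Figures quoted in the post. Reading "1.5TB out of 24TB" as free space
# out of total capacity is an assumption.
BATCHES = 44
MAX_BATCH_GB = 800.0   # stated as an upper limit per batch
CAPACITY_TB = 24.0
FREE_TB = 1.5


def max_backlog_tb(batches: int, max_batch_gb: float) -> float:
    """Upper bound on data to move, in decimal TB (1 TB = 1000 GB)."""
    return batches * max_batch_gb / 1000.0


def used_fraction(capacity_tb: float, free_tb: float) -> float:
    """Fraction of the upload volume currently occupied."""
    return (capacity_tb - free_tb) / capacity_tb


if __name__ == "__main__":
    backlog = max_backlog_tb(BATCHES, MAX_BATCH_GB)
    used = used_fraction(CAPACITY_TB, FREE_TB)
    print(f"Upper bound across all batches: {backlog:.1f} TB")
    print(f"Upload volume in use: {used:.1%}")
```

Under these assumptions the batches could together exceed the capacity of the upload volume, which is consistent with the need to limit concurrent uploads while transfers are running.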
5) Message boards : Number crunching : The uploads are stuck
Message 67649
Posted 13 Jan 2023 by David Wallom
Hi,

The current limit is 50 concurrent connections.

Cheers

David
6) Message boards : Number crunching : The uploads are stuck
Message 67636
Posted 13 Jan 2023 by David Wallom
Hello All,

Brief update on status.

The upload server is back running and we are currently in the process of transferring ~24TB of built up project results from that system to the analysis datastores. This process is going to take ~5 days running 5 parallel streams (the files are all OpenIFS workunits).

I have asked Andy to restart uploads but to throttle to ensure that our total stored volume does keep decreasing, i.e. our upload rate doesn't exceed our transfer rate. As such we'll be slow for a while but will gradually increase the upload server bandwidth to you guys as we clear batches.

The issue was caused by an initial instability brought about because the system disks for the VMs that run the upload server and the data storage volumes are all hosted in the same physical data system. When the data volumes fill, they affect the performance of the other disks as well. This was exacerbated because they allowed us to create extremely large volumes that were really beyond the capability of the storage system, so we have to move the data internally as well. Not an ideal solution, and we've told JASMIN this.

Thank you for your understanding in what has been a difficult few days.

David
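
The throttling rule described in the post above (keep the upload rate below the transfer rate so the stored volume keeps falling) can be illustrated with a short calculation. This is a sketch under stated assumptions, not the project's actual throttling logic: it takes the quoted ~24TB, ~5 days and 5 streams at face value, treats the rates as constant, and uses hypothetical function names. The 2.4 TB/day upload rate in the example is invented purely for illustration.

```python
from typing import Optional

# Approximate figures quoted in the post.
BACKLOG_TB = 24.0
DRAIN_DAYS = 5.0
STREAMS = 5


def drain_rate_tb_per_day(backlog_tb: float, days: float) -> float:
    """Aggregate transfer rate implied by clearing the backlog in `days`."""
    return backlog_tb / days


def days_to_clear(backlog_tb: float,
                  drain_tb_per_day: float,
                  upload_tb_per_day: float) -> Optional[float]:
    """Days until the backlog is gone, or None if it never shrinks.

    The stored volume only decreases while uploads arrive more slowly
    than transfers remove data.
    """
    net = drain_tb_per_day - upload_tb_per_day
    if net <= 0:
        return None
    return backlog_tb / net


if __name__ == "__main__":
    drain = drain_rate_tb_per_day(BACKLOG_TB, DRAIN_DAYS)
    print(f"Aggregate drain rate: {drain:.2f} TB/day")
    print(f"Per stream: {drain / STREAMS:.2f} TB/day")
    # Hypothetical upload rate of half the drain rate.
    print(days_to_clear(BACKLOG_TB, drain, 2.4))
```

The design point is the `None` branch: once uploads match or exceed the drain rate, no amount of waiting clears the backlog, which is why uploads are restarted slowly and opened up only as batches are cleared.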
7) Message boards : Number crunching : Completed task fails to upload several times over last few days
Message 64136
Posted 6 Jul 2021 by David Wallom
Hi,

Indeed very odd as all of the other uploads for that WU are sitting waiting in the in_progress folder....?

Can you forward that zip to me directly by email please? david.wallom at oerc.ox.ac.uk

regards

David
8) Message boards : Number crunching : BOINC Client Improvements
Message 60633
Posted 11 Jul 2019 by David Wallom
Hello,

The BOINC community has been offered assistance from a design studio to improve the look, feel and functionality of the BOINC client. As such, part of this work would involve user studies/interaction, i.e. with you, the volunteers. Would there be interest in participating in this?

Regards

David
9) Message boards : Number crunching : Upload failures
Message 60526
Posted 1 Jul 2019 by David Wallom
There are now 140+ parallel uploads onto the system.

David
10) Message boards : Number crunching : Upload failures
Message 60525
Posted 1 Jul 2019 by David Wallom
Hello All,

Apologies for the continued unavailability of the jasmin-upload system which we have been clearing out over the weekend. We have cleared 5TB of space since Thursday so will be re-enabling uploads imminently.

We are going to be reconfiguring the data transfer from the upload server to the project storage over this week so that we will be able to take advantage of new capability within the JASMIN system to speed up these transfers in future.

Regards

David
11) Message boards : Number crunching : Upload failures
Message 60475
Posted 27 Jun 2019 by David Wallom
Hello All,

We have currently stopped uploads to the JASMIN upload server to allow the backlog to clear from the system. I will update when it is clearer at what rate this is occurring. One issue alongside this that we are trying to debug in parallel is a bandwidth limitation that we have run into on this system. The operators of the system are struggling to debug it from their side, since our use case is so far outside the normal operating region for the system as a whole (no-one else is generating between 5TB and 6TB per day and trying not only to receive this onto a system but also, in parallel, to move it off the system into other parts of the storage system).

Once today's processing has completed we can give a firm timeline on when the system will return to operation.

regards

David
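
To put the 5TB to 6TB per day mentioned in the post above in network terms, the daily volume can be converted to an average sustained rate. This is a minimal sketch assuming decimal terabytes (1 TB = 10^12 bytes) and a perfectly even spread over 24 hours. Real traffic is likely to be burstier, so peak rates would be higher. The function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def sustained_mbit_per_s(tb_per_day: float) -> float:
    """Average rate in Mbit/s needed to move `tb_per_day` decimal TB in a day."""
    bits_per_day = tb_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    for tb in (5.0, 6.0):
        print(f"{tb:.0f} TB/day is about {sustained_mbit_per_s(tb):.0f} Mbit/s sustained")
```

Because the post describes the same data being moved off the upload server in parallel, the storage system has to sustain roughly this rate in both directions at once.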
12) Message boards : Number crunching : Credits
Message 59629
Posted 13 Feb 2019 by David Wallom
Hello All,

The issue with credit has been traced to the credit script not having been correctly installed following the rebuild of the primary database server after its latest failure before Christmas. We then ran on the backup system for well over a month, but one of the problems noticed was the enforced downtime on the project due to the newly introduced dump schedule, which ensures that we have a usable database backup, unlike previously. Therefore, when we moved back to the primary DB around the 13th Jan, it wasn't noticed that the credit script wasn't operating correctly. This should be fixed now, so credit should appear on a regular basis again.

regards

David
13) Message boards : Number crunching : Credits
Message 59619
Posted 12 Feb 2019 by David Wallom
Hello,

Apologies that the credit issue has still not been fixed since Les last notified the project team. We will be investigating tomorrow, as I had wrongly assumed this would have been fixed by now. Our volunteers are important to us and we understand your frustration. Please bear with us while we try to fix this.

Whilst under the constraints of project deliverables and time, we are reviewing how we as a project interact with these boards to ensure there is a more regular project presence here.

David
14) Message boards : Number crunching : Credits
Message 59618
Posted 12 Feb 2019 by David Wallom
Hi,

The issue yesterday and over the last few days with accessing the servers appears to have been a rogue process from a client somewhere that had launched well over 100 database queries searching for successful workunits on that particular host system. Since restarting both the DB and the scheduler, those tasks have disappeared and have not returned so far. If (when) they return we will have to work out which machine is responsible, but this is tricky with the large number of active WUs we have at the moment, as we have to identify the offending httpd process that is generating the DB query in order to find the IP etc.

David
15) Message boards : climateprediction.net Science : CPDN in 2016 – a look back over the last year
Message 55644
Posted 3 Feb 2017 by David Wallom
Hello All,

As we start 2017, the Science and Technical teams within CPDN & W@H thought it would be good to summarise the past year, to detail the work done and to thank you, the volunteers and moderators, for your contributions.

http://www.climateprediction.net/cpdn-in-2016-a-look-back-over-the-last-year/

Kind Regards

The Oxford Team

