climateprediction.net (CPDN) home page
Thread 'New work discussion - 2'

Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69341 - Posted: 16 Jul 2023, 17:23:24 UTC - in response to Message 69340.  
Last modified: 16 Jul 2023, 17:27:52 UTC

Looks like all the WAHs are now running smoothly.
I currently have 19 EAS uploads of around 118MB each queued waiting for upload7 to start working again. At least my NZ uploads are working.
Mine all say nz25. All my EAS tasks broke due to computer restarts or crashing in the first 10 seconds. The NZ are much faster, either because they are smaller or because I happen (by luck?) to get only two on each of 7 machines. Hopefully being faster means I can finish some before one breaks due to an unauthorized Windows update reboot, which I keep disabling in different ways and M$ keep finding a way around. Somebody really needs to sue them over that gross stupidity.

My Linux virtualboxes await the other variety, when are they coming?
ID: 69341
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69342 - Posted: 16 Jul 2023, 17:33:42 UTC - in response to Message 69341.  

My Linux virtualboxes await the other variety, when are they coming?
Autumn was what Glenn said many posts ago.
ID: 69342
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,880
RAC: 15,098
Message 69345 - Posted: 17 Jul 2023, 10:09:49 UTC
Last modified: 17 Jul 2023, 10:10:09 UTC

Update on EAS25 batch upload issues

The OS upgrade on the Korean server has been finished. Some of the OS packages required by the boinc software were outdated and suspected of causing problems, hence the upgrade.

The upload7 server is not running yet; it needs more work on the BOINC side before uploads can be turned back on. It might be 2 weeks before this happens, so please don't delete the tasks if possible.
---
CPDN Visiting Scientist
ID: 69345
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69346 - Posted: 17 Jul 2023, 10:41:22 UTC - in response to Message 69345.  

Thanks Glenn. Storage isn't a problem for me but I will prioritise the other tasks. As well as two for the smaller region, I got another three of the ten-year ones from testing. I have suspended these till the rest are finished.
ID: 69346
Aurum
Joined: 15 Jul 17
Posts: 99
Credit: 18,701,746
RAC: 318
Message 69525 - Posted: 24 Aug 2023, 14:31:06 UTC - in response to Message 69336.  
Last modified: 24 Aug 2023, 14:31:48 UTC

Weather & climate models are very non-linear. Small differences in numerical calculations can quickly cause big differences in runs of identical code on different hardware (many, many years ago this was a topic of my PhD). There are places in the code where just a single bit difference is enough. For example, for a cloud to form the air must be saturated, so the code computes the saturation at each grid-point and compares it to the value needed for a cloud to form. A single bit difference in that comparison is all you need to have, or not have, a cloud form. A cloud makes significant changes to its local environment.

Differences in the numerics can come from different rounding in the processor or from differences in the numerical libraries the code is linked to. Code errors that read random memory locations can also cause small differences (maybe not enough to crash the model).

There have been studies to look at this in the very early days of CPDN on the long-running climate models.
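As a toy illustration of the single-bit effect described above: the sketch below adds the same three numbers in two different orders and then applies a threshold test. The values, the threshold and the function names (accumulate, cloud_forms) are invented for illustration and are not taken from any CPDN model code; the only assumption is ordinary IEEE 754 double precision.

```python
# Illustrative only: values and threshold are made up, not from any CPDN model.

def accumulate(terms):
    """Add terms left to right in double precision (no compensated summation)."""
    total = 0.0
    for term in terms:
        total += term
    return total


def cloud_forms(humidity, saturation_threshold):
    """Hypothetical stand-in for a model's saturation test at one grid point."""
    return humidity > saturation_threshold


contributions = [0.1, 0.2, 0.3]
forward = accumulate(contributions)             # (0.1 + 0.2) + 0.3
backward = accumulate(reversed(contributions))  # (0.3 + 0.2) + 0.1

threshold = 0.6
print(repr(forward), cloud_forms(forward, threshold))
print(repr(backward), cloud_forms(backward, threshold))
```

The two sums differ only in the last bit, yet one passes the test and the other does not. A different compiler, maths library or processor can change the order or rounding of operations in much the same way.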
Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
ID: 69525
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69526 - Posted: 24 Aug 2023, 14:55:51 UTC

Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate
It might be if the project had more than just Andy for IT support. Also, the project doesn't have anyone with GPU programming experience, or so I have been told.
ID: 69526
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69528 - Posted: 24 Aug 2023, 15:24:22 UTC - in response to Message 69525.  

Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
https://www.virustotal.com/gui/file/ace9dd93beae65218cfc7abdbaf6d22e58e0075ed9b1cbf3fd76c153cd1c0eeb

https://medium.com/@ares_61826/why-we-believe-the-dynex-cryptocurrency-is-a-scam-from-sec-sanctioned-daniel-mattes-561bbabbd89a

Dynex is a scam.
ID: 69528
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69529 - Posted: 24 Aug 2023, 19:20:36 UTC - in response to Message 69528.  

Dynex is a scam.
Certainly enough red flags that I wouldn't touch it with a barge pole. Not proven to be a scam but no way am I going near it.
ID: 69529
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69530 - Posted: 24 Aug 2023, 19:27:27 UTC - in response to Message 69529.  
Last modified: 24 Aug 2023, 19:30:24 UTC

Dynex is a scam.
Certainly enough red flags that I wouldn't touch it with a barge pole. Not proven to be a scam but no way am I going near it.
Compare it with Gridcoin, which only gets detected by 1 antivirus instead of 43.... https://www.virustotal.com/gui/file/efe7ed5d983f43eb35b3e73db617e12c6254a7d3354aabac53ed6cda24de4c0c

I told AVG staff to have a thorough look at it anyway. Sometimes they actually get back to you through the program and tell you the result.

However my second link seems to imply Daniel Mattes (assuming it's the same one) is a bad guy, but he doesn't look bad to me: https://en.wikipedia.org/wiki/Daniel_Mattes
ID: 69530
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,880
RAC: 15,098
Message 69531 - Posted: 24 Aug 2023, 20:43:18 UTC - in response to Message 69525.  

Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
Not something I have heard of - I have no idea what a neuromorphic network is; I presume it's just a fancy name for a neural network implemented on GPUs. Can't say it's of interest.

There is no existing code for GPUs in any of the models that CPDN use. The forecasting centres are however working on including GPUs in the model codes. Try searching for 'atmospheric ocean models using GPU' and it will give hits.
ID: 69531
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69532 - Posted: 24 Aug 2023, 21:16:32 UTC - in response to Message 69531.  

There is no existing code for GPUs in any of the models CPDN use. The forecasting centres are however working on including GPUs in the model codes. Try searching for 'atmospheric ocean models using GPU' and it will give hits.
I assume those aren't available for us to run somewhere? Everyone using those does them in-house?
ID: 69532
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69535 - Posted: 27 Aug 2023, 20:00:11 UTC

Is WAH done for now? Wasn't the second-to-last batch (the East Asia ones, which crashed a lot) faulty, and doesn't it need to be redone?
ID: 69535
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69536 - Posted: 28 Aug 2023, 7:53:38 UTC - in response to Message 69535.  

Is WAH done for now? Wasn't the second-to-last batch (the East Asia ones, which crashed a lot) faulty, and doesn't it need to be redone?
The fresh batch should arrive at some point but probably not before some more testing batches have appeared. The Korean scientist who is after the work will let us know when she is ready with the modified tasks.
ID: 69536
Conan
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 69537 - Posted: 28 Aug 2023, 9:07:14 UTC

Any new work for 64-bit coming along? I noticed a couple of new entries on the server status page:

OpenIFS 43r3
OpenIFS 43r3 Baroclinic Lifecycle
OpenIFS 43r3 Perturbed Surface
OpenIFS 43r3 Cubic Octahedral grid tco95 l91
OpenIFS 43r3 Linear grid tl255 l91


Thanks
Conan
ID: 69537
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,033,008
RAC: 19,749
Message 69538 - Posted: 28 Aug 2023, 9:30:48 UTC

Any new work for 64 bit coming along? I noticed a couple of new entries on the server status page


That just means the initial work on the model types has been started. I would guess we are talking months rather than weeks before anything appears.
ID: 69538
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,880
RAC: 15,098
Message 69539 - Posted: 28 Aug 2023, 11:43:52 UTC - in response to Message 69538.  

Any new work for 64 bit coming along? I noticed a couple of new entries on the server status page
That just means the initial work on the model types has been started. I would guess we are talking months rather than weeks before anything appears.
I hadn't realised Andy had put them on the main site. Those models are still in development and may never be used, though I hope they will. They are the large-memory OpenIFS variants.

Someone else asked about the Korean work. The spinup for the smaller region configuration has worked and may be sent out, but I'm working with Sarah to modify the WaH code so we can send out the original batch with fixes. CPDN will probably decide how to proceed next week when people come back from holiday.
---
CPDN Visiting Scientist
ID: 69539
klepel
Joined: 9 Oct 04
Posts: 82
Credit: 69,926,845
RAC: 6,801
Message 69541 - Posted: 30 Aug 2023, 18:54:50 UTC

Glenn,
I would like to ask whether you have any idea how many of the new OpenIFS work units will have to be processed, and how long they might last, based on your previous experience with the "test batch" (I can't remember, was it last year or this year?).

I'm asking because last time I bought 64 GB of RAM for climateprediction, which I do not really need for my daily tasks, and I got a little bit upset as those OpenIFS units lasted only a few months and then you made the comment that the big OpenIFS units might never make it to BOINC at all.

Now I am looking forward to the new batch of OpenIFS units announced for October, and I am wondering whether I should buy 128 GB of additional RAM for another computer in advance. However, I really would like to avoid the same experience as last time, where the investment is only worthwhile for a few months, especially as I could buy a graphics card for GPUGRID for a similar amount.

Thanks a lot for your comments
klepel
ID: 69541
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,454,880
RAC: 15,098
Message 69542 - Posted: 30 Aug 2023, 19:40:12 UTC - in response to Message 69541.  

I wouldn't suggest buying 128 GB of RAM unless you need it for something else. CPDN do not put out enough work in a year to justify spending on hardware (well, that's what I tell my wife anyway, and then I try to get to the delivery packages before she sees them... :)

I said 'may not be used' because it's not my decision to put them into production; that's down to the Director of CPDN once we have run the tests. At the moment I have no reason to think they won't be, but I don't know for sure. My idea is to use these new configs to repeat some experiments made a couple of years ago with the existing OIFS config, and write it up to advertise this new capability, hopefully attracting more scientists to use the platform.

I can tell you we have successfully run higher-resolution configurations of OpenIFS on the dev site that use 11 GB and 20 GB of RAM, and CPDN has okayed testing an even higher one that uses ~28 GB of RAM. I don't think we will go beyond that yet, as these models also produce more output, which might cause issues when uploading (plus more I/O to disk). Because of the memory size, CPDN will also limit the number of 'in progress' tasks a user can have, so the 64 GB you already have is fine.

I can't answer when the new batches will go out - the model is working fine, it's just all the work around it for CPDN & boinc that takes the time. It's been agreed that credit will be increased for these hi-res configs to account for the increased memory used (which I think is only fair). But we still have a number of other steps to take first.

Hope that answers your questions.

ID: 69542
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 69543 - Posted: 30 Aug 2023, 21:33:54 UTC - in response to Message 69542.  
Last modified: 30 Aug 2023, 21:35:57 UTC

I can tell you we have successfully run higher-resolution configurations of OpenIFS on the dev site that use 11 GB and 20 GB of RAM, and CPDN has okayed testing an even higher one that uses ~28 GB of RAM. I don't think we will go beyond that yet, as these models also produce more output, which might cause issues when uploading (plus more I/O to disk). Because of the memory size, CPDN will also limit the number of 'in progress' tasks a user can have, so the 64 GB you already have is fine.


My main machine runs Red Hat Enterprise Linux release 8.8 (Ootpa) and is like this:

Computer 1511241

CPU type 	GenuineIntel
Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors 	16
Operating System 	Linux Red Hat Enterprise Linux
Red Hat Enterprise Linux 8.8 (Ootpa) [4.18.0-477.15.1.el8_8.x86_64|libc 2.28]
BOINC version 	7.20.2
Memory 	125.08 GB
Cache 	16896 KB
Swap space 	15.62 GB
Total disk space 	488.04 GB
Free Disk Space 	480.57 GB
Measured floating point speed 	6.02 billion ops/sec
Measured integer speed 	25.36 billion ops/sec
Average upload rate 	139.34 KB/sec
Average download rate 	22391.7 KB/sec


I normally have it run 12 BOINC tasks at a time. My Internet connection is Verizon FiOS, guaranteed to run at 75 Megabits/second. It actually gets response like this. CPDN reports slower upload speeds than download speeds. I do not know why the speeds should be so different. I do not believe the download speeds are as fast as CPDN says. Those speeds could be true if they were in Kilobits per second, but KBytes per second is not really possible.
When I was getting OIFS jobs, the trickles went up quite fast as long as the upload servers were running. I guess I could run one 28 GByte OIFS task at a time as well as some smaller tasks at the same time.
Timestamp          Download    Upload      Latency  Jitter  Quality Score  Test Server
8/30/2023 17:9:28  76.65 Mbps  89.02 Mbps  4 ms     2 ms    Excellent      speedtest1.nyc1.nitelusa.net.prod.hosts.ooklaserver.net
6/7/2023 20:3:31   78.13 Mbps  63.66 Mbps  6 ms     1 ms    Excellent      ny2.speedtest.gslnetworks.com.prod.hosts.ooklaserver.net
5/5/2023 11:23:28  76.26 Mbps  89.16 Mbps  6 ms     1 ms    Excellent      speedtest.nyc.rr.com
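The unit question above can be checked with a little arithmetic. This is a minimal sketch that assumes 1 KB = 1000 bytes and 1 Mbit = 1,000,000 bits; whether the host page actually uses 1000 or 1024 bytes per KB is not stated here, and the function names are made up for the example.

```python
# Assumption: decimal units (1 KB = 1000 bytes). Pass bytes_per_kb=1024 to try the other convention.

def kbytes_per_s_to_mbit_per_s(kbytes_per_s, bytes_per_kb=1000):
    """Convert a rate in kilobytes per second to megabits per second."""
    return kbytes_per_s * bytes_per_kb * 8 / 1_000_000


def mbit_per_s_to_kbytes_per_s(mbit_per_s, bytes_per_kb=1000):
    """Convert a rate in megabits per second to kilobytes per second."""
    return mbit_per_s * 1_000_000 / 8 / bytes_per_kb


line_rate_mbit = 75.0           # advertised connection speed from the post
reported_download_kb = 22391.7  # "Average download rate" on the host page
reported_upload_kb = 139.34     # "Average upload rate" on the host page

print("Line rate in KB/s:", mbit_per_s_to_kbytes_per_s(line_rate_mbit))
print("Reported download in Mbit/s:", kbytes_per_s_to_mbit_per_s(reported_download_kb))
print("Reported upload in Mbit/s:", kbytes_per_s_to_mbit_per_s(reported_upload_kb))
```

Under these assumptions a 75 Mbit/s line tops out at about 9375 KB/s, so a reported 22391.7 KB/s (roughly 179 Mbit/s) is more than twice the line rate. That supports the suspicion that the download figure is not a sustained rate in kilobytes per second.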

ID: 69543
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69544 - Posted: 31 Aug 2023, 3:46:54 UTC - in response to Message 69543.  
Last modified: 31 Aug 2023, 3:47:31 UTC

Jean-David and I could both run four 28 GB tasks at once. It would be a pity to limit the number. Can the server alter the limit based on the host's RAM? Also, I take it this limit is per host? I have 10 machines...
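For what a RAM-based limit might look like, here is a rough sketch. It is purely hypothetical: max_concurrent_tasks and the reserved_gb headroom are invented for illustration, and this is not how the CPDN or BOINC scheduler is known to decide anything.

```python
import math

# Hypothetical helper, not part of BOINC or CPDN.

def max_concurrent_tasks(host_ram_gb, task_ram_gb, reserved_gb=4.0):
    """Number of tasks of a given memory size that fit in a host's RAM.

    reserved_gb is an assumed allowance for the OS and other programs.
    """
    if task_ram_gb <= 0:
        raise ValueError("task_ram_gb must be positive")
    usable_gb = host_ram_gb - reserved_gb
    return max(0, math.floor(usable_gb / task_ram_gb))


# Figures quoted in the thread: a 125.08 GB host, a 64 GB host, ~28 GB per task.
print(max_concurrent_tasks(125.08, 28.0))
print(max_concurrent_tasks(64.0, 28.0))
```

With a 4 GB allowance this gives four tasks on the 125.08 GB machine and two on a 64 GB machine. Upload volume and disk I/O, which Glenn mentions as the other concern, are not modelled here.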
ID: 69544

©2024 cpdn.org