climateprediction.net (CPDN) home page
Thread 'World Community Grid mostly down for 2 months while transitioning'

Message boards : Cafe CPDN : World Community Grid mostly down for 2 months while transitioning

SolarSyonyk

Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 68568 - Posted: 9 Mar 2023, 21:54:34 UTC

sigh

WCG has been hard-down again, for some hard drive failure issue or another.

https://twitter.com/WCGrid

Before that, they were dealing with severe shortages of WUs, because the scripts tried to predict how much work would be done, based on past performance. Except, if past performance was low due to lack of WUs to send, it would predict low future performance and not generate many WUs.
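
The feedback loop described above can be illustrated with a toy simulation. This is only a sketch under assumed numbers, not WCG's actual scripts: `capacity`, `initial_generated`, and the `headroom` factor are hypothetical, and the padded forecast is just one possible way out of the loop, not something WCG is known to do.

```python
def simulate_wu_generation(capacity, initial_generated, days):
    """Size each day's batch from the previous day's completions.

    capacity: work units volunteers could finish per day (hypothetical figure).
    initial_generated: work units released on day 0, e.g. right after an outage.
    """
    generated = initial_generated
    history = []
    for _ in range(days):
        # Volunteers can only complete what was actually sent out.
        completed = min(generated, capacity)
        history.append(completed)
        # Naive forecast: tomorrow's demand equals today's completions.
        generated = completed
    return history


def simulate_with_headroom(capacity, initial_generated, days, headroom=1.5):
    """Same loop, but the forecast is padded so supply can grow back toward capacity."""
    generated = initial_generated
    history = []
    for _ in range(days):
        completed = min(generated, capacity)
        history.append(completed)
        # Hypothetical fix: generate more than was completed yesterday.
        generated = int(completed * headroom)
    return history
```

With a capacity of 10,000 WUs/day and only 500 released on day 0, the naive forecast stays at 500 for as long as it runs, while the padded version climbs back to full capacity within ten simulated days.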

I'm doing some video transcode on my boxes right now, but once that's done, I suppose they go into shut down archive mode and I heat on resistors.
ID: 68568
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68569 - Posted: 9 Mar 2023, 22:11:32 UTC - in response to Message 68568.  

> sigh
>
> WCG has been hard-down again, for some hard drive failure issue or another.
>
> https://twitter.com/WCGrid
>
> Before that, they were dealing with severe shortages of WUs, because the scripts tried to predict how much work would be done, based on past performance. Except, if past performance was low due to lack of WUs to send, it would predict low future performance and not generate many WUs.
>
> I'm doing some video transcode on my boxes right now, but once that's done, I suppose they go into shut down archive mode and I heat on resistors.
There are so many other useful things you can run. Assuming you mean CPUs, Rosetta just got 9 million tasks in. Sidock's got a load of longer tasks.
ID: 68569
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 68570 - Posted: 9 Mar 2023, 22:23:04 UTC - in response to Message 68569.  
Last modified: 9 Mar 2023, 22:23:46 UTC

> There are so many other useful things you can run. Assuming you mean CPUs, Rosetta just got 9 million tasks in. Sidock's got a load of longer tasks.


Yeah, all I have are CPUs. I just don't like running "math for the sake of doing compute" sort of projects like prime finders or such.

Where are you seeing 9M Rosetta tasks? http://boinc.bakerlab.org/server_status.php argues a lot lower, though I suppose I can toss their stuff in my mix for when everything else is out.
ID: 68570
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68571 - Posted: 9 Mar 2023, 23:47:08 UTC - in response to Message 68570.  
Last modified: 9 Mar 2023, 23:47:55 UTC

> Yeah, all I have are CPUs. I just don't like running "math for the sake of doing compute" sort of projects like prime finders or such.
Agreed. But there's loads of biology and physics to do. If you can run VB stuff, go on LHC and discover subatomic particles. They're why we have MRI scanners.

> Where are you seeing 9M Rosetta tasks? http://boinc.bakerlab.org/server_status.php argues a lot lower, though I suppose I can toss their stuff in my mix for when everything else is out.
Unlike most projects, they're honest and admit the real full queue. It's on the home page. Server status just shows the RAM buffer. No idea why projects choose to hide a key piece of information from their volunteers.
ID: 68571
AndreyOR
Joined: 12 Apr 21
Posts: 318
Credit: 14,987,679
RAC: 9,968
Message 68572 - Posted: 10 Mar 2023, 5:26:24 UTC

> sigh
>
> WCG has been hard-down again, for some hard drive failure issue or another.

To say that WCG has struggled since leaving IBM would be an understatement.

Some projects that have plenty of CPU work that I find interesting enough to run, and are not just pure math projects, are: Rosetta, Einstein, Universe, MilkyWay, LHC.

Yes, as mentioned above, Rosetta is a bit unique in that on the home page they have a stats section that lists, among other things, total queued jobs. The traditional server status page just shows the buffer from which clients get tasks; they seem to keep it at around 30k. Einstein also has that kind of info on their server status page, broken down by sub-project, although that section sometimes disappears.
ID: 68572
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 68574 - Posted: 10 Mar 2023, 6:36:15 UTC - in response to Message 68569.  

> Rosetta just got 9 million tasks in.


Rosetta has only this many at the moment.
Tasks in progress	329539
Tasks ready to send	 29905

ID: 68574
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68575 - Posted: 10 Mar 2023, 6:48:06 UTC - in response to Message 68574.  
Last modified: 10 Mar 2023, 6:49:11 UTC

>> Rosetta just got 9 million tasks in.
> Rosetta has only this many at the moment.
> Tasks in progress	329539
> Tasks ready to send	 29905
Incorrect, on the main page:



The server status page is a poor version of this. No idea why they keep it there at all. No idea why some projects don't have an honest queue displayed. What are they hiding?
ID: 68575
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 68577 - Posted: 10 Mar 2023, 10:29:19 UTC - in response to Message 68575.  

> The server status page is a poor version of this. No idea why they keep it there at all. No idea why some projects don't have an honest queue displayed. What are they hiding?
Can't comment on other projects, especially those I have had little involvement with, but CPDN only updates the server status page every couple of hours, which means small batches of work can be gone before they show up on the page.
ID: 68577
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68578 - Posted: 10 Mar 2023, 11:25:27 UTC - in response to Message 68577.  

>> The server status page is a poor version of this. No idea why they keep it there at all. No idea why some projects don't have an honest queue displayed. What are they hiding?
> Can't comment on other projects, especially those I have had little involvement with, but CPDN only updates the server status page every couple of hours, which means small batches of work can be gone before they show up on the page.
This is common - the updating frequency is low on other projects too. Rosetta is odd, since the updating time of server status and the updating time of the main page are different, so one is usually ahead of the other, but not always the same one.
ID: 68578
alanb1951
Joined: 31 Aug 04
Posts: 37
Credit: 9,581,380
RAC: 3,853
Message 68583 - Posted: 10 Mar 2023, 19:03:12 UTC - in response to Message 68578.  

>>> The server status page is a poor version of this. No idea why they keep it there at all. No idea why some projects don't have an honest queue displayed. What are they hiding?
>> Can't comment on other projects, especially those I have had little involvement with, but CPDN only updates the server status page every couple of hours, which means small batches of work can be gone before they show up on the page.
> This is common - the updating frequency is low on other projects too. Rosetta is odd, since the updating time of server status and the updating time of the main page are different, so one is usually ahead of the other, but not always the same one.

The standard BOINC server status page can only see work that has been prepared for sending out, as it can only see the BOINC database. That's probably the most honest indication of available work for most projects, since the total amount of work may be indeterminate for one reason or another! (And you can't hide what you don't know...)

There are some projects out there that have an exactly known number of work units (if all goes well) -- individual batches here at CPDN may be examples, ARP1 sub-project at WCG is another [1], and I'm sure there are many more. However, more common seem to be projects such as MilkyWay (both sub-projects) or WCG projects such as MCM1, OPN1/OPNG and SCC1 [2] that run until some target is hit [3], whilst there may also be genuinely open-ended ones (lots of "mathematical" projects?)

I'd actually be interested to know where Rosetta gets that very large number from (and its likely accuracy) -- it's almost certainly not coming from anything in the BOINC database itself, otherwise it could possibly update at the same time(s) as the server status page :-)

Cheers - Al.

P.S. I wonder if knowing how much work is available long-term is only of major interest to badge-hunters? The clamour at WCG when a project was known to be nearing its end used to be quite something to behold... :-)

[1] The only argument about the number of work units was whether the year of data (at two days per "generation") would have 364 days or 366 -- a difference of 35609 work units!

[2] In the case of SCC1 there has been such a long hiatus that many think it won't return, but childhood cancers are a key research area so I mention it anyway...

[3] Those projects process data for a given "target" for as long as the scientists deem appropriate (e.g. MilkyWay runs a set of streams "until converged"), then they move on to another target.
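
The arithmetic in the first footnote can be checked directly. This is a minimal sketch, assuming (as the footnote implies) that the two extra days amount to exactly one more generation, so that one generation corresponds to 35,609 work units. The constant and function names are illustrative only.

```python
DAYS_PER_GENERATION = 2
# Inferred from the footnote: 366 - 364 = 2 days = one generation = 35609 work units.
WORK_UNITS_PER_GENERATION = 35_609


def total_work_units(days_of_data):
    """Total work units for a run covering the given number of days of data."""
    generations = days_of_data // DAYS_PER_GENERATION
    return generations * WORK_UNITS_PER_GENERATION


short_year = total_work_units(364)  # 182 generations
long_year = total_work_units(366)   # 183 generations
difference = long_year - short_year
```

Under that assumption the whole run is roughly 6.5 million work units either way, and the disputed amount is a single generation.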
ID: 68583
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 68584 - Posted: 10 Mar 2023, 20:05:00 UTC - in response to Message 68571.  

> If you can run VB stuff, go on LHC and discover subatomic particles. They're why we have MRI scanners.


I'll have to see, I don't have a huge amount of bandwidth and most of my compute machines are RAM starved if it's running a VM per core - I built for around 1.5GB/core, and that's been a limit, though I'm hitting RAM bandwidth/latency/cache limits anyway before I run out of RAM, based on overall system throughput measurements.

That Rosetta seems to have useful work is good, I'll let some of that run. I think they got run out of WUs some while back during Covid, didn't they?
ID: 68584
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68586 - Posted: 11 Mar 2023, 2:50:01 UTC - in response to Message 68583.  

> I'd actually be interested to know where Rosetta gets that very large number from (and its likely accuracy) -- it's almost certainly not coming from anything in the BOINC database itself, otherwise it could possibly update at the same time(s) as the server status page :-)
The scientists presumably create a huge batch of several million targets and put them in a folder somewhere on the server. The server keeps track of how many are left. The actual BOINC server presumably can't handle that many, so they're spoon-fed to the useless POS software a few thousand at a time.
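
The arrangement guessed at above (a large backlog held outside BOINC, topped up into a small ready-to-send buffer) can be sketched as follows. This is purely hypothetical: it is not BOINC's or Rosetta's actual code, `refill_buffer` is an invented name, and the 30,000 target is simply the buffer size reported earlier in the thread.

```python
def refill_buffer(backlog, ready_to_send, target=30_000):
    """Move work from an external backlog into the ready-to-send buffer.

    Returns the new (backlog, ready_to_send) pair. The buffer is never
    filled beyond `target`, and never with more work than the backlog holds.
    """
    shortfall = max(0, target - ready_to_send)
    moved = min(shortfall, backlog)
    return backlog - moved, ready_to_send + moved


# Figures from the thread: about 9 million queued, 29905 ready to send.
backlog, ready = refill_buffer(9_000_000, 29_905)
```

In a model like this the server status page would only ever report the small buffer, while the total queue would have to be tracked and published separately.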

> P.S. I wonder if knowing how much work is available long-term is only of major interest to badge-hunters? The clamour at WCG when a project was known to be nearing its end used to be quite something to behold... :-)
I don't hunt badges, I'm doing it for the good of mankind. But it's nice to know how much work is available at a project so I can choose which other ones to run.
ID: 68586
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68587 - Posted: 11 Mar 2023, 2:56:40 UTC - in response to Message 68584.  

> I'll have to see, I don't have a huge amount of bandwidth
LHC's CMS uses a lot of bandwidth. I have 122 CPU cores, but only 7Mbit upload. To run all that on LHC's CMS, I'd need three times the bandwidth, about 20Mbit. But the other subprojects there are not bandwidth hungry.
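
A back-of-envelope version of the estimate above, taking the quoted figures at face value (122 cores needing roughly 20 Mbit of upload for CMS). It assumes upload demand scales linearly with the number of cores, which is an assumption rather than a measured property of CMS, and the function names are illustrative.

```python
def upload_per_core_mbit(cores=122, required_mbit=20):
    """Approximate upload bandwidth each CMS core needs, in Mbit."""
    return required_mbit / cores


def cms_cores_supported(available_mbit, cores=122, required_mbit=20):
    """How many cores a given uplink could feed, assuming linear scaling."""
    # Integer arithmetic avoids floating-point rounding on the division.
    return available_mbit * cores // required_mbit


supported = cms_cores_supported(7)
```

On those assumptions a 7 Mbit uplink covers about 42 of the 122 cores, which matches the "three times the bandwidth" estimate.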

> and most of my compute machines are RAM starved if it's running a VM per core - I built for around 1.5GB/core, and that's been a limit, though I'm hitting RAM bandwidth/latency/cache limits anyway before I run out of RAM, based on overall system throughput measurements.
LHC's ATLAS tasks can run on up to 8 cores. I think the formula is something like 2GB per task + 1GB per core. So you can run 10GB tasks on 8 cores, which should be within your amount.
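
The rule of thumb above can be checked against a 1.5GB/core budget. This is a sketch only: the "2GB per task + 1GB per core" formula is the approximate one quoted in the post ("something like"), not an official ATLAS figure, and the function names are illustrative.

```python
def atlas_task_ram_gb(cores, base_gb=2.0, per_core_gb=1.0):
    """Approximate RAM for one ATLAS task, per the rule of thumb quoted above."""
    return base_gb + per_core_gb * cores


def fits_budget(cores, ram_per_core_gb=1.5):
    """True if one task on `cores` cores fits within those cores' share of RAM."""
    return atlas_task_ram_gb(cores) <= cores * ram_per_core_gb


eight_core_task = atlas_task_ram_gb(8)   # 10GB, as stated in the post
eight_core_budget = 8 * 1.5              # 12GB available at 1.5GB/core
```

With these numbers the fixed 2GB overhead is only amortised at 4 cores or more per task: 4-core and 8-core tasks fit a 1.5GB/core machine, while 1-core and 2-core tasks do not.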

> That Rosetta seems to have useful work is good, I'll let some of that run. I think they got run out of WUs some while back during Covid, didn't they?
They had an enormous batch of python apps running, which not only required VB but also a modern CPU with AVX, so not many people were able to run them. The trouble was they tried to run on computers that couldn't run them, and there was a mess. It took a long conversation between me and several other users to determine which instruction was required in the CPU to run them without failing. The other problem was no admins in the forum to listen. No way to contact anyone at all. There is now one scientist in there though. Steven Rettie, I think. And their current tasks are not VB, although they're still working on fixing the bugs in the VB one so it may be back in full force soon.
ID: 68587
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 68628 - Posted: 26 Mar 2023, 22:28:11 UTC

Well, those 9M Rosetta tasks have been disappearing at a rate of around 500k/day. Down to 1.5M left... not much more work to be found there, unless they've got new stuff coming. I suppose I'll slosh compute over to F@H and help run them out of tasks afterwards... or get off my rear and get some more video transcode work lined up for my boxes.
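
The drain rate can be sanity-checked from the figures in this thread. A minimal sketch, assuming the queue really stood at about 9M when it was first mentioned on 9 Mar 2023 and that it drains at a steady rate; the function names are illustrative.

```python
from datetime import date


def observed_rate(start_count, end_count, start_day, end_day):
    """Average tasks consumed per day between two observations."""
    return (start_count - end_count) / (end_day - start_day).days


def days_until_empty(remaining, rate_per_day):
    """Days until the queue is exhausted at a constant drain rate."""
    return remaining / rate_per_day


# 9M tasks reported on 9 Mar 2023, 1.5M left on 26 Mar 2023.
average = observed_rate(9_000_000, 1_500_000, date(2023, 3, 9), date(2023, 3, 26))
days_left = days_until_empty(1_500_000, 500_000)
```

The 17-day average works out to roughly 441k tasks/day, in line with "around 500k/day", and at 500k/day the remaining 1.5M tasks would last about 3 days.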

I've got a bunch of the Google Compute Engine C3 beta machines (they're free while in beta!) chewing on R@H and F@H too right now. ;) Something like 44 high end Intel CPU cores...
ID: 68628
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68629 - Posted: 26 Mar 2023, 23:47:10 UTC - in response to Message 68628.  

Can't see "beta", but I have found "free trial" with $400 credit. I've signed up and will attempt to use it.
ID: 68629
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68630 - Posted: 27 Mar 2023, 0:16:33 UTC - in response to Message 68629.  
Last modified: 27 Mar 2023, 0:28:24 UTC

I have no idea how to make this work. I tried to set up an instance, and was told after I'd filled everything in, I couldn't use Windows in the free trial. Fine, I made a new instance with Ubuntu, then it tells me (right at the end) the free trial is limited to 32 CPU cores. Fine, start again! 32 cores on Ubuntu. No, now I can only use 8?!

So I ask on the Google Cloud Community, and it won't accept any username I choose. Keeps saying invalid characters even though I just use some lower case letters.

Anyone any idea how to make it work at all?
ID: 68630
SolarSyonyk
Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 68632 - Posted: 31 Mar 2023, 15:43:08 UTC

I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely.

Rosetta@Home is out of tasks, though there are still some retries going around.

I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't find pulsars/neutron stars/etc. particularly interesting projects, and certainly not of much "Earthly importance."
ID: 68632
Bryn Mawr
Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 68633 - Posted: 31 Mar 2023, 22:28:26 UTC - in response to Message 68632.  

> I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely.
>
> Rosetta@Home is out of tasks, though there are still some retries going around.
>
> I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't find pulsars/neutron stars/etc. particularly interesting projects, and certainly not of much "Earthly importance."


The next project to run out of work will be TN-Grid in about 10 days' time :-(

That will leave 4 of my 5 projects without work.
ID: 68633
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68647 - Posted: 13 Apr 2023, 9:37:25 UTC - in response to Message 68632.  
Last modified: 13 Apr 2023, 9:37:59 UTC

> I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely.
>
> Rosetta@Home is out of tasks, though there are still some retries going around.
>
> I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't find pulsars/neutron stars/etc. particularly interesting projects, and certainly not of much "Earthly importance."
You could try Sidock, Denis, QuChempedia, World Community Grid, or Asteroids in case one hits us!
ID: 68647
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 68648 - Posted: 13 Apr 2023, 9:38:58 UTC

It seems subscriptions aren't being emailed out. Who's in charge of the email server?
ID: 68648