Thread 'World Community Grid mostly down for 2 months while transitioning'

Author	Message
SolarSyonyk Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463	Message 68568 - Posted: 9 Mar 2023, 21:54:34 UTC sigh WCG has been hard-down again, for some hard drive failure issue or another. https://twitter.com/WCGrid Before that, they were dealing with severe shortages of WUs, because the scripts tried to predict how much work would be done, based on past performance. Except, if past performance was low due to lack of WUs to send, it would predict low future performance and not generate many WUs. I'm doing some video transcode on my boxes right now, but once that's done, I suppose they go into shut down archive mode and I heat on resistors. ID: 68568 · Reply Quote
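
The starvation loop described above can be illustrated with a toy model. This is only a sketch: the actual WCG scripts are not shown in this thread, and the function name, parameters, and the "tomorrow's supply equals today's completions" rule are all assumptions made for illustration.

```python
def simulate_generation(capacity, start_supply, days, outage_day=None, outage_supply=0):
    """Return the number of work units (WUs) generated on each day.

    Assumed model (hypothetical, not WCG's real code): the generator
    forecasts tomorrow's demand as today's completed work, and volunteers
    can never complete more WUs than were supplied.
    """
    supply = start_supply
    history = []
    for day in range(days):
        if day == outage_day:
            supply = outage_supply  # e.g. a storage failure cuts the supply
        history.append(supply)
        completed = min(supply, capacity)  # completions are capped by supply
        supply = completed  # naive forecast: past performance = future demand
    return history


# Volunteers could do 1000 WUs/day, but one bad day drops the supply to 100.
print(simulate_generation(1000, 1000, 6, outage_day=2, outage_supply=100))
```

In this model the supply never climbs back after the outage, because low completions caused by a lack of work are read as low demand. A forecast based on volunteer capacity (for example, work requests that went unfilled) rather than on completions would not have this failure mode.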

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68569 - Posted: 9 Mar 2023, 22:11:32 UTC - in response to Message 68568. sigh WCG has been hard-down again, for some hard drive failure issue or another. https://twitter.com/WCGrid Before that, they were dealing with severe shortages of WUs, because the scripts tried to predict how much work would be done, based on past performance. Except, if past performance was low due to lack of WUs to send, it would predict low future performance and not generate many WUs. I'm doing some video transcode on my boxes right now, but once that's done, I suppose they go into shut down archive mode and I heat on resistors. There are so many other useful things you can run. Assuming you mean CPUs, Rosetta just got 9 million tasks in. Sidock's got a load of longer tasks. ID: 68569 · Reply Quote

SolarSyonyk Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463	Message 68570 - Posted: 9 Mar 2023, 22:23:04 UTC - in response to Message 68569. Last modified: 9 Mar 2023, 22:23:46 UTC There are so many other useful things you can run. Assuming you mean CPUs, Rosetta just got 9 million tasks in. Sidock's got a load of longer tasks. Yeah, all I have are CPUs. I just don't like running "math for the sake of doing compute" sort of projects like prime finders or such. Where are you seeing 9M Rosetta tasks? http://boinc.bakerlab.org/server_status.php argues a lot lower, though I suppose I can toss their stuff in my mix for when everything else is out. ID: 68570 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68571 - Posted: 9 Mar 2023, 23:47:08 UTC - in response to Message 68570. Last modified: 9 Mar 2023, 23:47:55 UTC Yeah, all I have are CPUs. I just don't like running "math for the sake of doing compute" sort of projects like prime finders or such. Agreed. But there's loads of biology and physics to do. If you can run VB stuff, go on LHC and discover subatomic particles. They're why we have MRI scanners. Where are you seeing 9M Rosetta tasks? http://boinc.bakerlab.org/server_status.php argues a lot lower, though I suppose I can toss their stuff in my mix for when everything else is out. Unlike most projects, they're honest and admit the real full queue. It's on the home page. Server status just shows the RAM buffer. No idea why projects choose to hide a key piece of information from their volunteers. ID: 68571 · Reply Quote

AndreyOR Send message Joined: 12 Apr 21 Posts: 317 Credit: 14,925,468 RAC: 12,903	Message 68572 - Posted: 10 Mar 2023, 5:26:24 UTC sigh WCG has been hard-down again, for some hard drive failure issue or another. To say that WCG has struggled since leaving IBM would be an understatement. Some projects that have plenty of CPU work that I find interesting enough to run, and are not just pure math projects, are: Rosetta, Einstein, Universe, MilkyWay, LHC. Yes, as mentioned above, Rosetta is a bit unique in that its home page has a stats section that lists, among other things, total queued jobs. The traditional server status page just shows the buffer from which clients get tasks; they seem to keep it at around 30k. Einstein also has that kind of info on their server status page, broken down by sub-project, although that section sometimes disappears. ID: 68572 · Reply Quote

Jean-David Beyer Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154	Message 68574 - Posted: 10 Mar 2023, 6:36:15 UTC - in response to Message 68569. Rosetta just got 9 million tasks in. Rosetta has only this many at the moment: tasks in progress 329539, tasks ready to send 29905. ID: 68574 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68575 - Posted: 10 Mar 2023, 6:48:06 UTC - in response to Message 68574. Last modified: 10 Mar 2023, 6:49:11 UTC Rosetta just got 9 million tasks in. Rosetta has only this many at the moment. Tasks in progress 329539 Tasks ready to send 29905 Incorrect, see the main page. The server status page is a poor version of this. No idea why they keep it there at all. No idea why some projects don't have an honest queue displayed. What are they hiding? ID: 68575 · Reply Quote

Dave Jackson Volunteer moderator Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944	Message 68577 - Posted: 10 Mar 2023, 10:29:19 UTC - in response to Message 68575. The server status page is a poor version of this. No idea why they keep it there at all. no idea why some projects don't have an honest queue displayed. What are they hiding? Can't comment on other projects, especially those I have had little involvement with but CPDN only updates the server status page every couple of hours which means small batches of work can be gone before they show up on the page. ID: 68577 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68578 - Posted: 10 Mar 2023, 11:25:27 UTC - in response to Message 68577. The server status page is a poor version of this. No idea why they keep it there at all. no idea why some projects don't have an honest queue displayed. What are they hiding? Can't comment on other projects, especially those I have had little involvement with but CPDN only updates the server status page every couple of hours which means small batches of work can be gone before they show up on the page. This is common - the updating frequency is low on other projects too. Rosetta is odd, since the updating time of server status and the updating time of the main page are different, so one is usually ahead of the other, but not always the same one. ID: 68578 · Reply Quote

alanb1951 Send message Joined: 31 Aug 04 Posts: 37 Credit: 9,581,380 RAC: 3,853	Message 68583 - Posted: 10 Mar 2023, 19:03:12 UTC - in response to Message 68578. The server status page is a poor version of this. No idea why they keep it there at all. no idea why some projects don't have an honest queue displayed. What are they hiding? Can't comment on other projects, especially those I have had little involvement with but CPDN only updates the server status page every couple of hours which means small batches of work can be gone before they show up on the page. This is common - the updating frequency is low on other projects too. Rosetta is odd, since the updating time of server status and the updating time of the main page are different, so one is usually ahead of the other, but not always the same one. The standard BOINC server status page can only see work that has been prepared for sending out, as it can only see the BOINC database. That's probably the most honest indication of available work for most projects, since the total amount of work may be indeterminate for one reason or another! (And you can't hide what you don't know...) There are some projects out there that have an exactly known number of work units (if all goes well) -- individual batches here at CPDN may be examples, ARP1 sub-project at WCG is another¹, and I'm sure there are many more. However, more common seem to be projects such as MilkyWay (both sub-projects) or WCG projects such as MCM1, OPN1/OPNG and SCC1² that run until some target is hit³, whilst there may also be genuinely open-ended ones (lots of "mathematical" projects?) I'd actually be interested to know where Rosetta gets that very large number from (and its likely accuracy) -- it's almost certainly not coming from anything in the BOINC database itself, otherwise it could possibly update at the same time(s) as the server status page :-) Cheers - Al. P.S. 
I wonder if knowing how much work is available long-term is only of major interest to badge-hunters? The clamour at WCG when a project was known to be nearing its end used to be quite something to behold... :-) ¹ The only argument about the number of work units was whether the year of data (at two days per "generation") would have 364 days or 366 -- a difference of 35609 work units! ² In the case of SCC1 there has been such a long hiatus that many think it won't return, but childhood cancers are a key research area so I mention it anyway... ³ Those projects process data for a given "target" for as long as the scientists deem appropriate (e.g. MilkyWay runs a set of streams "until converged"), then they move on to another target. ID: 68583 · Reply Quote

SolarSyonyk Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463	Message 68584 - Posted: 10 Mar 2023, 20:05:00 UTC - in response to Message 68571. If you can run VB stuff, go on LHC and discover subatomic particles. They're why we have MRI scanners. I'll have to see, I don't have a huge amount of bandwidth and most of my compute machines are RAM starved if it's running a VM per core - I built for around 1.5GB/core, and that's been a limit, though I'm hitting RAM bandwidth/latency/cache limits anyway before I run out of RAM, based on overall system throughput measurements. That Rosetta seems to have useful work is good, I'll let some of that run. I think they got run out of WUs some while back during Covid, didn't they? ID: 68584 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68586 - Posted: 11 Mar 2023, 2:50:01 UTC - in response to Message 68583. I'd actually be interested to know where Rosetta gets that very large number from (and its likely accuracy) -- it's almost certainly not coming from anything in the BOINC database itself, otherwise it could possibly update at the same time(s) as the server status page :-) The scientists presumably create a huge batch of several million targets and put them in a folder somewhere on the server. The server keeps track of how many are left. The actual Boinc server presumably can't handle that many, so they're spoon fed to the useless POS software a few thousand at a time. P.S. I wonder if knowing how much work is available long-term is only of major interest to badge-hunters? The clamour at WCG when a project was known to be nearing its end used to be quite something to behold... :-) I don't hunt badges, I'm doing it for the good of mankind. But it's nice to know how much work is available at a project so I can choose which other ones to run. ID: 68586 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68587 - Posted: 11 Mar 2023, 2:56:40 UTC - in response to Message 68584. I'll have to see, I don't have a huge amount of bandwidth LHC's CMS uses a lot of bandwidth. I have 122 CPU cores, but only 7Mbit upload. To run all that on LHC's CMS, I'd need three times the bandwidth, about 20Mbit. But the other subprojects there are not bandwidth hungry. and most of my compute machines are RAM starved if it's running a VM per core - I built for around 1.5GB/core, and that's been a limit, though I'm hitting RAM bandwidth/latency/cache limits anyway before I run out of RAM, based on overall system throughput measurements. LHC's ATLAS tasks can run on up to 8 cores. I think the formula is something like 2GB per task + 1GB per core. So you can run 10GB tasks on 8 cores, which should be within your amount. That Rosetta seems to have useful work is good, I'll let some of that run. I think they got run out of WUs some while back during Covid, didn't they? They had an enormous batch of Python apps running, which not only required VB but also a modern CPU with AVX, so not many people were able to run them. The trouble was they tried to run on computers that couldn't run them, and there was a mess. It took a long conversation between me and several other users to determine which instruction was required in the CPU to run them without failing. The other problem was no admins in the forum to listen. No way to contact anyone at all. There is now one scientist in there though. Steven Rettie, I think. And their current tasks are not VB, although they're still working on fixing the bugs in the VB one, so it may be back in full force soon. ID: 68587 · Reply Quote
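
The ATLAS memory rule of thumb quoted above ("something like 2GB per task + 1GB per core") can be checked against a 1.5GB/core build with a few lines of arithmetic. This is a sketch only: the formula is the poster's recollection rather than an official LHC@home figure, and the function names are made up for illustration.

```python
def atlas_task_ram_gb(cores, base_gb=2.0, per_core_gb=1.0):
    """Estimated RAM for one multi-core task under the assumed rule of thumb."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return base_gb + per_core_gb * cores


def ram_per_core_gb(cores):
    """RAM the task needs, averaged over the cores it occupies."""
    return atlas_task_ram_gb(cores) / cores


# Compare a few task widths against a machine built with 1.5 GB per core.
for cores in (1, 2, 4, 8):
    print(cores, atlas_task_ram_gb(cores), round(ram_per_core_gb(cores), 2))
```

Under this assumed formula an 8-core task needs 10GB, or 1.25GB per core, which fits a 1.5GB/core machine. A 4-core task sits exactly at 1.5GB per core, and narrower tasks exceed that budget because the fixed 2GB is spread over fewer cores.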

SolarSyonyk Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463	Message 68628 - Posted: 26 Mar 2023, 22:28:11 UTC Well, those 9M Rosetta tasks have been disappearing at a rate of around 500k/day. Down to 1.5M left... not much more work to be found there, unless they've got new stuff coming. I suppose I'll slosh compute over to F@H and help run them out of tasks afterwards... or get off my rear and get some more video transcode work lined up for my boxes. I've got a bunch of the Google Compute Engine C3 beta machines (they're free while in beta!) chewing on R@H and F@H too right now. ;) Something like 44 high end Intel CPU cores... ID: 68628 · Reply Quote
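
The queue estimate above is simple enough to work out directly. The sketch below uses the figures quoted in the thread (9M tasks, roughly 500k/day, 1.5M left); the 17-day span is an assumption taken from the post dates (9 Mar to 26 Mar), and the function names are hypothetical.

```python
def days_until_empty(tasks_left, tasks_per_day):
    """Days until the queue drains, assuming a constant rate and no new work."""
    if tasks_per_day <= 0:
        raise ValueError("tasks_per_day must be positive")
    return tasks_left / tasks_per_day


def average_drain_rate(start_tasks, tasks_left, elapsed_days):
    """Average tasks consumed per day over the observed period."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return (start_tasks - tasks_left) / elapsed_days


# Figures from the thread; the 17 days is inferred from the post dates.
print(days_until_empty(1_500_000, 500_000))
print(round(average_drain_rate(9_000_000, 1_500_000, 17)))
```

At 500k/day the remaining 1.5M tasks last about three days. The average over the assumed 17 days comes out near 441k/day, which is consistent with the "around 500k/day" estimate.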

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68629 - Posted: 26 Mar 2023, 23:47:10 UTC - in response to Message 68628. Can't see "beta", but I have found "free trial" with $400 credit. I've signed up and will attempt to use it. ID: 68629 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68630 - Posted: 27 Mar 2023, 0:16:33 UTC - in response to Message 68629. Last modified: 27 Mar 2023, 0:28:24 UTC I have no idea how to make this work. I tried to set up an instance, and was told after I'd filled everything in, I couldn't use Windows in the free trial. Fine, I made a new instance with Ubuntu, then it tells me (right at the end) the free trial is limited to 32 CPU cores. Fine, start again! 32 cores on Ubuntu. No, now I can only use 8?! So I ask on the Google Cloud Community, and it won't accept any username I choose. Keeps saying invalid characters even though I just use some lower case letters. Anyone any idea how to make it work at all? ID: 68630 · Reply Quote

SolarSyonyk Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463	Message 68632 - Posted: 31 Mar 2023, 15:43:08 UTC I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely. Rosetta@Home is out of tasks, though there are still some retries going around. I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't particularly find pulsars/neutron stars/etc particularly interesting projects, and certainly not of much "Earthly importance." ID: 68632 · Reply Quote

Bryn Mawr Send message Joined: 28 Jul 19 Posts: 150 Credit: 12,830,559 RAC: 228	Message 68633 - Posted: 31 Mar 2023, 22:28:26 UTC - in response to Message 68632. I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely. Rosetta@Home is out of tasks, though there are still some retries going around. I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't particularly find pulsars/neutron stars/etc particularly interesting projects, and certainly not of much "Earthly importance." The next project to run out of work will be TN-Grid in about 10 days time :-( Of my 5 projects that will be 4 of them without work. ID: 68633 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68647 - Posted: 13 Apr 2023, 9:37:25 UTC - in response to Message 68632. Last modified: 13 Apr 2023, 9:37:59 UTC I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely. Rosetta@Home is out of tasks, though there are still some retries going around. I suppose next steps to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't particularly find pulsars/neutron stars/etc particularly interesting projects, and certainly not of much "Earthly importance." You could try Sidock, Denis, QuChempedia, World Community Grid, or Asteroids, in case one hits us! ID: 68647 · Reply Quote

Mr. P Hucker Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 68648 - Posted: 13 Apr 2023, 9:38:58 UTC It seems subscriptions aren't being emailed out. Who's in charge of the email server? ID: 68648 · Reply Quote