climateprediction.net (CPDN) home page
Thread 'New work discussion - 2'

Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,708,278
RAC: 9,361
Message 66209 - Posted: 21 Oct 2022, 21:17:15 UTC - in response to Message 66206.  

Fair enough. It's useful to set the <sched_op_debug> event log flag, so you can see exactly what, and how much, is being requested.
ID: 66209
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 66212 - Posted: 22 Oct 2022, 9:52:42 UTC - in response to Message 66209.  

I was playing with the event log options. <work_fetch_debug> gave a lot more information about whether projects could or couldn't request work.
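For anyone wanting to try these flags, they are set inside the <log_flags> section of the client's cc_config.xml. A minimal sketch in Python that generates such a file body; the helper name build_cc_config is hypothetical, and it is assumed you save the result as cc_config.xml in the BOINC data directory and then ask the client to re-read its config files.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return a cc_config.xml body that switches on the given event log flags.

    build_cc_config is a hypothetical helper for illustration only.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each flag is an element whose text is 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# Flags that report scheduler requests and work-fetch decisions.
config_xml = build_cc_config(["sched_op_debug", "work_fetch_debug"])
print(config_xml)
```

Both flags are chatty, so it is probably worth switching them off again once the work-fetch question is answered.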

I wonder whether one approach would be to run two clients on the same host. One solely for the CPDN projects and another for everything else. That way, the CPDN client would always be free. Might depend on whether the server was smart enough to spot that a single hostname & IP was running more than 1 client and still treat them as one. Still, might be worth a try and straightforward to do. Probably been thought of before.
ID: 66212
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,708,278
RAC: 9,361
Message 66213 - Posted: 22 Oct 2022, 11:57:45 UTC - in response to Message 66212.  

Probably been thought of before.
Yes, it has. I have a working installation for reference and occasional testing, but that's under Windows. Trying to get two different Linux services running together would be a different kettle of fish. I can send you my sample files, and explain why the various elements are needed, if it would help.
ID: 66213
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 66215 - Posted: 22 Oct 2022, 12:54:31 UTC - in response to Message 66213.  
Last modified: 22 Oct 2022, 13:15:22 UTC

Trying to get two different Linux services running together would be a different kettle of fish. I can send you my sample files, and explain why the various elements are needed, if it would help.
Are you referring to two BOINC instances? I do it all the time on Ubuntu. I find it easier to set up than on Windows, or at least to start up both instances.
https://www.overclock.net/threads/guide-setting-up-multiple-boinc-instances.1628924/
Then, I use BoincTasks to manage both instances (on different ports) on Ubuntu machines from my main Windows machine.

I have used it when one BOINC instance is on CPDN, and the other instance on another project. I have not tried both instances on CPDN, but it probably will work.
(I had a problem on one project a few years ago when both instances were on the same project, but not recently. I think the current BOINC server version works OK.)
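As a rough illustration of the "different ports" setup, here is a sketch in Python that assembles the command line for a second client. The data directory /var/lib/boinc-cpdn and port 31418 are made-up examples, and second_client_command is a hypothetical helper; the linked guide remains the reference for the full procedure.

```python
import shlex

# The default BOINC client listens for GUI RPCs on this port.
DEFAULT_GUI_RPC_PORT = 31416


def second_client_command(data_dir, gui_rpc_port):
    """Build the argv for an extra BOINC client with its own directory and port.

    Hypothetical helper: it only builds the command, it does not start anything.
    """
    if gui_rpc_port == DEFAULT_GUI_RPC_PORT:
        raise ValueError("second client needs a port other than the default")
    return [
        "boinc",
        "--allow_multiple_clients",  # permit more than one client on this host
        "--dir", data_dir,           # separate data directory for this instance
        "--gui_rpc_port", str(gui_rpc_port),
    ]


# Example values only; pick a directory and port that suit your machine.
cmd = second_client_command("/var/lib/boinc-cpdn", 31418)
print(shlex.join(cmd))
```

A manager such as BoincTasks or boincmgr would then be pointed at the second port to control that instance.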
ID: 66215
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 66218 - Posted: 23 Oct 2022, 17:52:34 UTC - in response to Message 66215.  

Are you referring to two BOINC instances? I do it all the time on Ubuntu. I find it easier to set up than on Windows, or at least to start up both instances.
I see I'm playing catch-up here, thanks for the input. It was straightforward to get two client instances running, one specifically for CPDN, which will make it easier when debugging issues with OpenIFS. Getting correct systemctl startup files for both clients & boincmgr working took some time, but it all works now.
ID: 66218
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,024,725
RAC: 20,592
Message 66219 - Posted: 23 Oct 2022, 20:18:50 UTC

I would love to see the CPDN server logs. There were 25 linux tasks in the 'dev' queue yesterday, but despite suspending all my projects/tasks except cpdn on my linux box and pinging the server every minute, it refused to give me any. User geophi seemed the only one getting them. Then I booted up my WSL boinc instance which only has cpdn & cpdn-dev as projects and it sent me one. So I can't help but wonder if the server gets jealous of other projects :) Maybe someone who understands the server scheduling algorithm might know what's going on.
If you are talking about the hadcm3s tasks, those were Mac only, the linux ones having been deprecated.
ID: 66219
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 66220 - Posted: 24 Oct 2022, 0:41:26 UTC - in response to Message 66219.  

They were HADsm, the slab ocean version which is Linux. I know the coupled model is Mac.
ID: 66220
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66221 - Posted: 24 Oct 2022, 7:24:42 UTC - in response to Message 66205.  

That's your problem. If it was the main CPDN project (not the dev site), the server asks you to wait for 1 hour plus 1% - 3636 seconds - between updates.
Why the 1%?
ID: 66221
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66222 - Posted: 24 Oct 2022, 7:26:44 UTC - in response to Message 66191.  

The plan I believe is to run these experiments end Oct/start Nov. Which is when I hope I'll have some tests for the multi-core, high-resolution models ready to go too.
Does that include virtualbox ones for Windows?
ID: 66222
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,708,278
RAC: 9,361
Message 66223 - Posted: 24 Oct 2022, 8:07:59 UTC - in response to Message 66221.  

That's your problem. If it was the main CPDN project (not the dev site), the server asks you to wait for 1 hour plus 1% - 3636 seconds - between updates.
Why the 1%?
Just in case there are minor discrepancies - e.g. rounding errors - between the clocks running on the server and the user's home computer. All clocks are slightly different, but are re-synced periodically to an authoritative time server. Which is itself periodically adjusted with leap seconds. You wouldn't want work fetch to be denied because of those discrepancies.
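The arithmetic behind the 3636 seconds can be sketched in a few lines of Python. This only illustrates the "1 hour plus 1%" figure quoted above; the function names are hypothetical and it is not taken from the BOINC server code.

```python
from datetime import datetime, timedelta, timezone


def padded_delay(delay_seconds, margin_percent=1):
    """Add a small percentage margin to a requested delay (integer seconds)."""
    return delay_seconds + delay_seconds * margin_percent // 100


def earliest_next_request(last_request, delay_seconds=3600):
    """Earliest time a new work request should be sent, margin included."""
    return last_request + timedelta(seconds=padded_delay(delay_seconds))


# Example timestamp only.
last = datetime(2022, 10, 24, 8, 0, 0, tzinfo=timezone.utc)
print(padded_delay(3600))           # 3636
print(earliest_next_request(last))  # 2022-10-24 09:00:36+00:00
```

The extra 36 seconds is the slack that absorbs small clock differences between server and client.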
ID: 66223
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,708,278
RAC: 9,361
Message 66224 - Posted: 24 Oct 2022, 8:18:05 UTC

BTW, while we're here - which exactly is the current dev site? Over the years, we've used several:



I must have accounts on all of them...
ID: 66224
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66225 - Posted: 24 Oct 2022, 8:25:04 UTC - in response to Message 66223.  

That's your problem. If it was the main CPDN project (not the dev site), the server asks you to wait for 1 hour plus 1% - 3636 seconds - between updates.
Why the 1%?
Just in case there are minor discrepancies - e.g. rounding errors - between the clocks running on the server and the user's home computer. All clocks are slightly different, but are re-synced periodically to an authoritative time server. Which is itself periodically adjusted with leap seconds. You wouldn't want work fetch to be denied because of those discrepancies.
Ah, I assumed "wait 1 hour" meant "wait 1 hour from your current time on your clock". If I told you to meet me in an hour, you'd meet me 1 hour later by your own watch; I haven't even told you what my watch reads.
ID: 66225
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 66226 - Posted: 24 Oct 2022, 8:38:21 UTC - in response to Message 66224.  

Richard
Sent a PM.
ID: 66226
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66227 - Posted: 24 Oct 2022, 8:45:29 UTC - in response to Message 66226.  
Last modified: 24 Oct 2022, 8:45:51 UTC

Richard
Sent a PM.
You remind me of colleagues who would stop me in the corridor and ask if I'd got their email. Surely he's just as likely to see the PM as this message?
ID: 66227
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,708,278
RAC: 9,361
Message 66228 - Posted: 24 Oct 2022, 8:52:27 UTC - in response to Message 66227.  

Yes, saw the email notification of the PM before I saw the reply here, but some projects have broken mail servers and the web notification is very discreet - I sometimes miss them.

Anyway, I did already have an account, and I'm back in - ready for those IFS tasks. I've also upgraded the memory on my Linux machines to 64 GB / 32 GB.
ID: 66228
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,024,725
RAC: 20,592
Message 66229 - Posted: 24 Oct 2022, 9:13:04 UTC - in response to Message 66220.  

They were HADsm, the slab ocean version which is Linux. I know the coupled model is Mac.

I was away, so I didn't even notice them, and they must have been short enough that nothing about them showed up under users in last 24 hours on the server status page last night.
ID: 66229
Conan
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 66230 - Posted: 24 Oct 2022, 11:04:06 UTC

I would like to join the cpdnboinc-dev project to help out but it appears to need an invitation code and it's not mentioned anywhere that I could see.

I think I tried a long while back but could not get a leg in, but memory a bit fuzzy about that.

My Linux computer was upgraded a while back to 64 GB RAM (for other BOINC projects requiring 1 or more GB of RAM per work unit, with 32 GB and 24 threads I was maxing out my memory).

An invitation would be nice but perhaps they might have enough testers? So I might still not get in.

Thanks
Conan
ID: 66230
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66231 - Posted: 24 Oct 2022, 11:23:33 UTC - in response to Message 66230.  

Me too, but I can only provide 8 Windows machines.
ID: 66231
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 66232 - Posted: 24 Oct 2022, 11:56:47 UTC - in response to Message 66230.  

Sorry Conan, the Dev site has very restricted access.
ID: 66232
Glenn Carver
Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 66233 - Posted: 24 Oct 2022, 12:11:01 UTC - in response to Message 66232.  

Sorry Conan, the Dev site has very restricted access.
It's also quite oversubscribed; I rarely get dev tasks. There's also no credit, and there's the risk of getting misconfigured workunits that can disrupt the client (e.g. wrong memory settings).
ID: 66233

©2024 cpdn.org