Run Linux work units with Windows 10 WSL

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63462 - Posted: 2 Feb 2021, 17:31:11 UTC

To my utter astonishment, it is possible to run the Linux work units (e.g., N216) under Windows 10 WSL (Windows Subsystem for Linux) without VirtualBox or any other VM. I think it uses Hyper-V, but that is irrelevant to me.
How practical it is is another matter: there is no GUI, so you have to use the BOINC command-line tool (boinccmd). And while I have it running, I am not sure how to get it to survive a reboot. Maybe the Linux experts can help. But in BoincTasks, it is even recognized:
WSL detected:
[Ubuntu-20.04] (default): Linux Ubuntu (Ubuntu 20.04.2 LTS [5.4.72-microsoft-standard-WSL2])

I think Richard Haselgrove had something to do with that, and maybe he can chime in.

The basic procedures are:

(1) Install WSL 2 (you need a recent Win10 version):
https://docs.microsoft.com/en-us/windows/wsl/install-win10
(2) Select the Linux version you want (I used Ubuntu 20.04).
(3) Open a command window by opening "ubuntu2004.exe" (located in C:\Users\(Username)\AppData\Local\Microsoft\WindowsApps\CanonicalGroupLimited.Ubuntu20.04onWindows_79rhkp1fndgsc).
(4) Install the 32-bit libraries for your version of Linux (sudo apt install lib32ncurses6 lib32z1 lib32stdc++-7-dev)
(5) Install BOINC: "sudo apt install boinc"
(6) Start BOINC by just typing "boinc" in a first window, and then, without closing it,
(7) Open a second command window and execute: boinccmd --project_attach https://climateprediction.net/ xxxxyyyyzzzz,
where xxxxyyyyzzzz is your Account Key (https://www.cpdn.org/weak_auth.php).
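For anyone who prefers to keep steps (4), (5) and (7) in a script, here is a rough sketch. The function names are made up for illustration, the package names are simply the ones from step (4) and may differ on other releases, and the account key is a placeholder. Nothing runs until you call the functions yourself.

```shell
#!/bin/sh
# Sketch of steps (4), (5) and (7); function names are illustrative only.

# Steps (4) and (5): 32-bit libraries, then the BOINC client.
install_boinc() {
    sudo apt install lib32ncurses6 lib32z1 lib32stdc++-7-dev &&
    sudo apt install boinc
}

# Step (7): attach to CPDN. $1 is your account key (placeholder).
attach_cpdn() {
    if [ -z "$1" ]; then
        echo "usage: attach_cpdn ACCOUNT_KEY" >&2
        return 1
    fi
    boinccmd --project_attach https://climateprediction.net/ "$1"
}

# Usage (step (6), running "boinc" in a first window, still comes in between):
#   install_boinc
#   attach_cpdn xxxxyyyyzzzz
```

The key is passed as an argument so that it does not end up hard-coded in the file.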

It is running now. I need to polish it up a bit to make it more practical.
https://www.cpdn.org/show_host_detail.php?hostid=1514666
ID: 63462

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 63468 - Posted: 2 Feb 2021, 19:04:48 UTC - in response to Message 63462.  

I wondered if anyone was doing that. Microsoft is supposed to be bringing an integrated way of running Linux GUI apps in WSL. I'll be interested to see if it plays nice with boinc.

https://medium.com/for-linux-users/gui-linux-on-windows-is-coming-7c199b15b72
ID: 63468

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63470 - Posted: 2 Feb 2021, 21:55:31 UTC - in response to Message 63468.  
Last modified: 2 Feb 2021, 22:01:29 UTC

It looks like closing the boinc window and then rebooting breaks the work unit: it cannot pick up where it left off when BOINC is restarted.
That is not surprising, but it limits the usefulness in a Windows desktop environment.

I have asked on the BoincTasks forum whether it could be adapted to manage BOINC running under WSL. That would be convenient.

EDIT: It does come back to life after a few minutes of inactivity, but only by downloading a new work unit. It seems to be running OK now, but any prior work was lost.
https://www.cpdn.org/workunit.php?wuid=12063595
Unfortunately I can't run this machine that way, and will have to detach.
ID: 63470

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 63474 - Posted: 3 Feb 2021, 0:57:48 UTC - in response to Message 63470.  

What if you ran in another terminal window

boinccmd --quit

before you close the window you ran the boinc command in? Will that cleanly exit boinc?
ID: 63474

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63477 - Posted: 3 Feb 2021, 2:58:56 UTC - in response to Message 63474.  
Last modified: 3 Feb 2021, 3:16:15 UTC

That works better. I get an "Exiting" line after running boinccmd --quit.
Then, if I start BOINC again, it seems to start up again OK.

Finally, I just closed the BOINC window again without the --quit, and it killed the work unit again.

I will try more tomorrow after I get another work unit after the 1 hour delay.
Maybe someone can automate this to allow for graceful reboots?

EDIT: Before we get too far down the road, I can foresee the next problem. Even though I have the resource share set to "0", the Linux client of course doesn't know about the Windows work units I am running on 10 threads of my Ryzen 3600 (with one more reserved to support the GPU). So I will likely receive a whole bunch of Linux N216 work units that I can't really use. I want only one. But I will have to wait and see how it handles that.
ID: 63477

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 63478 - Posted: 3 Feb 2021, 6:22:14 UTC - in response to Message 63477.  
Last modified: 3 Feb 2021, 6:23:14 UTC

Maybe you can create a global preferences override file and limit the CPU usage to 1 core in Linux if that's all you're going to run? If you are going to run other projects as well in Linux, that wouldn't work well.

https://boinc.berkeley.edu/wiki/Global_prefs_override.xml

Or possibly use app_config.xml to limit the number of work units downloaded/running for cpdn.

https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
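A minimal sketch of both files, in case it saves some typing. The function names are made up, the data directory is passed in as the first argument, and the project directory name (projects/climateprediction.net) is an assumption, so check what your client actually created. The second function fails rather than creating that directory if it is missing.

```shell
#!/bin/sh
# Sketch: both functions take the BOINC data directory as $1.
# Function names are illustrative, not part of BOINC.

# Limit the Linux client to a percentage of the CPUs ($2, e.g. 10).
write_cpu_limit() {
    cat > "$1/global_prefs_override.xml" <<EOF
<global_preferences>
   <max_ncpus_pct>$2</max_ncpus_pct>
</global_preferences>
EOF
}

# Limit how many CPDN tasks run at once ($2, e.g. 1).
# Assumes the project directory already exists under projects/.
write_cpdn_limit() {
    cat > "$1/projects/climateprediction.net/app_config.xml" <<EOF
<app_config>
   <project_max_concurrent>$2</project_max_concurrent>
</app_config>
EOF
}

# Usage, from the directory the client was started in:
#   write_cpu_limit . 10 && boinccmd --read_global_prefs_override
#   write_cpdn_limit . 1 && boinccmd --read_cc_config
```

Both functions overwrite an existing file of the same name, so back up anything you already have there.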
ID: 63478

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63480 - Posted: 3 Feb 2021, 6:51:40 UTC - in response to Message 63478.  

Maybe you can create a global preferences override file and limit the CPU usage to 1 core in Linux if that's all you're going to run?

I may be OK. I had limited the CPUs to 10%, meaning that only one work unit would run at a time on my 12-thread Ryzen 3600.
And it has been over two hours since downloading the work unit that is running, so that should have been time for two more. But I haven't received any more.
(And I keep the default buffer of 0.1 + 0.5 days).

It probably accomplishes the same thing that the global preferences do, but I could try that if needed.
I don't know about the app_config. I think that just limits the ones that are running, but does not affect the downloads.

But I have no estimate of completion time without any manager. I will have to rely on the trickles in a couple of days, if I can keep it going that long.
Thanks.
ID: 63480

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 63481 - Posted: 3 Feb 2021, 7:28:57 UTC - in response to Message 63480.  

boinccmd --get_tasks will give you a text listing of the tasks, their CPU time so far (in seconds), estimated time to completion (also in seconds), and percent done, among other things.
ID: 63481

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63482 - Posted: 3 Feb 2021, 12:58:12 UTC - in response to Message 63481.  
Last modified: 3 Feb 2021, 12:59:49 UTC

boinccmd --get_tasks will give you a text listing of the tasks, their CPU time so far (in seconds), estimated time to completion (also in seconds), and percent done, among other things.

Wonderful. But I am not quite sure it gives consistent results, though before my first cup of coffee I won't vouch for anything.
But the fraction done (after 9 hours) suggests that the total time is less than 9 days.
If that is accurate, it is faster than on an Ubuntu 20.04 machine itself.

And on the Windows side, I have 10 WCG/OPN running (plus a GPU on Folding).
They don't seem much disturbed either.

name: hadam4h_a14o_209511_5_895_012064086_0
WU name: hadam4h_a14o_209511_5_895_012064086
project URL: https://climateprediction.net/
received: Tue Feb 2 23:03:25 2021
report deadline: Sun Jan 16 04:23:29 2022
ready to report: no
state: downloaded
scheduler state: scheduled
active_task_state: EXECUTING
app version num: 852
resources: 1 CPU
estimated CPU time remaining: 6610868.991256
slot: 0
PID: 283
CPU time at last checkpoint: 30892.340000
current CPU time: 31449.630000
fraction done: 0.043805
swap size: 1378 MB
working set size: 1364 MB
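
Since the full listing is long, a small awk filter can boil it down to one line per running task. This is only a sketch: summarize_tasks is a made-up name, and it assumes the field labels shown in the listing above ("name:", "current CPU time:", "fraction done:"). Tasks that do not print a "fraction done" line are skipped.

```shell
#!/bin/sh
# Sketch: condense "boinccmd --get_tasks" output to one line per task.
# summarize_tasks is an illustrative name; it reads the listing on stdin.
summarize_tasks() {
    LC_ALL=C awk '
        $1 == "name:"                     { name = $2 }
        $1 == "current" && $2 == "CPU"    { cpu = $4 }
        $1 == "fraction" && $2 == "done:" {
            printf "%s  %.1f%% done  %.1f CPU hours\n", name, $3 * 100, cpu / 3600
        }
    '
}

# Usage:
#   boinccmd --get_tasks | summarize_tasks
```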
ID: 63482

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63483 - Posted: 3 Feb 2021, 13:58:05 UTC

Here is the word from fred on BoincTasks:
https://forum.efmer.com/index.php?topic=1974.0
ID: 63483

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 63484 - Posted: 3 Feb 2021, 14:06:21 UTC - in response to Message 63482.  

It says you've run 8+ hours, are 4.4% done, and have 1800 hours left. Doing the math (8.7 hours for 4.4%), it should take about 200 hours in total, so I don't think the boinc estimate of time remaining is a good one. Benchmarks haven't been run, so use boinccmd to run them:

boinccmd --run_benchmarks

boinc may be using the low benchmark scores to give time to completion estimates since your FP benchmark is the default of 1 billion ops/sec and it should be 5+ billion. It probably won't change the estimates much for this task, but for future tasks it should give a better estimate.
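
The arithmetic behind the "about 200 hours" figure, as a small sketch. It assumes progress is roughly linear, so total time is CPU time so far divided by fraction done. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: project total run time from progress so far.
# $1 = current CPU time in seconds, $2 = fraction done (between 0 and 1).
projected_total_hours() {
    LC_ALL=C awk -v cpu="$1" -v frac="$2" \
        'BEGIN { if (frac <= 0) exit 1; printf "%.1f\n", cpu / frac / 3600 }'
}

# Convert a time in seconds (e.g. the client's own estimate) to hours.
seconds_to_hours() {
    LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 3600 }'
}

# With the figures from the task listing above:
#   projected_total_hours 31449.63 0.043805   # roughly 199 hours
#   seconds_to_hours 6610868.991256           # client estimate, roughly 1836 hours
```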
ID: 63484

lazlo_vii
Joined: 11 Dec 19
Posts: 108
Credit: 3,012,142
RAC: 0
Message 63486 - Posted: 3 Feb 2021, 18:26:17 UTC - in response to Message 63484.  

Here is the manual page for boinccmd:

https://boinc.berkeley.edu/wiki/Boinccmd_tool

That might help you get the most out of it. Another thought is editing/creating the remote_hosts.cfg and adding your Windows IP address and computer name, e.g.:

# This file contains a list of hostnames or IP addresses (one per line)
# of remote hosts, that are allowed to connect and to control the local
# BOINC core client via GUI RPCs.
# Lines beginning with a # or a ; are treated like comments and will be
# ignored.
#
#host.example.com
#192.168.0.180
192.168.1.2
desktop-pc


Then you should be able to point the BOINC GUI to whatever IP address WSL is using.
ID: 63486

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63487 - Posted: 3 Feb 2021, 19:22:37 UTC - in response to Message 63484.  

Benchmarks haven't been run, so use boinccmd to run them

boinccmd --run_benchmarks

I will certainly do that when this one is complete. I don't want to touch anything now.
ID: 63487

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63488 - Posted: 3 Feb 2021, 19:31:21 UTC - in response to Message 63486.  
Last modified: 3 Feb 2021, 19:48:07 UTC

That might help you get the most out of it. Another thought is editing/creating the remote_hosts.cfg and adding your windows IP address and computer name e.g.:
Actually, I got rid of the remote_hosts.cfg file on my Linux machines so that I can reach any of them without restrictions.


I like your idea of just pointing the BOINC GUI to it. In my cc_config.xml file, I normally put:
<allow_remote_gui_rpc>1</allow_remote_gui_rpc>
though I had not gotten around to putting the cc_config in yet.
It might work, after this one is finished and I can reboot.

EDIT: It looks like the default port for the BOINC GUI is 31416. I am presently using that one for my Windows version of BOINC. I will probably have to choose a different one for the Linux version, since they are both on the same machine.
But it should work. Thanks.
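
For reference, a sketch of the cc_config.xml described above, written from the shell. write_cc_config is a made-up name, the data directory is passed as the first argument, and the port in the usage comment is only an example of a free port other than 31416. It overwrites any existing cc_config.xml.

```shell
#!/bin/sh
# Sketch: write a minimal cc_config.xml into the BOINC data directory ($1).
# write_cc_config is an illustrative name, not a BOINC command.
write_cc_config() {
    cat > "$1/cc_config.xml" <<'EOF'
<cc_config>
   <options>
      <allow_remote_gui_rpc>1</allow_remote_gui_rpc>
   </options>
</cc_config>
EOF
}

# Usage, from the directory the client runs in:
#   write_cc_config .
#   boinc --gui_rpc_port 31418    # example port; any free one other than 31416
```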
ID: 63488

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63489 - Posted: 3 Feb 2021, 21:15:38 UTC

Remarkable. It all worked.

I took the risk of shutting down properly (after including the cc_config.xml), and rebooted.
Then I started up again and directed the GUI to port 31418:
boinc --gui_rpc_port 31418

Finally, I added that to BoincTasks, and I see it (along with the Windows side).
Also, I ran the CPU benchmarks from there, so the estimates should eventually improve.

Now I just need to figure out how to reboot without having to shut BOINC down by hand first, for best reliability.

Thanks to all.
ID: 63489

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63496 - Posted: 6 Feb 2021, 7:18:47 UTC
Last modified: 6 Feb 2021, 7:19:09 UTC

I needed to reboot, so I suspended the work unit with BoincTasks (at a checkpoint no less), thinking that would make it safe for a reboot.
But it wasn't. The work unit errored out upon restart.

So you have to use "boinccmd --quit" to shut it down safely.
That will make it difficult to use.
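
One way to make the manual step less error-prone is a helper that sends the quit request and then waits until the client process has actually gone, so you know when a reboot is safe. This is a sketch, not a tested fix for the reboot problem: the function name and the PORT variable are made up, 31418 is the port used earlier in this thread, and it assumes the client process is named "boinc". Run it from the BOINC data directory so boinccmd can find gui_rpc_auth.cfg, or add --passwd yourself.

```shell
#!/bin/sh
# Sketch: ask the client to quit, then wait until the process is gone.
# stop_boinc_and_wait and PORT are illustrative names.
PORT="${PORT:-31418}"

stop_boinc_and_wait() {
    boinccmd --host "localhost:$PORT" --quit || return 1
    # boinccmd returns at once; poll until no process named "boinc" remains.
    while pgrep -x boinc > /dev/null; do
        sleep 1
    done
    echo "BOINC has exited"
}

# Usage, before closing the window or rebooting:
#   stop_boinc_and_wait
```

It still has to be run by hand (or from whatever you use to trigger the reboot), so it does not solve the automation question by itself.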
ID: 63496

lazlo_vii
Joined: 11 Dec 19
Posts: 108
Credit: 3,012,142
RAC: 0
Message 63497 - Posted: 6 Feb 2021, 10:36:38 UTC - in response to Message 63496.  

I needed to reboot, so I suspended the work unit with BoincTasks (at a checkpoint no less), thinking that would make it safe for a reboot.
But it wasn't. The work unit errored out upon restart.

So you have to use "boinccmd --quit" to shut it down safely.
That will make it difficult to use.



You could try issuing several "sync" commands in a row before the reboot. I have no idea how Win10 will handle it (it might even just ignore it), but it may be worth a try. I have a RAID array that does a constant 2-6 megabytes per second in writes for WCG OPN1 tasks, and no matter what I tried, CPDN tasks failed on every reboot because the work doesn't get written to disk before the system restarts. If WSL is the kind of containerization that uses your CPU's virtualization instructions, you might have the same problem I do.
ID: 63497

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63499 - Posted: 6 Feb 2021, 14:51:40 UTC - in response to Message 63497.  

Suspending at a checkpoint should write all the files to disk I would think. You might as well just issue the "boinccmd --quit", which works anyway.
But I really need an automated way to do it for each shutdown, not issue a command.

However, it was running well. My first trickle was 16.6457 sec/TS, and that was with 10 OPN plus a GPU on the Windows side.
I think I will try two N216 and 9 OPN and see how they run.
ID: 63499

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63502 - Posted: 7 Feb 2021, 13:55:01 UTC
Last modified: 7 Feb 2021, 13:55:37 UTC

Although WSL runs very efficiently, it is really not for the ordinary user, but for developers who use command-line tools.
https://docs.microsoft.com/en-us/windows/wsl/faq

To run BOINC, it is probably better to use the procedure outlined on Universe by rsNeutrino (starting at message #4565)
https://universeathome.pl/universe/forum_thread.php?id=551

The newest Ubuntu version is Ubuntu 20.04.2 (https://ubuntu.com/download/desktop).
It probably runs as fast, since that method is based on Hyper-V also.
ID: 63502

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63514 - Posted: 8 Feb 2021, 13:34:24 UTC - in response to Message 63502.  

The first two trickles are in when running two N216 at a time, and they average around 17.4 sec/TS.
https://www.cpdn.org/result.php?resultid=22016583
https://www.cpdn.org/result.php?resultid=21996625

That is very similar to an identical Ryzen 3600 running Ubuntu 20.04.1 directly, except that it had Rosetta (plus a GPU) on the other cores rather than OPN.
https://www.cpdn.org/result.php?resultid=21997198
https://www.cpdn.org/result.php?resultid=22013894

So I don't think you will see any loss running the Linux work units on a Windows machine. It is just a question of how to manage it best.
ID: 63514