New work for new machine?
Author | Message |
---|---|
Joined: 28 Jun 06 Posts: 20 Credit: 1,349,578 RAC: 0 |
My server had been running Gentoo, and in trying to go back to Debian (and soon to Devuan), problems came up which seemed to be better handled by retiring this old machine (which had 8 GB RAM on a dual AMD 4800+ CPU). The new machine (well, motherboard and CPU) is now an 8-core FX-8320E with 16 GB of RAM, and I set aside 20 GB of disk for BOINC. And this new machine also has an AMD graphics board with single precision GPU (OpenCL) support. BOINC didn't seem to want to start with the same stuff that was here prior to the upgrades, so I had to start anew. In connecting to ClimatePrediction.net I got a message that the project might not have any work for this kind of machine. Yes, at the moment I am only allowing a single CPU to spend all day, every day doing BOINC. But what is it that ClimatePrediction is looking for, if a 16 GB machine on a recent 8-core processor with 20 GB of disk isn't useful? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It's looking for 32 bit files and libs. There are several threads about this, the main one being the sticky at the top of the section. But you may have to dig deep to get to the good stuff. :) |
Joined: 28 Jun 06 Posts: 20 Credit: 1,349,578 RAC: 0 |
I looked (a little) in the top thread. Obviously not far enough. Debian is multiarch by default, at least for Intel/AMD type stuff. I haven't done anything to restrict 32 bit from this machine. I have played a couple of times with the virtual machines Devuan seems to prefer, which come from Hashicorp/Atlas. Does a person need to set up something that way? Or just wait for 64 bit stuff to come along? My plan was to run for something like a week with just a single core doing BOINC, then add one core, and so on. As this is an 8 core machine with only 4 floating point units, I wouldn't be surprised to see something happen at the 4/5 boundary. I think the most you can go to is 7 cores, as the last core needs to spend time feeding the GPU and doing other things. I can easily forget about ClimatePrediction until I get through this startup period, as I can do SETI and WorldCommunity jobs until then (SETI is sending GPU jobs as well). |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The lib that's usually missing is libjpeg.so.62. Running ldd against all of the files apparently shows anything that's missing, but that's only after you get them in the first place. The UK Met Office only supplies 32 bit modelling programs to the professional researchers, and we're just along for the ride. I run Mint, which is a fork of Ubuntu, which is a fork of Debian. The problem that I had was that the files were installed, and then removed again during the clean up, as there is a later version 8 of that lib. So I installed 32 bit Mint on an old machine, found the files, and copied them via a ram stick. |
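The ldd check mentioned above can be scripted once the project files are on disk. What follows is a minimal sketch, assuming ldd reports an unresolved library as `name => not found`; the function name `missing_libraries` and the sample text are illustrative and not part of any project tooling.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if "=>" not in line:
            # Entries such as linux-gate.so.1 or the loader have no '=>'.
            continue
        name, _, target = line.partition("=>")
        if target.strip().startswith("not found"):
            missing.append(name.strip())
    return missing


# Hypothetical ldd output with one unresolved 32-bit library.
sample = """\
    linux-gate.so.1 (0xf7767000)
    libjpeg.so.62 => not found
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7378000)
"""

print(missing_libraries(sample))
```

In practice the text would come from running ldd on each downloaded executable; an empty list suggests that every dependency resolved.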
Joined: 1 Sep 05 Posts: 3 Credit: 10,373,807 RAC: 2,319 |
I too have been trying to get climateprediction.net to run on Debian Stretch. It has not run anything for a month until now and it stopped after a few seconds. I ran an strace and ldd on wah2am3m2_um_8.12_i686-pc-linux-gnu with the following output. The program fails for incorrect arguments but I cannot determine what is missing. It looks like all the i386 files are present.

Linux saint-marie 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux
boinc 7.6.33+dfsg-5

tony@saint-marie:/work/workspace/BOINC/projects/climateprediction.net$ strace ./wah2_8.12_i686-pc-linux-gnu
execve("./wah2_8.12_i686-pc-linux-gnu", ["./wah2_8.12_i686-pc-linux-gnu"], [/* 55 vars */]) = 0
strace: [ Process PID=23222 runs in 32 bit mode. ]
brk(NULL) = 0xa00e000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf76ec000
access("/etc/ld.so.preload", R_OK) = 0
open("/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=147425, ...}) = 0
mmap2(NULL, 147425, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf76c8000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300O\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=132408, ...}) = 0
mmap2(NULL, 115296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf76ab000
mmap2(0xf76c4000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18000) = 0xf76c4000
mmap2(0xf76c6000, 4704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf76c6000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\n\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=13860, ...}) = 0
mmap2(NULL, 16488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf76a6000
mmap2(0xf76a9000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0xf76a9000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/i386-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260\331\6\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1529100, ...}) = 0
mmap2(NULL, 1540472, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf752d000
mmap2(0xf769c000, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16e000) = 0xf769c000
mmap2(0xf76a3000, 8568, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf76a3000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340F\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=341556, ...}) = 0
mmap2(NULL, 344144, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf74d8000
mmap2(0xf752b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x52000) = 0xf752b000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\240 \0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=116312, ...}) = 0
mmap2(NULL, 119380, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf74ba000
mmap2(0xf74d6000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b000) = 0xf74d6000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360\203\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1787812, ...}) = 0
mmap2(NULL, 1796636, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf7303000
mmap2(0xf74b4000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000) = 0xf74b4000
mmap2(0xf74b7000, 10780, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf74b7000
close(3) = 0
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7301000
set_thread_area({entry_number:-1, base_addr:0xf7302140, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 (entry_number:12)
mprotect(0xf74b4000, 8192, PROT_READ) = 0
mprotect(0xf74d6000, 4096, PROT_READ) = 0
mprotect(0xf752b000, 4096, PROT_READ) = 0
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf72ff000
mprotect(0xf769c000, 24576, PROT_READ) = 0
mprotect(0xf76a9000, 4096, PROT_READ) = 0
mprotect(0xf76c4000, 4096, PROT_READ) = 0
mprotect(0xf7716000, 4096, PROT_READ) = 0
munmap(0xf76c8000, 147425) = 0
set_tid_address(0xf73021a8) = 23222
set_robust_list(0xf73021b0, 12) = 0
rt_sigaction(SIGRTMIN, {sa_handler=0xf76af9f0, sa_mask=[], sa_flags=SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0xf76afa70, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
ugetrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
uname({sysname="Linux", nodename="saint-marie", ...}) = 0
brk(NULL) = 0xa00e000
brk(0xa033000) = 0xa033000
futex(0xf76a39f4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0xf76a39fc, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "Main program exiting, invalid ar"..., 69Main program exiting, invalid arguments: need UMID and 10 args total
) = 69
chdir("") = -1 ENOENT (No such file or directory)
chdir("") = -1 ENOENT (No such file or directory)
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
write(1, "Cleaning up from the run...\n", 28Cleaning up from the run...
) = 28
open("tmp", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
rmdir("tmp") = -1 ENOENT (No such file or directory)
open("dataout", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
rmdir("dataout") = -1 ENOENT (No such file or directory)
open("jobs", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
rmdir("jobs") = -1 ENOENT (No such file or directory)
open("datain", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
rmdir("datain") = -1 ENOENT (No such file or directory)
chdir("..") = 0
stat64(".xml", 0xff9973c4) = -1 ENOENT (No such file or directory)
rmdir("") = -1 ENOENT (No such file or directory)
chdir("") = -1 ENOENT (No such file or directory)
write(2, "Could not change to slots direct"..., 37Could not change to slots directory
) = 37
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2453, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2453, ...}) = 0
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 2453
_llseek(3, -24, [2429], SEEK_CUR) = 0
read(3, "\nMST7MDT,M3.2.0,M11.1.0\n", 4096) = 24
close(3) = 0
write(2, "07:37:10 (23222): called boinc_f"..., 4107:37:10 (23222): called boinc_finish(2)
) = 41
nanosleep({tv_sec=2, tv_nsec=0}, 0xff997378) = 0
exit_group(2) = ?
+++ exited with 2 +++

tony@saint-marie:/work/workspace/BOINC/projects/climateprediction.net$ ldd wah2_8.12_i686-pc-linux-gnu
linux-gate.so.1 (0xf7767000)
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf7722000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xf771d000)
libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xf75a2000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf754d000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xf752f000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7378000)
/lib/ld-linux.so.2 (0x56627000)

tony@saint-marie:/work/workspace/BOINC/projects/climateprediction.net$ locate linux-gate.so.1
tony@saint-marie:/work/workspace/BOINC/projects/climateprediction.net$ locate /lib/ld-linux.so.2
/lib/ld-linux.so.2
tony@saint-marie:/work/workspace/BOINC/projects/climateprediction.net$ locate libjpeg.so.62
/usr/lib/i386-linux-gnu/libjpeg.so.62
/usr/lib/i386-linux-gnu/libjpeg.so.62.2.0
/usr/lib/x86_64-linux-gnu/libjpeg.so.62
/usr/lib/x86_64-linux-gnu/libjpeg.so.62.2.0 |
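A trace this long is easier to read once the failed calls are pulled out. The sketch below assumes the usual strace layout of `call(args) = -1 ERRNO (message)` and skips the routine `/etc/ld.so.nohwcap` probes; the name `failed_calls` and the shortened sample are illustrative only. Read that way, the trace above shows every i386 library opening successfully, and the only real failures come after the "invalid arguments" message. That suggests the binary was started by hand without the arguments the BOINC client normally supplies, rather than that a library is missing.

```python
import re

# Matches lines such as: chdir("") = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(r'^(\w+)\((.*)\)\s+=\s+-1\s+(E[A-Z0-9]+)')


def failed_calls(strace_text, ignore=("/etc/ld.so.nohwcap",)):
    """Return (syscall, args, errno) for each failed call, minus ignored paths."""
    results = []
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if not match:
            continue
        name, args, errno_name = match.groups()
        if any(path in args for path in ignore):
            # The dynamic loader probes this file on every library load.
            continue
        results.append((name, args, errno_name))
    return results


# Shortened sample in the same layout as the trace above.
sample = '''\
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
chdir("") = -1 ENOENT (No such file or directory)
rmdir("tmp") = -1 ENOENT (No such file or directory)
'''

for name, args, errno_name in failed_calls(sample):
    print(name, args, errno_name)
```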
Joined: 15 May 09 Posts: 4535 Credit: 18,976,682 RAC: 21,948 |
Looking at the task page I see "SIGSEGV: segmentation violation". This usually means a problem with the actual task as opposed to your computer. I will see if there are any more Linux tasks that I can grab to check it out. In the meantime, there is lots of Windows work available which will run under WINE with few if any problems in my experience. Though I would like to go back to running native Linux tasks. |
Joined: 1 Sep 05 Posts: 3 Credit: 10,373,807 RAC: 2,319 |
I went ahead and created a 32-bit Lubuntu (1 CPU/1 GB RAM/5.7 GB SSD free) in VirtualBox with only CPDN running now, to see how that does. I am just waiting for a task to download. I have all other BOINC projects running in Linux so I will avoid Wine on the system. Tony |
Joined: 15 May 09 Posts: 4535 Credit: 18,976,682 RAC: 21,948 |
This is a re-issue because it has already failed on another machine. The _1 at the end of the task name indicates this. There has been talk of some hadcm3s tasks in beta testing recently, though I have yet to see any. When these make the main site there should be some native Linux work again. |
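The naming convention described above can be checked mechanically. A minimal sketch, assuming a task name ends in an underscore followed by an issue number, with `_0` being the first issue and anything higher a reissue; the `_0` convention, the helper names, and the sample task names are assumptions rather than details taken from the thread.

```python
import re


def issue_number(task_name):
    """Return the trailing _N of a task name as an int, or None if absent."""
    match = re.search(r'_(\d+)$', task_name)
    return int(match.group(1)) if match else None


def is_reissue(task_name):
    """True when the trailing issue number is greater than zero."""
    number = issue_number(task_name)
    return number is not None and number > 0


# Hypothetical task names, for illustration only.
for name in ("example_task_abcd_0", "example_task_abcd_1"):
    print(name, is_reissue(name))
```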
Joined: 1 Sep 05 Posts: 3 Credit: 10,373,807 RAC: 2,319 |
OK. I will switch to Wine inside the Lubuntu 32-bit VM. Thanks |
Joined: 28 Jun 06 Posts: 20 Credit: 1,349,578 RAC: 0 |
I pulled my RX460 from my one computer, as basically it was performing worse than a HD6450 card. I then looked at what was happening in Debian (in the vicinity of a code freeze). Amdgpu is nominally in the kernel, and support depends on what kernel is available. I am mostly running Debian/stable, but also keeping in mind that I want to jump to Devuan at a convenient point. Libclc was the obvious culprit, but not the only one. Libclc is the culprit in conjunction with Mesa3D, and support for Polaris was mostly coming from LLVM. A couple of weeks ago(?), Mesa3D came out with 17.0.0 (and the 17.0.1 point release after that). As I was tracking the development of LLVM-4.0.0, I didn't spend time trying to compile Mesa3D. But a week or so ago(?), LLVM-4.0.0 was released. And consequently I have a copy of the LLVM family sitting in /usr/local. Trying to compile Mesa3D with LLVM-4, I ran across a small problem (which could be because I am not familiar with these libraries and tools). But an email message in the -dev mailing lists says there is a bug in LLVM-4, which seems likely to me to affect doing GPU work. It appears in the near term that Mesa3D will try to work around the problem, but Mesa3D is hoping that LLVM will fix the bug "soon". At some point, I will be able to have a Mesa3D set of functionality in /usr/local that is significantly newer than what is in Jessie or Jessie-backports. At which point, it would seem to make sense to go back to libclc, and see if I can get anything at all working with the RX460 card. There also seem to be reasons to hope for help when the Linux-4.10 kernel makes its way into backports (4.9 is there at the moment). Lately I started to get a little dribble of work from ClimatePrediction, and then I read that you are going to stop producing models across Windows/Mac/Linux, and assign models to architecture based on perceived universe of machines available. 
Which bothers me, as I would hope that your work would have been architecture neutral (sort of, maybe). I am interested in local weather, and climate change (I am at 56N and 120W). Among other things, I want to start modeling surface winds using WindNinja and another package I haven't memorised the name of, based on likely weather scenarios. So, I have asked Environment Canada about this, and that got me looking into statistical downscaling software. You read about some package that looks useful, and after joining, you find out it is binary (Windows only). Or you find out there is source code, but you need to take approved classes before you are allowed to see the source code. Or .... I've been doing numerical methods since 1980. I happen to live downwind of a 100+ MW windfarm, and I can get data from the windfarm to check my work against. The minor in my M.Eng. is statistical mechanics. I don't need to take some dumb class. But sensitivity of model output to operating systems is going to influence how I try to get WindNinja and this other source code working with the AMD GPUs I have (1 HD-5450, 2 HD6450, 1 R7-250 and 1 RX460) along with 16 AMD64 CPU cores. I may be up to 22 CPU cores before I get the RX460 actually working. If this isn't sufficient horsepower for a patient researcher (I am used to waiting days for a model to run), I will look at using BOINC and getting people in my region to devote cycles. Apparently in the winter (winter is "ending" about now), most of our weather comes from the Gulf of Alaska. Bogoslof has put up two ash clouds in the last few weeks that I have seen in the news. Both times, the Alaska Volcano Observatory showed the clouds heading towards Japan, and not here. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"and assign models to architecture based on perceived universe of machines available." Times change, and in the nearly 2 years since that post was made it's become "assign time to writing new versions of the programs in the order of number of computers available to the project". And there's a new favourite program to code for. |
Joined: 28 Jun 06 Posts: 20 Credit: 1,349,578 RAC: 0 |
I thought I was reading off a post from 2017. What is causing the OS dependence on results? I am guessing your models aren't in FORTRAN, like the ancient weather models. In terms of numbers, does uptime also come into things? I think the average number of shutdowns my machines see is something like 2 per year (power goes out here, for longer than my batteries can last). Are Windows people also running 24/7/365? I have seen some people working on using LLVM on Windows. And people compiling Linux with LLVM still seems to be unusual. When I go to port this stuff to my GPUs, should I try gcc and LLVM? Oh, in my inventory, 4 of those AMD64 cores are an APU, which also has R7 graphics. So, effectively I have 2 R7-250, not 1. And if needed, I may set up a compute server. Ryzen R5-1600X with multiple RX460 looks to be a nice place, and not use too much electricity. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
If you're talking about my post in this thread, that's just pointing to an old copy elsewhere, as the original post from Hannah has disappeared. And the exact details/reasons for all of this don't matter much. This is the way things are. "Things" are happening "behind the scenes", and someday there's a hope that they'll show up on this main site. The huge main programs are FORTRAN, the auxiliary programs some form of C. "What is causing the OS dependence on results?" The usual business reasons: time, man (and woman) power, etc. |
Joined: 1 Sep 04 Posts: 161 Credit: 81,522,141 RAC: 1,164 |
Les - Is this the info that you reported as disappeared? http://www.climateprediction.net/future-weatherhome-applications-will-only-run-on-a-single-operating-system/ |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
That's it. (Found it while idly browsing. :) ) And that's where the link in my post leads to. It's possible that some/lots of posts from the message boards was/were placed in the front pages a long time ago. |
Joined: 6 Aug 04 Posts: 124 Credit: 9,195,838 RAC: 0 |
Les - Doesn't that also apply to WINE? Linux Users Everywhere @ BOINC |
Joined: 15 May 09 Posts: 4535 Credit: 18,976,682 RAC: 21,948 |
And a bunch of hadcm3 tasks has been released, though as of about two hours ago only a few hundred were left. Edit: Five out of six were resends, having failed on at least one machine already. Five failed at exactly 19 seconds across two different machines. The last one managed just over two minutes. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
I've got two of these so far at 12h with no problems. One is a reissue of a task that failed due to missing 32-bit libraries. |
Joined: 15 May 09 Posts: 4535 Credit: 18,976,682 RAC: 21,948 |
Some of mine have failed with <core_client_version>7.8.3</core_client_version> and others with <core_client_version>7.8.3</core_client_version>. They have all failed at least once with the same problem on a Linux or Mac machine. Only one has a failure from missing libs as one of its crashes. |
©2024 cpdn.org