Message boards : climateprediction.net Science : Misconfigured Machine?
Author | Message |
---|---|
Send message Joined: 17 Aug 05 Posts: 22 Credit: 16,057,688 RAC: 15,434 |
Sadly, it didn't help. This host is over 6000 failures now: https://www.cpdn.org/show_host_detail.php?hostid=1517479 Even astronomers can be negligent, it seems :) It looks like 2 cloud computers. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Email sent re 2 of his computers. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
Looks like he is using weird locations for the task files, so BOINC manager cannot find them. Possibly some sort of cloud storage. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"Looks like he is using weird locations for the task files, so BOINC manager cannot find them. Possibly some sort of cloud storage." Could he have gotten his boinc-client and boincmgr from flatpak? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I haven't heard from Andy yet, so I've sent Eric an email about them. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
Extract from the stderr output of one of the tasks on Eric's computer, if it helps: unzip: cannot find or open /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_se_8.02_i686-pc-linux-gnu.zip, /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_se_8.02_i686-pc-linux-gnu.zip.zip or /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_se_8.02_i686-pc-linux-gnu.zip.ZIP. unzip: cannot find or open /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_um_8.02_i686-pc-linux-gnu.zip, /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_um_8.02_i686-pc-linux-gnu.zip.zip or /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_um_8.02_i686-pc-linux-gnu.zip.ZIP. unzip: cannot find or open hadsm4_data_8.02_i686-pc-linux-gnu.zip, hadsm4_data_8.02_i686-pc-linux-gnu.zip.zip or hadsm4_data_8.02_i686-pc-linux-gnu.zip.ZIP. unzip: cannot find or open hadsm4_a10a_201310_6_911_012090511.zip, hadsm4_a10a_201310_6_911_012090511.zip.zip or hadsm4_a10a_201310_6_911_012090511.zip.ZIP. cpdnmonitor: cannot open input file /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_se_8.02_i686-pc-linux-gnu.so after 11 attempts cpdnmonitor: cannot open input file /mydisks/a/boinc_lib/projects/climateprediction.net/hadsm4_um_8.02_i686-pc-linux-gnu after 11 attempts |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
I do not have any idea what his file system is like... My tasks are running and have not failed yet, and my files tend to look like this: /var/lib/boinc/projects/climateprediction.net/hadam4_um_8.09_i686-pc-linux-gnu /var/lib/boinc/projects/climateprediction.net/hadam4_um_8.09_i686-pc-linux-gnu.zip /var/lib/boinc/projects/climateprediction.net/hadam4_um_8.52_i686-pc-linux-gnu /var/lib/boinc/projects/climateprediction.net/hadam4_um_8.52_i686-pc-linux-gnu.zip /var/lib/boinc/slots/0/hadam4_um_8.09_i686-pc-linux-gnu.zip /var/lib/boinc/slots/1/hadam4_um_8.52_i686-pc-linux-gnu.zip /var/lib/boinc/slots/4/hadam4_um_8.52_i686-pc-linux-gnu.zip /var/lib/boinc/slots/6/hadam4_um_8.09_i686-pc-linux-gnu.zip /var/lib/boinc/slots/7/hadam4_um_8.52_i686-pc-linux-gnu.zip |
Send message Joined: 13 Jul 18 Posts: 38 Credit: 62,933,508 RAC: 84,702 |
All crashes. https://www.cpdn.org/results.php?hostid=1506194 |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
Eric is still getting errors on his computer file system. I got another of his failures on repeat today. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
The first two were missing 32-bit libraries. I did not bother to look at the rest. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
Still getting multiple failures from Eric's computer systems (file location errors) and also from Science United (similar type of problem). |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,748,307 RAC: 7,546 |
This one https://www.cpdn.org/results.php?hostid=1510055 has crashed all ~12k WUs and is continuing to do so. This one https://www.cpdn.org/results.php?hostid=829775 has been crashing everything since 2020 (it had 67 valid before that). And this one https://www.cpdn.org/results.php?hostid=1517479 has crashed all ~10k WUs and is continuing to do so. Can this reporting be automated somehow? The level of micromanagement CPDN requires and the reluctance of staff to adjust some basic things is becoming daunting. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I'll send another email tomorrow, when the weekend is over. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
Another 2 apart from Eric's: 1515416 (2.6k errors) and 1523499 (2.2k errors). Isn't it possible to "blacklist" machines with too many errors or unfinished tasks? |
Send message Joined: 6 Aug 04 Posts: 195 Credit: 28,588,752 RAC: 9,078 |
Between them, these hosts have crashed over 1,000 cm3s. 546: https://www.cpdn.org/results.php?hostid=1368852 362: https://www.cpdn.org/results.php?hostid=1492772 93: https://www.cpdn.org/results.php?hostid=1489800 87: https://www.cpdn.org/results.php?hostid=1368870 |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
MacOS, all crashes: 97: https://www.cpdn.org/results.php?hostid=1493719 112: https://www.cpdn.org/results.php?hostid=1368870 138: https://www.cpdn.org/results.php?hostid=1437487 134: https://www.cpdn.org/results.php?hostid=1433467 94: https://www.cpdn.org/results.php?hostid=1478457 189: https://www.cpdn.org/results.php?hostid=1441240 FreeBSD, nothing but errors: https://www.cpdn.org/results.php?hostid=1523499 (a lot of Insufficient Memory/Stack Space Available!) |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,416,193 RAC: 15,520 |
More crashers: 1532165 and 1477807 (missing libraries); 1524953 (odd errors); 1517679 (Eric's), now up to 5889 failed tasks!!! |
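
On the questions raised above about automating this reporting and "blacklisting" machines with too many errors: the selection step might look roughly like the sketch below. This is only an illustration, not how the CPDN or BOINC server works. The `HostStats` record, the `flag_hosts` function, both thresholds and the sample host IDs are all made up for the example, and it assumes per-host counts of failed and valid tasks are available from somewhere.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Hypothetical per-host summary; not a real BOINC/CPDN data structure."""
    host_id: int
    failed: int
    valid: int


def flag_hosts(hosts, min_tasks=50, max_failure_ratio=0.95):
    """Return hosts that fail nearly everything, worst offenders first.

    min_tasks and max_failure_ratio are illustrative thresholds (assumptions),
    chosen so that a new host with only a few results is not flagged.
    """
    flagged = []
    for host in hosts:
        total = host.failed + host.valid
        if total < min_tasks:
            continue  # too few results to judge this host
        if host.failed / total >= max_failure_ratio:
            flagged.append(host)
    # Sort by absolute number of failures so the most wasteful hosts come first.
    return sorted(flagged, key=lambda h: h.failed, reverse=True)


if __name__ == "__main__":
    # Made-up sample data; the host IDs are placeholders, not real CPDN hosts.
    sample = [
        HostStats(host_id=1, failed=12000, valid=0),
        HostStats(host_id=2, failed=3, valid=400),
        HostStats(host_id=3, failed=10, valid=0),
    ]
    for host in flag_hosts(sample):
        print(host.host_id, host.failed)
```

A minimum task count is included so that a machine with a handful of early failures is left alone; what happens to a flagged host (an email to the owner, a reduced task quota, and so on) is a policy decision for the project and is not covered here.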
©2024 cpdn.org