Thread 'Misconfigured Machine?'

Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,423,691
RAC: 15,550
Message 62599 - Posted: 29 Jun 2020, 22:24:07 UTC

Three machines with loads of errors:

1325065
1421631
1504478

They seem to be giving file/directory missing errors.

Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,423,691
RAC: 15,550
Message 63256 - Posted: 4 Jan 2021, 0:09:30 UTC
Last modified: 4 Jan 2021, 0:12:19 UTC

Possible missing libraries on 1481742 and 1507580. They have crashed almost all the tasks they've had.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63312 - Posted: 11 Jan 2021, 17:32:22 UTC

I don't see any enforcement of this. The bad machines still pop up regularly on practically all of my work units.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63315 - Posted: 11 Jan 2021, 18:26:43 UTC - in response to Message 63312.  
Last modified: 14 Jan 2021, 15:04:21 UTC

I don't see any enforcement of this. The bad machines still pop up regularly on practically all of my work units.


I have stopped reporting misconfigured machines. I have asked Andy whether the decision not to do anything is permanent; if so, I will report back and lock the thread.
Please do not private message myself or other moderators for help. This limits the number of people who are able to help and deprives others who may benefit from the answer.

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,753,277
RAC: 7,654
Message 63406 - Posted: 24 Jan 2021, 16:59:18 UTC

These two are crashing 100% of WUs
https://www.cpdn.org/results.php?hostid=1484543
https://www.cpdn.org/results.php?hostid=1497751

Does it make sense to report machines, or am I just wasting my time?

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63407 - Posted: 24 Jan 2021, 17:09:24 UTC - in response to Message 63406.  

These two are crashing 100% of WUs
https://www.cpdn.org/results.php?hostid=1484543
https://www.cpdn.org/results.php?hostid=1497751

Does it make sense to report machines, or am I just wasting my time?


No reply from Andy yet. I will ask again.

KAMasud
Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 63411 - Posted: 25 Jan 2021, 11:00:28 UTC - in response to Message 61781.  

The original thread was started by Mo about 6 years ago.
I wondered myself, why here, but I never got around to asking her.

______________
Les, where is Mo?

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63412 - Posted: 25 Jan 2021, 13:44:46 UTC

I have emailed the moderators group, which includes both Andy and Sarah, to ask again, as the email to just Andy didn't get a reply. (He may be on leave.) As I said earlier, if a decision has been made not to stop tasks going to boxes without the 32-bit libraries, I will lock this thread.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 63417 - Posted: 26 Jan 2021, 17:08:45 UTC - in response to Message 63411.  

KAMasud

Mo dropped out of the project a few years ago.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63434 - Posted: 30 Jan 2021, 9:58:04 UTC

Andy has said he is interested in blocking at least the most egregious machines. Doing all the two- and four-core ones is probably too much work for the return, but given that some of the offenders blocked in the past have had sixty-four or more cores, it may get the number of hard fails on Linux down a bit. I will have a look at recent posts and send him a list.

KAMasud
Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 63438 - Posted: 31 Jan 2021, 4:42:34 UTC - in response to Message 63417.  

Oh, thanks.

KAMasud
Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 63439 - Posted: 31 Jan 2021, 5:07:04 UTC

Talking about misconfigured machines, I have a bit of a problem of my own: BOINC problems on a tenth-gen Intel i7. I tried to suspend CPDN so that I get no further work, but BOINC does not understand the term suspend; every day a new lot arrives. When I try to exit BOINC, it appears to exit, but Task Manager shows it is still running. I want all work to finish somehow so that I can boot into Safe Mode, uninstall BOINC and do a clean install, but BOINC is adamant. BOINC, at least on my machine, has started behaving like a virus, but no one else has reported anything similar. After twenty days of non-stop running, I rebooted and the WU errored out.
Is everyone 100% sure it is misconfigured machines, or does BOINC have COVID-19? At least in some cases, it seems so.
Now for a chatterbox friend of mine, Climate, Weather, Meteorology, etc. These things depend upon observations, not pure maths. There is a thread in the Cafe section where a friend of mine is sitting in a cave, contemplating his navel somewhere on a mountain top in Nepal. Join us there and please leave the rest alone.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63453 - Posted: 1 Feb 2021, 17:43:01 UTC

I have moved the discussion about the exit dialogue with Linux machines to the Linux section.

I have also given Andy a number of 16- and 8-core machines to stop them trashing lots of tasks. I did notice that a number of those listed in this thread seem to be inactive now. Presumably the users gave up rather than looking here.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 63473 - Posted: 3 Feb 2021, 0:14:44 UTC

It looks like things are about to change.
It could take a few months though, so don't forget to keep breathing. :)

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63479 - Posted: 3 Feb 2021, 6:47:21 UTC - in response to Message 63473.  

It looks like things are about to change.
It could take a few months though, so don't forget to keep breathing. :)


More specifically, Andy is likely to produce a script which will identify offending machines from the missing-library error messages, enabling work to those machines to be stopped.

This may affect machines new to CPDN and not just those whose owners believe all they have to do is set and forget. If, after initial failures, you install the missing 32-bit libraries and are still not getting work, you will need to let us know via the forums, rather than by sending private messages to the moderators, to get your boxes reinstated.

When this happens or is imminent, I or another moderator will post in both the number crunching and Linux sections of the fora.
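
For anyone curious what such a script might look like, here is a minimal sketch. It is not Andy's script and says nothing about how the CPDN server actually stores results. It assumes failed tasks carry the standard Linux dynamic-loader message ("error while loading shared libraries: ...") in their stderr, and that the results are available as (host ID, stderr text) pairs. The function name, the input format, the `min_failures` threshold and the sample data are all made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: failed tasks contain the standard Linux dynamic-loader message,
# e.g. "<prog>: error while loading shared libraries: libfoo.so.1: cannot
# open shared object file: No such file or directory".
MISSING_LIB = re.compile(
    r"error while loading shared libraries: ([^\s:]+): cannot open shared object file"
)


def find_misconfigured_hosts(results, min_failures=3):
    """Return {host_id: sorted missing library names} for suspect hosts.

    `results` is a hypothetical iterable of (host_id, stderr_text) pairs,
    one per failed task. A host is reported only once at least
    `min_failures` of its tasks show a missing-library message, so a
    single odd failure does not get a machine flagged.
    """
    failures = defaultdict(int)
    missing = defaultdict(set)
    for host_id, stderr_text in results:
        libs = MISSING_LIB.findall(stderr_text)
        if libs:
            failures[host_id] += 1
            missing[host_id].update(libs)
    return {
        host: sorted(missing[host])
        for host, count in failures.items()
        if count >= min_failures
    }


if __name__ == "__main__":
    # Made-up host IDs and stderr text, purely for illustration.
    loader_error = (
        "./model: error while loading shared libraries: "
        "libexample.so.1: cannot open shared object file: "
        "No such file or directory"
    )
    sample = [(1, loader_error)] * 3 + [(2, "some unrelated model failure")]
    print(find_misconfigured_hosts(sample))
```

The threshold is there so that a new machine with one or two early failures is not caught before its owner has had a chance to install the libraries; what value is sensible would be for the project to decide.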

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 64070 - Posted: 20 Jun 2021, 13:20:16 UTC

3878 errors. I am impressed by such dedication.
https://www.cpdn.org/results.php?hostid=1517479

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64071 - Posted: 20 Jun 2021, 14:23:59 UTC - in response to Message 64070.  

:(
email sent.

wateroakley
Joined: 6 Aug 04
Posts: 195
Credit: 28,593,722
RAC: 9,155
Message 64086 - Posted: 26 Jun 2021, 14:20:31 UTC
Last modified: 26 Jun 2021, 14:21:28 UTC


mngn
Joined: 13 Jul 18
Posts: 38
Credit: 62,933,508
RAC: 84,702
Message 64385 - Posted: 19 Aug 2021, 17:52:00 UTC - in response to Message 60141.  

100% errors
https://www.cpdn.org/results.php?hostid=1517679

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 64391 - Posted: 20 Aug 2021, 15:01:57 UTC - in response to Message 64385.  

100% errors
https://www.cpdn.org/results.php?hostid=1517679

E-mail sent...