Message boards : Number crunching : Computer wasting multiple models
Author | Message |
---|---|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
|
Send message Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
|
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Yes, I'd say both need the email. Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
All done, thanks. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
That's very strange. Anyway, all are done now. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Done, thanks. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Milo, there's something wrong with how your script minusses computers. When I look at the last computer list, some are minussed but some are not. I'm surprised that the script works in some cases but not others. I wonder whether your script is perfectly OK but perhaps the new Boinc server version isn't handling all computers' daily quotas correctly. If so, this problem could conceivably have the same root cause as the inability of some computers to fetch work. Perhaps your script's minussing works at the time but the server's Boinc undoes the minus value immediately or later. That's just my speculation. The email part of your script is definitely working; Marta's annoyance in the thread where these people are invited to post seems to indicate that she had received the misconfiguration email twice: the first time and then again when you redid her computer. Yet she's still got a daily quota of 1 and is still downloading, and of course crashing, more models. I'm going to trawl through the lists of computers I posted, going back in time to see when this problem of the script not always minussing computers started. If I were you I wouldn't try the script again yet. My list will be in the next post. Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Computers not minussed, by date
24 June
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=845926
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1074635
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1072200
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1061054
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1066771
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1075433
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1070985
20 June
Done twice by Milo's script:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=997756
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1055530
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1044825
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1034106 (this is Marta)
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1070985
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1067727
Done once only:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1063530
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1056839
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=951134
30 May - 1 June
I'm not checking Byron's list as I don't know which of them Milo minussed.
28 May
All still minussed (Milo reenabled certain computers from this and some subsequent lists. I've gone through our records to avoid listing these.)
26 May
All still minussed
18 May
All still minussed
14 May
All still minussed
22 April
All still minussed
21 April
All still minussed
20 April
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=896382
That's far enough back. I haven't listed computers that Milo reenabled. I can't identify the anonymous person (20 April) from our records.
There's certainly a difference in the server's treatment of daily quotas before and after the Boinc upgrade, but its strange new behaviour only affects certain computers. It may be, as Thyme Lawn is wondering in a moderator thread, that quota allocation now works by application type. We'll need to wait patiently until he has time to dig into the code to find out what it does. Cpdn news |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Thanks, Mo. There is indeed something amiss here; I've been checking as I run the 'minus' script and it does update the database, so I suspect that the server upgrade is behind this. As for the e-mails, they will have to continue, as we should offer people a means to seek assistance rather than disconnecting them without warning. As I mentioned above, those who cannot or do not want to seek assistance would be best off detaching non-working machines, as they will then not be e-mailed about them. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Thanks for checking the script even on such a gloriously sunny Saturday morning. I have a collection of links to quite a few more computers that crash all their models. I'll hold onto them for the time being to see whether the quota problem can be fixed before I post them. Really, the computers' owners are as you say fortunate to receive the offer of assistance in the email and it's better for CPDN to have fewer models crashed by misconfigured computers. Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Milo, now that you can again minus computers that waste lots of models I'm going to repost the IDs of computers you tried to minus earlier but couldn't. I'll omit any computers no longer crashing models. The following people should all have received your email without being minussed but took no action. So another email and a -1 quota seem very reasonable. 845926 1072200 1066771 1055530 1067727 1063530 Cpdn news |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Done - thanks. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Thanks, Milo. There's another group of computers I kept on hold while you were unable to minus them. They were getting the -226 code with the lockfile error. You sent them an email advising them to upgrade their Boinc to fix this. Unfortunately, many did not take your advice. I'm reporting members in this position who are still crashing models. They need the standard email this time and to be minussed. I have great sympathy with the last member in the list who did upgrade his Boinc, after which another error type emerged. But he does need further advice. 911520 1007769 1006227 1012950 1072223 1005628 1014568 228135 997848 Cpdn news |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Those are all done as well. |
Send message Joined: 13 Aug 05 Posts: 54 Credit: 117,227 RAC: 0 |
|
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
That's a disastrous workunit apart from one computer. We will have to let the last two members you listed (computers 1092307 and 1095034) try for longer to see whether they can fix the problems themselves, but if after two or three more weeks they still can't process anything Milo will have to minus them. (The last has dreadful error messages.) The first three should definitely receive the email in my opinion. Cpdn news |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
OK, the first three are done. |