Message boards : Number crunching : Computer wasting multiple models
Author | Message |
---|---|
Send message Joined: 5 Feb 05 Posts: 17 Credit: 1,582,791 RAC: 0 |
I found a few too (all with me on WU 6668123). These two seem to have all errors (zero CPU time, average turnaround time 0 days, avg. credit 0.00): host 1048960: 43 tasks; host 922470: 850 tasks. These two produce lots of errors (with zero CPU time), but once in a while they report a task successfully: host 991013: 867 tasks; host 1042139: 75 tasks. |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
Thanks for the report Starfire. These two produce lots of errors (with zero CPU time), but once in a while they report a task successfully: The task list is ordered by task id, but that doesn't reflect the order in which tasks are sent out, because CPDN creates large batches of work which are sent out randomly. It looks like the user fixed the problem with that one on 11th February. host 1042139: 75 tasks Similarly, this one was fixed on 3rd February. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 5 Feb 05 Posts: 17 Credit: 1,582,791 RAC: 0 |
The task list is ordered by task id but that doesn't reflect the order tasks are sent out Sorry, I completely forgot about that :( I'll check the time frame more closely in the future. |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
Here are three that seem to guzzle WU. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1044952 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=221382 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=793420 Quite happy to spend some time going through more if that's what's needed/wanted. |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Here are three that seem to guzzle WU. Thanks - I'd already caught one of those but not the other two, and they have now been dealt with. If a host's max_results_day is set to "-1" then it means that host has been dealt with. When they're fixed I'll set that back to a proper value. |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
Here are three that seem to guzzle WU. Thanks for the info. Last 5 for today. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=956437 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=950613 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1054067 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1026816 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1053686 |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,903,241 RAC: 2,063 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
All done - thanks, Iain. |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
A few more: The first computer seems to have 60 WU in progress and the rest detached. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1001846 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=872876 http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1040010 There seem to be a lot of crunchers who have only downloaded, say, 10 or fewer WU, but all have failed. Do you want us to report these as well? This one for example: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1060272 |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Martin. We shouldn't report computers to Milo for failures over just a short period. For example, the computer #1060272 you asked about had a series of download errors with HadAM3P over two days, but there was something wrong at the server end and everybody was getting download failures with HadAM3P. Now that Milo's found a fix, the computer's crunching and has trickled. Confusingly, the task page for that computer lists some of these download failures correctly but lists other identical failures as Error, so one doesn't see immediately what's going on. We also need to give people time to notice the problem and try to sort it out. So we're reporting computers that are trashing models bigtime, longterm. BTW, if you type [ url]paste web address here[/url] (leaving out the space inside the first tag) you'll find your post contains a live link. Cpdn news |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
Thanks for the reply Mo.v - it's more laziness than ignorance when it comes to formatting posts. I'll make them clickable in the future though. Just to say http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1001846 has wasted over twenty more WU since I pointed him out in my last post on this thread. I do know you're busy, and I don't want to nag, but I thought it worth mentioning. :) |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Thanks for the reply Mo.v - it's more laziness than ignorance when it comes to formatting posts. I'll make them clickable in the future though. I put the 6 back in your URL in this reply. I clicked on it before and came up with an AMD "Pentium" PC from 2005. :) That is a strange host as it is downloading, but never erroring or returning results. I wonder what the task list looks like in BOINC Manager. Not that the owner ever looks at it obviously... |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
I really haven't a clue what's going on with the internal links on this forum, but my last post refers to a member called "marquexa". When I click on my link I get "dolce". I'm confused, as the link is the same as in my original post. FWIW, the clickable "this post" url tags also didn't work for me. Jeez, you wouldn't think I've been online for the better part of 15 years given the state of my posts :) |
Send message Joined: 31 Aug 04 Posts: 42 Credit: 547,031 RAC: 0 |
Thanks geophi. Please ignore my last post. I'll keep taking the pills and have a few more early nights... Ho hum... |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
I've done "marquexa" as the host mentioned is indeed a bit odd. I have also hacked the script that e-mails the owners of dodgy hosts so that it automatically cuts them off, to save me a bit of time. If they report on the relevant thread that they've performed the steps asked of them to fix their host then I'll remove the block manually. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
It would be better if no more models were sent to computer 984556 977058 1032269 1039628 1048252 1048246 1048245 1039633 1039629 1039641 The single owner of all the second group has already received an email about another of his computers but has not yet solved the problem or asked for advice. I've created a Trac ticket in the hope that, if it's implemented, a lot of red crash messages in the BOINC Manager would alert a few more members with problem computers to the need for action. Cpdn news |
Send message Joined: 28 Nov 06 Posts: 89 Credit: 12,124,839 RAC: 4,320 |
One typical SABOTEUR - 1006007. |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
I've done another pass on the machines mentioned above. |
Send message Joined: 28 Nov 06 Posts: 89 Credit: 12,124,839 RAC: 4,320 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
All done, thanks. |
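
The reporting rule discussed in this thread (order tasks by when they were sent rather than by task id, ignore hosts that have since been fixed, and only flag hosts that fail in bulk over a long period) could be sketched roughly as below. This is only an illustration, not the project's actual script: the `Task` record, the function names, and the thresholds `min_failures` and `min_span_days` are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Task:
    """Hypothetical record for one task sent to a host."""
    sent: date        # date the task was sent out (not the task id order)
    succeeded: bool   # True if the task was reported successfully


def trailing_failures(tasks):
    """Return the failures since the host's most recent success, newest first."""
    ordered = sorted(tasks, key=lambda t: t.sent)
    run = []
    for task in reversed(ordered):
        if task.succeeded:
            break  # a later success suggests the owner has fixed the host
        run.append(task)
    return run


def is_wasting_models(tasks, min_failures=20, min_span_days=14):
    """Flag a host only if it fails in bulk AND over a long period.

    Both thresholds are illustrative guesses, not values from the thread.
    """
    run = trailing_failures(tasks)
    if len(run) < min_failures:
        return False
    span_days = (run[0].sent - run[-1].sent).days
    return span_days >= min_span_days


# Example: one failure per day for 30 days, never a success.
start = date(2010, 1, 1)
long_term_bad = [Task(start + timedelta(days=i), False) for i in range(30)]
print(is_wasting_models(long_term_bad))
```

A host with a short burst of failures (for instance, two days of server-side download errors) or one whose most recent task succeeded would not be flagged by this sketch, which matches the advice above to give owners time to notice and fix the problem.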