Message boards : Number crunching : XML stats problems
Message board moderation
Previous · 1 · 2 · 3 · Next
Author | Message |
---|---|
Send message Joined: 7 Aug 04 Posts: 2186 Credit: 64,822,615 RAC: 5,275 |
When looking at my hosts on boinccstats, any that I started in 2005 are not listed, while all those from 2004 are. |
Send message Joined: 12 Aug 04 Posts: 36 Credit: 488,399 RAC: 0 |
Is the host problem going to be resolved? <img src="http://www.boincstats.com/stats/project_graph1.php?pr=cp&table=hosts"> BOINCstats |
Send message Joined: 7 Aug 04 Posts: 33 Credit: 215,841 RAC: 0 |
I don't know how but I just got 1300 credit for something I havn't done. 1000 credit off an WU that I only just started and hasn't trickled once, and 200 off another one. What's going on? |
Send message Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
Sounds like the misallocated WU problem See <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2577">this thread</a> Don't really want one WU being processed by more than one host. Since someone else has got further, it would be sensible to abort these WUs before you get too far. |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
Yes, it seems that the "fluke" I saw in the host XML has persisted. The problem is an invalid character in the host XML file. I believe one host has some illegal data in its p_vendor field. See this copy/paste from the XML: <host> <id>82994</id> <userid>19151</userid> <total_credit>567.105002</total_credit> <expavg_credit>0.058292</expavg_credit> <expavg_time>1112096428.038390</expavg_time> <p_vendor>^B</p_vendor> <p_model>Pentium</p_model> ... Notice the ^B inside the p_vendor tags. This is actually an undisplayable control character (the hex value of the byte is 02) and it causes any standards compliant XML parser to stop parsing and throw an error. I emailed tolu about it las week. On Tuesday db_dump was upgraded to the latest version however this still didn't fix the problem. Now it would seem that db_dump is turned off completely. I'm pretty sure all that is needed is a one-line SQL statement to change the p_vendor field of host number 82994 to something reasonable like 'Intel' (since it is a Pentium...). I replied to tolu on Tuesday night but haven't heard back. Someone care to poke them with a pointy stick? :) |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
According to <a href="http://www.climateprediction.net/board/viewtopic.php?t=2811&postdays=0&postorder=asc&start=30"> this</a> post by Hannah, Tolu has been given time off for good behavour. And about time too; he's earned it. Some of the moderators are looking for an alternate email address to contact Neil. In the meantime, good work Toby. This may be the problem. Or part of it, if there are more of these in the file. This is the top of the BOINCstats list for "users with 1 computer" before the recent problems: (based on my appearance as a 1 computer person) 1 Michael Siemon 93,308 2 quimillo 88,128 3 chejlava 87,542 4 Les Bayliss 86,087 5 Fred Pafford 84,801 6 Oddbits 83,629 7 David Wood 83,308 8 Timothy Pagen 78,960 9 Judd Goldberg 78,828 10 Bob Pauls 75,746 11 Cadgod 74,593 12 GwydionT 74,480 13 Quixote 74,404 14 Will 72,892 15 jckcpdn 72,079 16 Alex_Van_den_Broeck 71,625 17 ninesouls 71,002 18 Ian Mordant 69,395 19 poolball 67,864 20 Marcel Portanger 67,504 21 Magnus Kessler 67,372 22 neil_nn11 67,032 23 Bruce Boytler 66,937 24 Kai 65,576 25 Kricke 65,179 ******************************************* And this is how it is now: 1 Mark2112 668,031 2 old511 250,849 3 DigitalDarkness 219,904 4 davewave 207,239 5 Overseer 152,702 6 Alan J. Simpson 150,661 7 josti 144,876 8 Miroslav Deml 144,631 9 Ian Coole 142,854 10 Alan Lund 134,555 11 chejlava 128,846 12 Kernunos 127,977 13 Matthias Essenburger 124,045 14 Grenadier 120,926 15 KWSN - 3 119,905 16 KeeperC 119,489 17 Jerome de Nazareth 118,033 18 quimillo 115,444 19 Les Bayliss 114,442 20 Oddbits 113,875 21 Judd Goldberg 112,570 22 Jean-David Beyer 112,457 23 David Wood 110,434 24 CRAIC 109,565 25 piet 108,147 ******************************************* (It doesn't format well!) Most of those above me now, are actually multi computer people, (based on CPDN stats), but BOINCstats only lists one of their computers. The credits listed are for their totals, but the credits on their host stats page is for just the listed computer. This is the only list where I was doing OK. Sigh. Les |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
*sigh - still nothing. I have submitted a report via their web form as well. Still no response from that. Anyone alive over there? The solution is really simple... one line of SQL typed into a mysql console: update host set p_vendor = 'Intel' where id = 82994; And then I guess someone needs to figure out how to restart db_dump. Anyone happen to have the CPDN root password laying around? :) |
Send message Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
I feel for you Toby. I really want to see stats, also. I am starving to see where the balls are all bouncing, etc. Knock, knock, is anyone out there? Do we need to make a project to find the people on earth who can fix these things? (-; <img src="http://seti.mundayweb.com/stats.php?userID=344&trans=off"> <img src="http://boinc.mundayweb.com/one/stats.php?userID=584&trans=off"> |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
> We are here but we are incredibly busy. Actually we = I while Tolu is away > :) > I've updated the mysql db as Toby suggested. I'm a little worried about how > this occured and that there may be more unprintable characters in the db. > Unfortunately I haven't figured out how to query for unprintables yet :) If you restart db_dump and give us some new XML I'll be able to tell you in about 2 minutes if there are any more bad characters :) db_dump was modified back in February to automatically filter out bad characters in most text fields such as user/team name as well as some host fields. I guess not all of the host fields were included - I might have to take that up on the stats mailing list. Of course this only fixes it on the XML side of things and as you said, random strange characters showing up in the DB are a bit worrysome. However if it IS just this one host, it could be that something went wrong on the host and the p_vendor field got some bad data in it from the client and the server just accepted it without question. As far ast the database is concerned, there is nothing wrong with the character. It is just when it gets dumped into XML that it becomes a problem since XML has a specific character set that is allowed. |
Send message Joined: 31 Aug 04 Posts: 239 Credit: 2,933,299 RAC: 0 |
> I've updated the mysql db as Toby suggested. I'm a little worried about how > this occured and that there may be more unprintable characters in the db. > Unfortunately I haven't figured out how to query for unprintables yet :) I am sure you tried this ... You should be able to make your WHERE clause to be WHERE column LIKE '%?%' where the ? mark is replaced by the bad character. you may have to use the character conversion to create the bad character. Of course, it would be easier if I knew what the bad character looked like ... |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
> Of course, it would be > easier if I knew what the bad character looked like ... That is the problem... the character in question doesn't "look" like anything and can't be typed on a keyboard. It was a reserved control character. The byte had a hex value of 02 in the XML which is a "start of text" character. I'm not sure if you can even query mysql on non-printable characters... It would have to give you a way to enter the ASCII/hex codes for the characters. Might be possible but I have never done it. |
Send message Joined: 12 Aug 04 Posts: 36 Credit: 488,399 RAC: 0 |
> I'm not sure if you can even query mysql on non-printable characters... It > would have to give you a way to enter the ASCII/hex codes for the characters. > Might be possible but I have never done it. > > You can do something like (PHP example): $sql = "SELECT * FROM table WHERE column LIKE '%".chr(02)."%'"; But I'm sure the developers are much better with SQL then I am. <a href="http://www.boincstats.com/"><img src="http://www.boincstats.com/stats/banner.php?cpid=a4c31845a9413499815907bae59136ac" border="0"> <img src="http://www.boincstats.com/stats/teambanner.php?teamname=BOINCstats" border="0"></a> |
Send message Joined: 16 Apr 05 Posts: 7 Credit: 13,526 RAC: 0 |
Just checked BOINC stats and................. Climate has been updated today!! Well done all concerned especially Toby. <img src="http://www.boincstats.com/stats/banner.php?cpid=dff6f0f73ac52826ebac01d8716dc4a8"> It's not the speed, but the quality - Until I get a faster computer |
Send message Joined: 31 Aug 04 Posts: 239 Credit: 2,933,299 RAC: 0 |
CHAR(N,...) CHAR() interprets the arguments as integers and returns a string consisting of the characters given by the code values of those integers. NULL values are skipped. mysql> SELECT CHAR(77,121,83,81,'76'); -> 'MySQL' mysql> SELECT CHAR(77,77.3,'77.3'); -> 'MMM' from the docs... if it is failing at an extract, you could create a "scan" using a similar query then see if there are any records that dont come out. if you have 50 records and only5 come out you know you have a problem, then create a minus query whre the ones that extract are taken from the full set and that will give you the list of the ones that have to be looked at ... I am poking in the dark cause I can't look at the queries or the table with the data ... |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
Wohoo! Did tolu come back or did someone else finally find the right button to push? :) There is going to be a big jump in my history graphs in about 7 hours! Now for the bad news: The hosts.xml file is even more damaged now :/ It *LOOKS* like the same problem seti@home experienced just a little while ago. It was discussed on the boinc_stats mailing list. Read here if you wish: http://www.ssl.berkeley.edu/pipermail/boinc_stats/2005-May/000095.html The problem in their case was that a user was deleted from the user table. This should <b>*NEVER*</b> <b><i>*EVER*</i></b> be done! Doing so leaves orphaned records in other tables which can cause problems like this and are very hard to track down. In seti's case the user was deleted and then db_dump went to look up the owner of a host to determine wether or not to put the <userid> tag in or not (if the user has his hosts hidden, it is not put in). This lookup failed and caused the host record to be terminated prematurely with no closing </host> tag which will also completely hose most XML parsers. Here is an example from the CPDN XML file: <host> <id>79</id> <host> <id>83</id> <userid>57</userid> <total_credit>24385.515355</total_credit> ... The record for host #79 is obviously truncated. There are a total of 34 hosts which have this problem. Here is a list and the command I used to get it from the XML (email might be better for this but since I'm not getting any response from anyone on the team...): grep -B 1 "<host>" host.xml | grep -v \\-\\- | grep -v "</host>" | grep -v "<host>" <id>79</id> <id>294</id> <id>549</id> <id>692</id> <id>2174</id> <id>2184</id> <id>3089</id> <id>3706</id> <id>4569</id> <id>4684</id> <id>6248</id> <id>6881</id> <id>7273</id> <id>7406</id> <id>7773</id> <id>8425</id> <id>8495</id> <id>8926</id> <id>9398</id> <id>10088</id> <id>10623</id> <id>11839</id> <id>13726</id> <id>14043</id> <id>15016</id> <id>18664</id> <id>19690</id> <id>20010</id> <id>20423</id> <id>20659</id> <id>21136</id> <id>21367</id> <id>21416</id> <id>21459</id> As to how to fix this... (*IF* this really is the problem - somene with access to the DB will have to verify) Deleting the offending hosts is probably not the way to go as that may cause other problems. Dr. Anderson did change db_dump.C to fix this problem but I'm not sure if it is in the public branch yet so you may have to get it from the development branch. You will want version 1.83 checked in on 5/12/05. You might also want to check with Dr. Anderson to double check... As I said, this is my theory since it looks the same as what happened to seti but I'm only on the outside looking in... And remember <b>NEVER</b> delete users! :) |
Send message Joined: 17 Aug 04 Posts: 56 Credit: 63,814 RAC: 0 |
*cough* anyone still there or is this project running itself while everyone is on vacation? The problem still remains... |
Send message Joined: 30 Aug 04 Posts: 142 Credit: 9,936,132 RAC: 0 |
Anyone else noticed how RAC is wrong on top team stats and btw on BOINC Synergy's stats as well? <a href="http://www.boincsynergy.com"><img src="http://www.boincsynergy.com/images/stats/308cpdn.png"></a> <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_text_search_form.php"> <b>SEARCH THE FORUM</b></a> |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes. It's RAC in general. <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2714"> Read this.</a> |
Send message Joined: 15 Aug 04 Posts: 57 Credit: 10,360,323 RAC: 1,102 |
It looks like theres a problem with the Total Credit now, my Stats haven't moved in the last 24 hours and I don't show any new Trickles turned in since about 10:30 AM yesterday even though I know I've turned in about 30-35 of them ... :? I looked at a few other peoples Accounts & they show the same thing, no trickles turned in since yesterday so it must be a server problem ... <img src="http://boinc.mundayweb.com/one/stats.php?userID=22"> |
©2024 cpdn.org