climateprediction.net home page
2 systems reporting to 1 in your DB.

old_user45728

Joined: 29 Jan 05
Posts: 15
Credit: 2,316,176
RAC: 0
Message 11030 - Posted: 17 Mar 2005, 13:28:22 UTC

Hello,
Somehow I managed to get 2 systems reporting to 1 "computer" in your database. Computer IDs 132554 and 132996 are both tallying credits to 132996.

I know how this happened but I don't know how to fix it. 132554 is 87% done and I don't want to reinstall BOINC until the WU is done, in about 4 days.

Any thoughts?
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 11051 - Posted: 17 Mar 2005, 22:28:07 UTC

You have your computers hidden, and without some idea of what happened, it is unlikely anyone can help.

1) I would suggest that you suspend BOINC, select the entire folder, and then copy it to another area of your HD.
This will help if something goes wrong. Next, do the reverse of whatever you did before.

or

2) Wait until the current model finishes, detach 132554 from the project, then start again.

Someone else may have other ideas if you wait long enough.

Les
old_user45728

Joined: 29 Jan 05
Posts: 15
Credit: 2,316,176
RAC: 0
Message 11052 - Posted: 17 Mar 2005, 23:58:46 UTC - in response to Message 11051.  

Thanks Les, I will take your suggestion after the current model finishes.

I've un-hidden my computers. It seems the credit for 132554 is going nowhere, it's always zero...oh well.

Thanks again,

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 11064 - Posted: 18 Mar 2005, 8:14:55 UTC
Last modified: 18 Mar 2005, 8:16:07 UTC

Another thought since my last post.

There is a bug in the server software which occasionally allocates extra credits when a model finishes.
Credits are only supposed to be allocated on each trickle. When the cause couldn't be found, a program was written
which runs every 4 hours or so and re-allocates credits from another source (trickles? I've forgotten which).
So if you can separate your two computers, then when this program runs it may give you the correct values for each.
It will depend on where the data for the fix comes from.

******************

Before you do anything else, I have a couple more ideas.
(BTW: detaching will kill the 2nd model on that computer.)

Look at the cross-project ID on each of the 2 computers and see if they are the same.
I'm flying blind here, as I only have one computer.
Is it the same on all your computers?
If only the 2 in question are the same, then it may be possible to edit the xml file.
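
For anyone who would rather script the comparison than read the files by eye, here is a rough Python sketch. It assumes each machine's client_state.xml parses as well-formed XML, and the tag names `host_cpid` and `hostid` are my assumption about where the identifiers live; check what your own files actually contain. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed tag names; adjust to whatever your client_state.xml really uses.
DEFAULT_TAGS = ("host_cpid", "hostid")


def find_values(state_path, tags=DEFAULT_TAGS):
    """Return {tag: [text, ...]} for every element with one of the given tags."""
    root = ET.parse(state_path).getroot()
    return {
        tag: [el.text.strip() for el in root.iter(tag) if el.text]
        for tag in tags
    }


def shared_values(path_a, path_b, tags=DEFAULT_TAGS):
    """Return {tag: sorted values that appear in both files}."""
    a = find_values(path_a, tags)
    b = find_values(path_b, tags)
    return {tag: sorted(set(a[tag]) & set(b[tag])) for tag in tags}
```

Copy the client_state.xml from each machine to one place and call `shared_values` on the two copies. Any value that shows up for both machines is a candidate for the identifier the server is confusing.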

But first, something simpler. And sneaky.
I had a little accident last year, and dragged the closed BOINC folder down a bit.
Everything seemed to still work, and I was busy, so it was the next day when I got around to checking up.
MOST OF THE FILES IN BOINC WERE MISSING! Found them in a folder just below, and tried to move them back.
But some were already there! It seems that they got re-created.
These were: client_state, client_state_prev, global_prefs, sched_request, and sched_reply.

So. When you're ready to fiddle, after making a backup of the BOINC folder, Suspend BOINC, make a temp folder,
and MOVE the client and sched files to it.
Then let BOINC run again and see if you get a fixed version of the files. You may have to Update for this.
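
The backup-then-move step could be sketched in Python roughly as below. This is only an illustration: the function name and the three folder arguments are hypothetical, the file-name patterns are guessed from the list of files above, and BOINC must already be suspended or shut down before running it. Keep the backup and temp folders outside the BOINC folder.

```python
import shutil
from pathlib import Path

# Patterns guessed from the "client and sched files" named in the post.
PATTERNS = ("client_state*", "sched_request*", "sched_reply*")


def backup_and_set_aside(boinc_dir, backup_dir, temp_dir, patterns=PATTERNS):
    """Copy the whole BOINC folder to backup_dir, then move the state files to temp_dir."""
    boinc_dir = Path(boinc_dir)
    temp_dir = Path(temp_dir)

    # Full backup first; copytree raises an error if backup_dir already exists.
    shutil.copytree(boinc_dir, backup_dir)

    temp_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for pattern in patterns:
        for f in sorted(boinc_dir.glob(pattern)):
            if f.is_file():
                shutil.move(str(f), str(temp_dir / f.name))
                moved.append(f.name)
    return moved
```

The returned list shows which files were set aside, so they can be moved back by hand if the regenerated versions turn out no better.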

If not, you'll have to operate. Somewhere there is a line or two which tells the system which computer it is.
I'm hoping it's the CPID.
I'm also hoping that it may be possible to alter the one which is wrong.

I think it may be best to try this BEFORE upgrading, as you won't have the 'signed' bits to deal with.

Post back when you've had a look at the CPIDs.

Les

Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 11065 - Posted: 18 Mar 2005, 9:12:06 UTC - in response to Message 11052.  

> I've un-hidden my computers. It seems the credit for 132554 is going nowhere,
> it's always zero...oh well.

Something very strange has happened here. Your trickles are being recorded against host 132554 but the results (and hence the credits) are registered to host 132996.

132554 has been merged at least twice (previous hosts 112644 and 131945) and has no results registered against it, which should mean that no work has ever been downloaded to that system.

It looks like 132996 has also been merged at least twice (previous hosts 112645 and 113057) and has 37 results registered against it, including those run by all incarnations of 132554.

The systems have different CPU speeds so there's no way they can ever be the same host. Did you download work to one system and then manually transfer it over to the other?

As things stand, all the work you currently have on host 132554 is registered (and going to be credited) to 132996. 132554 will only start getting credits for itself when it downloads its first workunit.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
old_user45728

Joined: 29 Jan 05
Posts: 15
Credit: 2,316,176
RAC: 0
Message 11073 - Posted: 18 Mar 2005, 12:31:08 UTC - in response to Message 11065.  

> Did you download work to one system and then manually transfer it
> over to the other?

Well, actually yes, yes I did...sort of. In the process of building multiple systems, we build one system with a mirrored system disk (all of ours have mirrored system disks). When that OS is built with everything set as we want it, we break the mirrored set, pop a new drive in and re-mirror the set. Now we can use the drive we took out as a system disk in a new machine, instantly building the new OS, and so on for n = the number of machines to build. The mirroring takes about 2 minutes; building the whole OS takes a few hours. We bounce the machine into WORKGROUP, change the name, re-join it to the domain and we're done. So, that's the long version of "manually transfer it".

When I reinstalled BOINC on the "new" machine, the BOINC uninstaller did NOT remove EVERYTHING. So when I installed it, BOINC took off, started with the original files and claimed the original computer in CPDN. By the time I knew what had happened and stopped BOINC, it was too late. I should have kept it off the network until I was done.

I have a model running on the "bad" machine, it's over 90% done. Once it's uploaded I'll detach from the project and re-attach. I'm hoping that will create a new computer in CPDN. Interestingly, this "bad" machine, 132554, shows up on CPDN for me with the option to "delete this host"...no others do.


Keck_Komputers

Joined: 5 Aug 04
Posts: 426
Credit: 2,426,069
RAC: 0
Message 11074 - Posted: 18 Mar 2005, 14:10:52 UTC
Last modified: 18 Mar 2005, 14:14:09 UTC

@N
When you create your mirror, install BOINC but do not run it. Copy the account*.xml file(s) from working computers to the mirror. That way, when you start up the new computer(s) generated from the mirror, each one will generate its own host record and not get the servers confused.

[edit] If this is not possible due to setting up as a service or whatever, delete the client_state.xml file and projects directory before the copy. You can also delete the slots directory but this should not be needed since BOINC should regenerate it as it runs.
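
As a rough illustration of that clean-up on the master image, something like the Python sketch below would do it. The function name and the `remove_slots` switch are made up for the example, and it assumes client_state.xml, projects and slots sit directly inside the BOINC folder. It throws away any work in progress, so only run it on the disk that is about to be cloned, with BOINC stopped.

```python
import shutil
from pathlib import Path


def strip_host_identity(boinc_dir, remove_slots=False):
    """Delete client_state.xml and the projects directory (optionally slots too)."""
    boinc_dir = Path(boinc_dir)
    removed = []

    state = boinc_dir / "client_state.xml"
    if state.is_file():
        state.unlink()
        removed.append(state.name)

    # slots is optional, since BOINC should regenerate it as it runs.
    dir_names = ["projects"] + (["slots"] if remove_slots else [])
    for name in dir_names:
        d = boinc_dir / name
        if d.is_dir():
            shutil.rmtree(d)
            removed.append(name)
    return removed
```

The account*.xml files are left alone, so each clone should still attach to the same account while registering as a new host.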
BOINC WIKI

BOINCing since 2002/12/8
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 11078 - Posted: 18 Mar 2005, 15:20:47 UTC

N,
This is a bit beyond me. Best if you follow what Thyme Lawn and JKeck advise.
Hope it works out OK.

Les