Cross-project ID's question
Author | Message |
---|---|
Joined: 17 Feb 06 Posts: 89 Credit: 4,309,159 RAC: 0 |
This may be a naive question... Over the past week I have been suspending my cpdn project at night and then restarting it in the morning. This morning it restarted and six (of eight) tasks crashed prematurely... I restored the eight tasks from a backup and rebooted. BOINC restarted OK, but the BOINC Manager Event Log shows that a new cross-project ID was generated. Does anyone know if that new cross-project ID is temporary until the restored tasks catch up to where the failed tasks last reported to the server? I am one user on the same machine, so presumably I should have just one cross-project ID? Thanks. Event Log: Suspending network activity - user request; [climateprediction.net] project resumed by user; Resuming network activity; [climateprediction.net] update requested by user; [climateprediction.net] Sending scheduler request: Requested by user.; [climateprediction.net] Not requesting tasks: don't need (CPU: job cache full; NVIDIA GPU: no applications); [climateprediction.net] Scheduler request completed; [climateprediction.net] Generated new computer cross-project ID: 987c000809c9212e8f54e88cb97e3041 |
Joined: 1 Jan 07 Posts: 1061 Credit: 36,718,239 RAC: 8,054 |
Note that the wording is "Generated new computer cross-project ID". This is not the same thing as either the Computer ID shown for your computers on this web site, or the User CPID used by the cross-project statistics sites. So far as I can tell, the computer CPID is of no practical use and changes can be safely ignored. For peace of mind when restoring from backups, increase the <rpc_seqno> for the project(s) you are restoring in client_state.xml, to a figure greater than the "Number of times client has contacted server" shown in the computer details for the same machine - edit the file before you restart BOINC. On the other hand, I notice that you do have a new Computer ID: 1370014 shown on your account today. It has no tasks, but the strikingly-similar computer ID: 1362952 has 8 tasks in progress. If these are the same computer, you should check that your computer's client_state.xml has <hostid> 1362952. If not, re-restore the tasks, and correct both the <hostid> and the <rpc_seqno> as above before resuming computation. |
Joined: 17 Feb 06 Posts: 89 Credit: 4,309,159 RAC: 0 |
Thanks for the reply, Richard. I looked at the computers on my account and realised that this morning, before I restored the backup, I had updated the graphics drivers from 340.76 to 346.59. The restored client_state.xml was expecting 340.76 and when it saw 346.59 it must have thought this was a new machine and set up a new hostid automatically. Now that I have a reason, I am not too bothered if BOINC thinks it's on a new machine, given that I'll probably be upgrading to new Ubuntu versions in future. Before I read your post my machine had already contacted the server twice... and it's done a day's crunching, so I think it's probably safe to leave things as they are for the moment... but I'll keep an eye on things. Thanks for your help. |
Joined: 1 Jan 07 Posts: 1061 Credit: 36,718,239 RAC: 8,054 |
"Thanks for the reply Richard. I looked at the computers on my account and realised that this morning, before I restored the backup, I had updated the graphics drivers from 340.76 to 346.59. The restored client_state.xml was expecting 340.76 and when it saw 346.59 it must have thought this is a new machine and set up a new hostid automatically." No, please read what I said. It's not your changed graphics driver that matters, it's the <rpc_seqno>, or "Remote Procedure Call - sequence number". When you restored from backup, you must have re-imported an old sequence number. The BOINC server software sees that as an attempt at cheating (trying to boost Recent Average Credit by using two computers in parallel) and responds by assigning a new Host ID. Looking at your computers on the website, I see Computer ID ... Last contact - implying that the computer now running is using the wrong ID number. That will likely invalidate your running tasks if allowed to continue. Now that you've got this far, it might be easiest to merge the two host records so the tasks are assigned to the correct host before they are completed. |
Joined: 17 Feb 06 Posts: 89 Credit: 4,309,159 RAC: 0 |
Yes, that makes sense. So it's the <rpc_seqno> that raises a flag with the server... that confirms the point you made: "For peace of mind when restoring from backups, increase the <rpc_seqno> for the project(s) you are restoring in client_state.xml, to a figure greater than the 'Number of times client has contacted server' shown in the computer details for the same machine - edit the file before you restart BOINC." I'll try to remember that for the future. I decided to 'merge computers by name' from my web account panel; this took about 10 seconds and came back confirming what it had done. It listed some names from years gone by that I had forgotten about... :) Hopefully this has resolved the issue. Thanks for the advice. |
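
For anyone restoring from a backup later, the edit described above (raise <rpc_seqno>, and correct <hostid> if needed, before restarting BOINC) could be scripted roughly as follows. This is only a sketch, not an official BOINC tool. It assumes that each <project> block in client_state.xml carries <master_url>, <hostid> and <rpc_seqno> elements; the function name `bump_project_fields` and the sample XML are made up for illustration. BOINC should be stopped and a copy of the file kept before trying anything like this on a real client_state.xml.

```python
import re

# Hypothetical, cut-down stand-in for client_state.xml (assumed layout).
SAMPLE = """<client_state>
<project>
    <master_url>http://climateprediction.net/</master_url>
    <hostid>1370014</hostid>
    <rpc_seqno>12</rpc_seqno>
</project>
<project>
    <master_url>http://example.org/other/</master_url>
    <hostid>555</hostid>
    <rpc_seqno>7</rpc_seqno>
</project>
</client_state>
"""


def bump_project_fields(xml_text, project_url, new_seqno, host_id=None):
    """Return xml_text with <rpc_seqno> (and optionally <hostid>) replaced,
    but only inside the <project> block whose <master_url> contains project_url.
    Plain text substitution is used so the rest of the file is left untouched."""

    def fix_block(match):
        block = match.group(0)
        url = re.search(r"<master_url>(.*?)</master_url>", block, flags=re.DOTALL)
        if url is None or project_url not in url.group(1):
            return block  # a different project: leave it alone
        block = re.sub(r"<rpc_seqno>\s*\d+\s*</rpc_seqno>",
                       f"<rpc_seqno>{new_seqno}</rpc_seqno>", block)
        if host_id is not None:
            block = re.sub(r"<hostid>\s*\d+\s*</hostid>",
                           f"<hostid>{host_id}</hostid>", block)
        return block

    return re.sub(r"<project>.*?</project>", fix_block, xml_text, flags=re.DOTALL)


if __name__ == "__main__":
    # 500 stands in for a figure greater than "Number of times client has
    # contacted server" on the website; 1362952 is the host ID from this thread.
    print(bump_project_fields(SAMPLE, "climateprediction.net", 500, host_id=1362952))
```

On a real machine the text would be read from and written back to client_state.xml in the BOINC data directory, whose location depends on the installation.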
©2024 cpdn.org