Maybe interrupted upload - lost result?

MichaelB

Joined: 13 May 05
Posts: 2
Credit: 3,002,003
RAC: 4,790
Message 14143 - Posted: 6 Jul 2005, 1:19:54 UTC

Short story:
I'm worried that I have inadvertently interrupted the upload of a work-unit's results. I believe I have the results archived safely, and want to know how to be sure that the CPDN project has actually received them.

Long story:
Due to a muddled introduction to CPDN, I have had two units running: one associated with "www.climateprediction.net" and another with "climateprediction.net". This seems messy, and I've been eagerly waiting for the work-units to complete so that I could detach from one URL and continue with one work-unit at a time.

I set my preferences to allow only 0.03GB disk usage to prevent another work-unit from being downloaded.

Last night, "www.climateprediction.net" (Result ID 843684) completed and appeared to finish uploading.

Six hours or so later, "climateprediction.net" (Result ID 844519) completed. Results did not upload, and error output showed that contact was deferred for about 23 hours.

I tarred up the directory for Result 843684 and moved it to a safe place, then ran "boin... -detach_project www.climateprediction.net" which completely removed that project directory. I still have the results safely archived, however.

Impatient with the deferral, I stopped boinc, edited client_state.xml to reset the next contact time to now+60 seconds, and restarted boinc. The second work-unit appeared to upload successfully, and a new work-unit was downloaded (and is now happily crunching away).

My account page, however, now lists only ResultID 844519 as having completed. ResultID 843684 is listed as still in progress, and I suspect I have interrupted a process that was not complete after all.

How can I complete the delivery of the work-unit output to the project?


Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2183
Credit: 64,822,615
RAC: 5,275
Message 14144 - Posted: 6 Jul 2005, 2:46:45 UTC

Normally if it uploads OK, the next trickle from that PC will update the status of the work unit on the server. I don't see any more trickles from that PC yet.
MichaelB

Joined: 13 May 05
Posts: 2
Credit: 3,002,003
RAC: 4,790
Message 14150 - Posted: 6 Jul 2005, 8:57:22 UTC - in response to Message 14144.  

That doesn't seem to be the answer: one more trickle, but no further update on my lost unit.

Will the work have been lost? The final trickle from that unit was within an hour or so of the work-unit completing.

(I'm not really interested in credits: I don't want to see six weeks of computing wasted :-( )


Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14151 - Posted: 6 Jul 2005, 13:04:31 UTC

The final trickle for result id 843684 hasn't been recorded, but that doesn't necessarily indicate a problem with the result.

Check your backup of the 2roe_400150486 directory. If the result has been successfully completed, it should have a flat structure containing 367 files: 2roe_400150486.xml, 365 *.zip files and lockfile.

Check if there are any 2roe_400150486_0_n.zip (n=1 to 5) files in your BOINC directory. If there are, the upload wasn't completed.

Check if any of the stdout files in the BOINC directory contain upload messages for the result files.
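
The three checks above could be scripted roughly as follows. This is only a sketch, not an official CPDN or BOINC tool: the function names, the default work-unit name, the "stdout*" log file pattern and the test for the word "upload" in a log line are all assumptions made for illustration, and the real log file names and message wording may differ.

```python
from pathlib import Path
import re


def check_backup(backup_dir, workunit="2roe_400150486"):
    """Summarise a backed-up result directory (hypothetical helper)."""
    backup = Path(backup_dir)
    entries = list(backup.iterdir())
    files = [p for p in entries if p.is_file()]
    return {
        # "flat" means the directory holds no subdirectories
        "flat": all(p.is_file() for p in entries),
        "total_files": len(files),
        "zip_files": sum(1 for p in files if p.suffix == ".zip"),
        "has_xml": (backup / f"{workunit}.xml").is_file(),
        "has_lockfile": (backup / "lockfile").is_file(),
    }


def looks_complete(summary):
    """True if the summary matches the layout described above:
    367 files in total, of which 365 are zips, plus the xml and lockfile."""
    return (
        summary["flat"]
        and summary["total_files"] == 367
        and summary["zip_files"] == 365
        and summary["has_xml"]
        and summary["has_lockfile"]
    )


def pending_uploads(boinc_dir, workunit="2roe_400150486"):
    """List <workunit>_0_n.zip files (n = 1 to 5) still in the BOINC directory."""
    pattern = re.compile(rf"^{re.escape(workunit)}_0_[1-5]\.zip$")
    return sorted(
        p.name
        for p in Path(boinc_dir).iterdir()
        if p.is_file() and pattern.match(p.name)
    )


def find_upload_messages(boinc_dir, workunit="2roe_400150486", log_glob="stdout*"):
    """Return log lines naming the work unit and containing 'upload'.
    The glob and the keyword are assumptions, not a documented format."""
    hits = []
    for log in sorted(Path(boinc_dir).glob(log_glob)):
        if not log.is_file():
            continue
        for line in log.read_text(errors="replace").splitlines():
            if workunit in line and "upload" in line.lower():
                hits.append(line.strip())
    return hits
```

For example, `looks_complete(check_backup("/path/to/backup"))` should be true for a finished result, and a non-empty list from `pending_uploads("/path/to/BOINC")` would suggest the upload did not finish. Both paths are placeholders.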
old_user56771

Joined: 23 Feb 05
Posts: 1
Credit: 14,192
RAC: 0
Message 14231 - Posted: 9 Jul 2005, 13:44:11 UTC - in response to Message 14151.  
Last modified: 17 Jul 2005, 20:56:11 UTC

> Check your backup of the 2roe_400150486 directory. If the result has been
> successfully completed it should have a flat structure containing 367 files -
> 2roe_400150486.xml, 365 *.zip files and lockfile.
>
> Check if there are any 2roe_400150486_0_n.zip (n=1 to 5) files in your BOINC
> directory. If there are the upload wasn't completed.

I have a similar problem. I checked the backup directory: there are no "_0_n.zip" files, and according to the logs the upload completed successfully. The first trickle from the next model completed three days ago and the Outcome is still Unknown...

UPDATE: After the second trickle from the next model, the Outcome has changed to Success :)