Transition from hadsm 4.12 to 4.13 failed

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 14628 - Posted: 24 Jul 2005, 15:26:39 UTC

Running BOINC 4.45 and slab. Last model from 4.12 hadsm finished overnight, but it would not start 4.13 hadsm without errors.

<b>From stderrdae.txt when it tried to download a new model</b>
2005-07-23 14:36:44 [climateprediction.net] Started download of hadsm3_4.13_windows_intelx86.exe
2005-07-23 14:36:44 [climateprediction.net] Started download of hadsm3data_4.13_windows_intelx86.zip
2005-07-23 14:46:47 [climateprediction.net] Temporarily failed download of hadsm3_4.13_windows_intelx86.exe: -182
2005-07-23 14:46:47 [climateprediction.net] Backing off 1 minutes and 0 seconds on download of file hadsm3_4.13_windows_intelx86.exe
2005-07-23 14:46:47 [climateprediction.net] Temporarily failed download of hadsm3data_4.13_windows_intelx86.zip: -182
2005-07-23 14:46:47 [climateprediction.net] Backing off 1 minutes and 0 seconds on download of file hadsm3data_4.13_windows_intelx86.zip
2005-07-23 14:46:47 [climateprediction.net] Started download of hadsm3se_4.13_windows_intelx86.zip
2005-07-23 14:46:47 [climateprediction.net] Started download of hadsm3um_4.13_windows_intelx86.zip
2005-07-23 14:46:52 [climateprediction.net] Finished download of hadsm3se_4.13_windows_intelx86.zip
2005-07-23 14:46:52 [climateprediction.net] Throughput 187880 bytes/sec
2005-07-23 14:46:52 [climateprediction.net] Started download of 129e_100070105.zip
2005-07-23 14:46:53 [climateprediction.net] Finished download of 129e_100070105.zip
2005-07-23 14:46:53 [climateprediction.net] Throughput 17371 bytes/sec
2005-07-23 14:46:55 [climateprediction.net] Finished download of hadsm3um_4.13_windows_intelx86.zip
2005-07-23 14:46:55 [climateprediction.net] Throughput 302379 bytes/sec
2005-07-23 14:47:48 [climateprediction.net] Started download of hadsm3_4.13_windows_intelx86.exe
2005-07-23 14:47:48 [climateprediction.net] Started download of hadsm3data_4.13_windows_intelx86.zip
2005-07-23 14:47:48 [climateprediction.net] Unrecoverable error for result 129e_100070105_1 (app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>hadsm3_4.13_windows_intelx86.exe</file_name>
<error_code>-200</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>hadsm3data_4.13_windows_intelx86.zip</file_name>
<error_code>-200</error_code>
<error_message></error_message>
</file_xfer_error>
)
2005-07-23 14:47:49 [climateprediction.net] Deferring communication with project for 58 seconds

<b>Then, after the hadsm 4.12 model finished, the following errors were thrown</b>
2005-07-24 02:40:54 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:40:55 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:40:55 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:40:56 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:40:56 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:40:56 [climateprediction.net] Unrecoverable error for result 11vh_100069599_1 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))
2005-07-24 02:40:56 [climateprediction.net] Deferring communication with project for 57 seconds
2005-07-24 02:42:50 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:42:50 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:42:51 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:42:52 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:42:52 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:42:52 [climateprediction.net] Unrecoverable error for result 1283_100070057_1 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))
2005-07-24 02:42:52 [climateprediction.net] Deferring communication with project for 57 seconds
2005-07-24 02:43:54 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:43:54 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:43:55 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:43:56 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:43:56 [climateprediction.net] CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
2005-07-24 02:43:57 [climateprediction.net] Unrecoverable error for result 12fj_100070328_1 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))
2005-07-24 02:43:57 [climateprediction.net] Deferring communication with project for 57 seconds
2005-07-24 02:44:56 [climateprediction.net] Message from server: No work sent
2005-07-24 02:44:56 [climateprediction.net] Message from server: (reached daily quota of 2 results)
2005-07-24 03:43:57 [climateprediction.net] Deferring communication with project for 15 hours, 35 minutes, and 9 seconds
2005-07-24 04:43:58 [climateprediction.net] Deferring communication with project for 14 hours, 35 minutes, and 9 seconds
2005-07-24 05:43:58 [climateprediction.net] Deferring communication with project for 13 hours, 35 minutes, and 8 seconds
2005-07-24 06:43:59 [climateprediction.net] Deferring communication with project for 12 hours, 35 minutes, and 7 seconds
2005-07-24 07:43:59 [climateprediction.net] Deferring communication with project for 11 hours, 35 minutes, and 7 seconds
2005-07-24 08:44:00 [climateprediction.net] Deferring communication with project for 10 hours, 35 minutes, and 6 seconds
2005-07-24 09:44:00 [climateprediction.net] Deferring communication with project for 9 hours, 35 minutes, and 6 seconds

<b>Then, when I tried to reset the project, it gave the following errors</b>
2005-07-24 10:08:16 [climateprediction.net] Couldn't delete file projects/climateprediction.net/hadsm3_4.13_windows_intelx86.exe
2005-07-24 10:08:18 [climateprediction.net] Couldn't delete file projects/climateprediction.net/hadsm3data_4.13_windows_intelx86.zip

I've now closed BOINC and reset the project without the above error, but of course it is waiting for the next day to try to download a new model/work.
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14631 - Posted: 24 Jul 2005, 16:55:01 UTC

I would like to gauge how serious this problem is. Are lots of people getting these messages:

Unrecoverable error for result 1283_100070057_1 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))

Also has anyone successfully downloaded Windows hadsm3*_4.13_* files?

_______________________________
Visit <a href="http://boinc-doc.net/boinc-wiki/index.php?title=Climateprediction_FAQ">BOINC WIKI</a> for help

And join <a href="http://www.boincsynergy.com/">BOINC Synergy</a> for all the news in one place.
Arnaud
Joined: 3 Sep 04
Posts: 268
Credit: 256,045
RAC: 0
Message 14632 - Posted: 24 Jul 2005, 18:21:12 UTC
Last modified: 24 Jul 2005, 18:51:24 UTC

> Also has anyone successfully downloaded Windows hadsm3*_4.13_* files?

Yep, I have.
Just after Carl closed the beta, I detached from it and reset my 4.12 model to get a 4.13 one (just to test that everything went fine).
The model and the applications downloaded successfully.
The model and the viz are working fine (only a few hours crunched, but no problems noticed).
BOINC 4.49, XP SP1.

EDIT: just found the log:

2005-07-20 20:48:49 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-07-20 20:48:49 [climateprediction.net] Requesting 8640 seconds of work, returning 0 results
2005-07-20 20:48:50 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-07-20 20:48:51 [climateprediction.net] Started download of hadsm3_4.13_windows_intelx86.exe
2005-07-20 20:48:51 [climateprediction.net] Started download of globe.rgb
2005-07-20 20:48:57 [climateprediction.net] Finished download of hadsm3_4.13_windows_intelx86.exe
2005-07-20 20:48:57 [climateprediction.net] Throughput 412754 bytes/sec
2005-07-20 20:48:57 [climateprediction.net] Finished download of globe.rgb
2005-07-20 20:48:57 [climateprediction.net] Throughput 307939 bytes/sec
2005-07-20 20:48:57 [climateprediction.net] Started download of globe.tga
2005-07-20 20:48:57 [climateprediction.net] Started download of hadsm3data_4.13_windows_intelx86.zip
2005-07-20 20:49:09 [climateprediction.net] Finished download of hadsm3data_4.13_windows_intelx86.zip
2005-07-20 20:49:09 [climateprediction.net] Throughput 334117 bytes/sec
2005-07-20 20:49:09 [climateprediction.net] Started download of hadsm3se_4.13_windows_intelx86.zip
2005-07-20 20:49:12 [climateprediction.net] Finished download of globe.tga
2005-07-20 20:49:12 [climateprediction.net] Throughput 418561 bytes/sec
2005-07-20 20:49:12 [climateprediction.net] Started download of hadsm3um_4.13_windows_intelx86.zip
2005-07-20 20:49:13 [climateprediction.net] Finished download of hadsm3se_4.13_windows_intelx86.zip
2005-07-20 20:49:13 [climateprediction.net] Throughput 226873 bytes/sec
2005-07-20 20:49:13 [climateprediction.net] Started download of 0wao_100062298.zip
2005-07-20 20:49:14 [climateprediction.net] Finished download of 0wao_100062298.zip
2005-07-20 20:49:14 [climateprediction.net] Throughput 70567 bytes/sec
2005-07-20 20:49:16 [climateprediction.net] Finished download of hadsm3um_4.13_windows_intelx86.zip
2005-07-20 20:49:16 [climateprediction.net] Throughput 606213 bytes/sec
2005-07-20 20:49:16 [---] request_reschedule_cpus:<b> files downloaded</b>
2005-07-20 20:49:16 [climateprediction.net]<b> Starting result 0wao_100062298_1 using hadsm3 version 4.13</b>

-----------------------------------------------
<a href="http://boinc-doc.net/boinc-wiki/index.php?title=Main_Page">Boinc Wiki</a>
<a href="http://forum.boinc.fr/">L'Alliance Francophone</a>
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14633 - Posted: 24 Jul 2005, 19:10:22 UTC

Arnaud, are you willing to make the files available to people like Geophi, Helene Ryding and Zydor, who seem to have been hit by this problem?

If so, what is the best way to do this? An FTP site where anyone could swap the files for malware seems a bit open to abuse.
Pete B
Joined: 26 Aug 04
Posts: 67
Credit: 10,125,583
RAC: 3,335
Message 14635 - Posted: 24 Jul 2005, 20:04:48 UTC - in response to Message 14631.  
Last modified: 24 Jul 2005, 20:05:53 UTC

Hi there

crandles asked:

"Also has anyone successfully downloaded Windows hadsm3*_4.13_* files?"

Yes, unintentionally so in my case. On Friday I set Amy back to "use 2 CPUs for BOINC", ready to set up the Sulphur Beta test, and successfully downloaded and started a Sulphur model. I then suspended the HadSM3 4.12 model already running on the std project, just to see how fast the Sulphur Beta would run on its own. BOINC responded by deciding it was short of work, downloading a new HadSM3 4.13 model and starting it. I immediately suspended that one, so down came another of the same. I could only stop this by setting both projects to download no more work and suspending BOINC network access, so that I could suspend the std model without new ones coming down. I aborted the subsequent downloads to avoid having too many waiting to run in future, but the first one downloaded is held in suspension for the time being, until the existing std run is finished.

Although the files show the deliberate abort via the GUI as an "error", the download info is there as well if anyone needs it.

Pete
Arnaud
Joined: 3 Sep 04
Posts: 268
Credit: 256,045
RAC: 0
Message 14636 - Posted: 24 Jul 2005, 20:07:16 UTC
Last modified: 25 Jul 2005, 5:39:27 UTC

Ok.
Try this <a href="http://arnaudboinc.free.fr/index">link</a> to download the 4.13 apps
Thanks for the feedback.
old_user272
Joined: 6 Aug 04
Posts: 58
Credit: 1,286,603
RAC: 0
Message 14638 - Posted: 25 Jul 2005, 5:50:43 UTC - in response to Message 14631.

> Also has anyone successfully downloaded Windows hadsm3*_4.13_* files?

I finished a 4.12 model yesterday. The 4.13 files and a model were downloaded and started without any problem.

Ian
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14658 - Posted: 26 Jul 2005, 10:40:01 UTC

Time for a new set of questions about this:

1. If you have had this problem, have you solved it? 1b) If so, how?

2. Are there any similarities between the systems that have had the problem, e.g. multiprocessor, running 2 slab models, running as a service?

If the only fix is a reset, that is a problem if you have 2 or more models running, so:

3. Is there any way to solve it when you still have a model running?

4. Has anyone downloaded files from Arnaud's link? 4b) If so, does this help, or does BOINC reject the files because the signatures haven't been checked?

Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14659 - Posted: 26 Jul 2005, 12:25:40 UTC
Last modified: 26 Jul 2005, 12:30:03 UTC

No problem downloading 4.13 to a P4 HT Win XP SP2 system running BOINC 4.19 as a service.
The system is running two 4.12 models, with one due to complete this weekend.

The error messages that geophi posted suggest a BOINC problem to me.

> 2005-07-23 14:46:47 [climateprediction.net] Temporarily failed download of hadsm3_4.13_windows_intelx86.exe: -182
> 2005-07-23 14:46:47 [climateprediction.net] Backing off 1 minutes and 0 seconds on download of file hadsm3_4.13_windows_intelx86.exe
> 2005-07-23 14:46:47 [climateprediction.net] Temporarily failed download of hadsm3data_4.13_windows_intelx86.zip: -182
> 2005-07-23 14:46:47 [climateprediction.net] Backing off 1 minutes and 0 seconds on download of file hadsm3data_4.13_windows_intelx86.zip

These messages indicate that BOINC timed out on downloading hadsm3_4.13_windows_intelx86.exe and hadsm3data_4.13_windows_intelx86.zip.

> 2005-07-23 14:47:48 [climateprediction.net] Started download of hadsm3_4.13_windows_intelx86.exe
> 2005-07-23 14:47:48 [climateprediction.net] Started download of hadsm3data_4.13_windows_intelx86.zip
> 2005-07-23 14:47:48 [climateprediction.net] Unrecoverable error for result 129e_100070105_1
> (app_version download error: couldn't get input files:
> <file_xfer_error>
> <file_name>hadsm3_4.13_windows_intelx86.exe</file_name>
> <error_code>-200</error_code>
> <error_message></error_message>
> </file_xfer_error>
> <file_xfer_error>
> <file_name>hadsm3data_4.13_windows_intelx86.zip</file_name>
> <error_code>-200</error_code>
> <error_message></error_message>
> </file_xfer_error>
> )

BOINC has attempted to start the result despite knowing that it has just requested download of 2 files required to run it!
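For anyone wanting to check their own stderrdae.txt for the same pattern, here is a minimal Python sketch. It assumes log lines shaped like the excerpts quoted in this thread; the function names are made up for illustration and are not part of BOINC. It records the last download event seen for each file, so files that were started or failed but never finished stand out.

```python
import re

# Line shapes copied from the stderrdae.txt excerpts quoted in this thread.
EVENT_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \[[^\]]*\] "
    r"(?P<event>Started|Finished|Temporarily failed) download of "
    r"(?P<name>[^\s:]+)"
)


def download_states(lines):
    """Return {file name: most recent download event seen for that file}."""
    state = {}
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            state[match.group("name")] = match.group("event")
    return state


def unfinished(lines):
    """Sorted names of files whose most recent event is not 'Finished'."""
    return sorted(
        name for name, event in download_states(lines).items()
        if event != "Finished"
    )
```

Feeding it the lines from the first post up to the "Unrecoverable error" entry should leave only the .exe and the hadsm3data zip in the unfinished list, which matches the two files named in the error.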

> <b>Then, after the hadsm 4.12 model finished, the following errors were thrown</b>
> 2005-07-24 02:40:54 [climateprediction.net] CreateProcess() failed -
> The process cannot access the file because it is being used by another process. (0x20)
>
> ... cut ...
>
> <b>Then, when I tried to reset the project, it gave the following errors</b>
> 2005-07-24 10:08:16 [climateprediction.net] Couldn't delete file projects/climateprediction.net/hadsm3_4.13_windows_intelx86.exe
> 2005-07-24 10:08:18 [climateprediction.net] Couldn't delete file projects/climateprediction.net/hadsm3data_4.13_windows_intelx86.zip

Both of these suggest that BOINC might have kept the files whose downloads timed out open, preventing them from being overwritten or deleted. The only way round that is to stop and restart BOINC (as geophi found).
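The (0x20) at the end of the CreateProcess() lines is a Win32 system error code: 0x20 is 32 decimal, ERROR_SHARING_VIOLATION, whose standard message is the "being used by another process" text in the log. A small Python sketch for pulling the code out of a log line follows; the lookup table only covers a few common codes and the helper name is hypothetical.

```python
import re

# A few well-known Win32 system error codes (values as defined in winerror.h).
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    32: "ERROR_SHARING_VIOLATION",
}

HEX_CODE_RE = re.compile(r"\((0x[0-9A-Fa-f]+)\)")


def win32_code(log_line):
    """Return (code, name) for the last '(0x..)' group in a log line, or None."""
    matches = HEX_CODE_RE.findall(log_line)
    if not matches:
        return None
    code = int(matches[-1], 16)
    return code, WIN32_ERRORS.get(code, "unknown")
```

A sharing violation means some process still holds the file open without allowing shared access, which is consistent with the "Couldn't delete file" messages on the same two files.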
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 14660 - Posted: 26 Jul 2005, 13:29:16 UTC - in response to Message 14658.  
Last modified: 26 Jul 2005, 13:31:55 UTC

> 1. If you have had this problem, have you solved it? and 1b) how?

Closing BOINC, restarting it and then resetting the project, as described in the first post in this thread, worked.

> 2. Are there any similarities between systems that have had the problem. eg
> multiprocessor, running 2 slab models, running as service etc. ?

The system in the first post is an Athlon64, single proc, running WinXP Pro, no other projects.

> 3. Is there any way to solve it when you still have a model running?

Not to my knowledge. However, I got the error when it downloaded a new parameter set (and 4.13) several hours before the 4.12 model completed, so if that happens to someone, they could exit and restart BOINC at that point. It may or may not help, but as Thyme pointed out, it would at least unlock the two files whose downloads were aborted.
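To see whether the two files are still held open before deciding to restart, a rough probe along these lines might help. It is only a sketch: the function name is made up, and it relies on Python on Windows reporting a sharing violation as PermissionError when a file is opened for read/write. A read-only file raises the same error, so a True result is a hint rather than proof.

```python
def looks_locked(path):
    """True if the file exists but cannot be opened for read/write right now.

    A missing file raises FileNotFoundError instead of returning a value.
    """
    try:
        with open(path, "r+b"):
            return False
    except PermissionError:
        return True
```

Run it against the .exe and the hadsm3data zip in the project folder; if either returns True while BOINC is running and False after BOINC has exited, that would support Thyme's explanation.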

> 4. Has anyone downloaded files from Arnaud's link? 4b) If so, does this help
> or does BOINC reject the files because the signatures haven't been checked?

I haven't.
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 14663 - Posted: 26 Jul 2005, 16:05:20 UTC - in response to Message 14660.  
Last modified: 26 Jul 2005, 16:08:37 UTC

oh I see, so now in BOINC (with their new scheduling stuff), when it sees that you're finishing (even if it's hours away from finished), it will get the next workunit or two, and at that time also get any new versions of the project's software. I think before, you only got the new software once the workunit was finished.

there seems to be some bug with having your old cpdn software running while it's trying to get the new one. the files are all named differently, except that Tolu now has the graphic maps (globe.rgb & globe.tga) downloaded separately and with the same name, so perhaps that was the confusion. we'll have to try to duplicate this trouble in-house and see if we can put the graphics back into one of the zips (the problem is that boinc's graphics now start up before our stuff gets unzipped, but maybe we can reshuffle things).

by the way, geophi, was this perhaps a box used on the hadsm3_4.13 beta test I ran a few days ago? perhaps there's a reference to the 4.13 from the other server hanging around (the files & checksums are the same, but maybe the client is confused seeing the same name come in again).
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 14666 - Posted: 26 Jul 2005, 17:31:40 UTC - in response to Message 14663.  

> by the way, geophi, was this perhaps a box used on the hadsm3_4.13 beta test I
> ran a few days? perhaps there's a reference to the 4.13 from the other server
> hanging around (the files & checksums are the same, but maybe the client
> is confused seeing the same name come in again).

Nope Carl. This was not the PC I tested 4.13 beta on.

On this PC I had upgraded BOINC from 4.19 to 4.45 the previous day (uninstalled 4.19 before installing 4.45, figuring 4.19 might not work well with hadsm 4.13) without any issues with the running 4.12 model.
crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14677 - Posted: 27 Jul 2005, 9:25:54 UTC

Thanks to Thyme Lawn:

<b>Solutions</b>

<b>The simple solution</b>
Exit BOINC and restart it.

If you have already reached your quota, you will have to wait for the next day for it to make another attempt to download 4.13.

<b>The more complex solution</b>
Use this if you don't want to run the risk of the next attempt also failing, or if you tried the simple solution and it didn't work.

1. Exit BOINC.
2. Download the 4.13 files from Arnaud's link in this thread.
3. Place these files in the climateprediction.net project folder (geophi's log above shows it as projects/climateprediction.net).
4. Restart BOINC.
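Steps 2 and 3 could be scripted roughly as below, with BOINC already exited. This is only a sketch: the four file names are the hadsm3 ones that appear in the download logs in this thread, while the function name and both directory arguments are placeholders you would need to set for your own machine. Whether Arnaud's link also carries globe.rgb and globe.tga is not stated here, so they are left out.

```python
import shutil
from pathlib import Path

# File names as they appear in the download logs quoted in this thread.
APP_FILES = [
    "hadsm3_4.13_windows_intelx86.exe",
    "hadsm3data_4.13_windows_intelx86.zip",
    "hadsm3se_4.13_windows_intelx86.zip",
    "hadsm3um_4.13_windows_intelx86.zip",
]


def place_app_files(download_dir, project_dir, names=APP_FILES):
    """Copy the named files into project_dir; return the names that were missing.

    Nothing is copied unless every file in `names` is present in download_dir.
    """
    src, dst = Path(download_dir), Path(project_dir)
    missing = [name for name in names if not (src / name).is_file()]
    if missing:
        return missing
    dst.mkdir(parents=True, exist_ok=True)
    for name in names:
        shutil.copy2(src / name, dst / name)
    return []
```

An empty list back means all four files were copied; anything else lists what still needs downloading. Copying all or nothing avoids leaving BOINC with a half-populated set of 4.13 files.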

Whenever you download files from the internet and run them, you are taking a risk. Arnaud has been around for a long time on these forums and I would trust them to be OK but that is just my opinion and not any sort of guarantee.
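One way to reduce that risk a little is to compare a digest of each downloaded file with the digest of the same file on a machine that got it straight from the project server. The Python sketch below does that by hand. The helper name and the choice of SHA-256 are assumptions for illustration, and it does not replace or replicate whatever signature checking BOINC does itself.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If someone with a known-good 4.13 install posts the digests of their files, anyone using the mirror can run the same function and check that the strings match exactly.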
old_user44916
Joined: 28 Jan 05
Posts: 2
Credit: 73,043
RAC: 0
Message 14707 - Posted: 29 Jul 2005, 15:18:55 UTC

I have the same problem. After BOINC has been running for a week or so, an error appears reporting a run-time error in some Fortran modules. After closing BOINC and resetting my computer, I can see that all the project work is lost. This time I lost about 10% of each of the two projects I was running. If I could, I would go back to the older version; this one really SUCKS!!