climateprediction.net home page
hadsm3_4.10_windows_intelx86.exe has encountered a problem and needs to close.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12851 - Posted: 25 May 2005, 0:27:18 UTC

"We are sorry for the inconvenience". Indeed!

The WU has completed 99.99% (863:33:57 - 00:03:27 left to completion), and the application crashes on me! These are the details:
AppName: hadsm3_4.10_windows_intelx86.exe
AppVer: 0.0.0.0
ModName: ntdll.dll
ModVer: 5.1.2600.2180
Offset: 00011f6e
BOINC version was 4.36

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12853 - Posted: 25 May 2005, 0:39:14 UTC

After I clicked on the "Don't Send" (to Microsoft) button, I got this additional information:

forrtl: severe (24): end-of-file during read, unit 5, file C:\Program Files\BOINC\projects\climateprediction.net\3l9b_200189203\jobs\climate.cpdc
Image              PC        Routine  Line     Source
hadsm3um_4.12_win  008C765B  Unknown  Unknown  Unknown
hadsm3um_4.12_win  008B132A  Unknown  Unknown  Unknown
hadsm3um_4.12_win  008B0039  Unknown  Unknown  Unknown
hadsm3um_4.12_win  008B0564  Unknown  Unknown  Unknown
hadsm3um_4.12_win  0089DFFB  Unknown  Unknown  Unknown
hadsm3um_4.12_win  0040790A  Unknown  Unknown  Unknown
kernel32.dll       7C816D4F  Unknown  Unknown  Unknown

It seems that the new WU crashed immediately after it started.

I have a save point from the old WU at about 98%. If I restore that and install BOINC 4.43, is there a chance that I could complete it successfully?

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 12855 - Posted: 25 May 2005, 2:01:37 UTC

Aaargh! I had one of these about a month ago. Mine was actually in end of phase processing when it gave that error code (-1073741819). It looks like yours was too as it uploaded the last trickle in the run. Extremely frustrating. I'd rerun the last 2% to see if you can get past end of phase this time. Make sure you are connected to the net when it is ready to communicate and upload, and perhaps pause the run and defrag your disk before it reaches end of phase.

When it uploads correctly, if it does, it may not change the result status on your results page, but the data would be available to the investigators. Good luck!
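
For readers wondering how the error code (-1073741819) relates to the "-5" shorthand: BOINC reports the Windows exit status as a signed 32-bit number. Reinterpreted as unsigned, it is 0xC0000005, which Windows defines as STATUS_ACCESS_VIOLATION. A minimal sketch of the conversion follows; the function names are made up for illustration and are not part of BOINC.

```python
# Hypothetical helpers (not BOINC code) for reading a signed Windows exit code.

# Only the one status code discussed in this thread is listed here.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code: int) -> str:
    """Return the hex form of the exit code plus its name, if we know it."""
    value = to_unsigned32(exit_code)
    return f"0x{value:08X} ({KNOWN_STATUS.get(value, 'unknown')})"


if __name__ == "__main__":
    print(describe_exit_code(-1073741819))
```

An access violation only says that the process touched memory it should not have; on its own it does not say whether the cause is the application, the client, or the machine.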

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12875 - Posted: 26 May 2005, 1:11:24 UTC

I re-ran the last 2% and found that it had crashed again this morning. I didn't actually check the error code yesterday, but today I saw that it was indeed the dreaded -5 error again!

I checked my statistics (http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=30846) and found that out of 13 WUs only two had actually completed! This is really a very poor success rate. And the fact that the application has now crashed twice with the same data at the same processing point indicates to me that something is wrong with the application, and not with my hardware, as is so often stressed when the -5 error appears.

The good news is that the next WU that crashed yesterday with a severe forrtl error is running fine now.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12876 - Posted: 26 May 2005, 1:23:56 UTC
Last modified: 26 May 2005, 1:29:10 UTC

If it helps in any way to diagnose the problem, here are the messages from the period when the error occurred the second time.

2005-05-25 14:32:54||Starting BOINC client version 4.43 for windows_intelx86
2005-05-25 14:32:54||Data directory: C:\Program Files\BOINC
2005-05-25 14:32:54|climateprediction.net|Computer ID: 30846; location: ; project prefs: default
2005-05-25 14:32:54|SETI@home|Computer ID: 867134; location: work; project prefs: default
2005-05-25 14:32:54||General prefs: from SETI@home (last modified 2004-10-13 12:01:36)
2005-05-25 14:32:54||General prefs: no separate prefs for work; using your defaults
2005-05-25 14:32:54||Remote control not allowed; using loopback address
2005-05-25 14:33:10|climateprediction.net|Deferring computation for result 17lu_000077101_1
2005-05-25 14:33:10|SETI@home|Deferring computation for result 04fe05aa.10289.28994.267324.180_0
2005-05-25 14:33:10||Resuming computation and network activity
2005-05-25 14:33:10||schedule_cpus: must schedule
2005-05-25 14:33:11|climateprediction.net|Restarting result 17lu_000077101_1 using hadsm3 version 4.10
2005-05-25 15:33:10||schedule_cpus: time 3600.015625
2005-05-25 16:33:10||schedule_cpus: time 3600.055744
2005-05-25 17:33:10||schedule_cpus: time 3600.019131
2005-05-25 17:33:10|climateprediction.net|Pausing result 17lu_000077101_1 (removed from memory)
2005-05-25 17:33:11|SETI@home|Restarting result 04fe05aa.10289.28994.267324.180_0 using setiathome version 4.09
2005-05-25 17:33:14||request_reschedule_cpus: process exited
2005-05-25 17:33:14||schedule_cpus: must schedule
2005-05-25 18:33:14||schedule_cpus: time 3600.005159
2005-05-25 18:33:14|climateprediction.net|Restarting result 17lu_000077101_1 using hadsm3 version 4.10
2005-05-25 18:33:14|SETI@home|Pausing result 04fe05aa.10289.28994.267324.180_0 (removed from memory)
2005-05-25 18:33:15||request_reschedule_cpus: process exited
2005-05-25 18:33:15||schedule_cpus: must schedule
2005-05-25 19:33:15||schedule_cpus: time 3600.035610
2005-05-25 20:33:15||schedule_cpus: time 3600.003874
2005-05-25 21:33:15||schedule_cpus: time 3600.015821
2005-05-25 21:33:15|climateprediction.net|Pausing result 17lu_000077101_1 (removed from memory)
2005-05-25 21:33:15|SETI@home|Restarting result 04fe05aa.10289.28994.267324.180_0 using setiathome version 4.09
2005-05-25 21:33:30||request_reschedule_cpus: process exited
2005-05-25 21:33:30||schedule_cpus: must schedule
2005-05-25 22:33:30||schedule_cpus: time 3600.020432
2005-05-25 22:33:30|climateprediction.net|Restarting result 17lu_000077101_1 using hadsm3 version 4.10
2005-05-25 22:33:30|SETI@home|Pausing result 04fe05aa.10289.28994.267324.180_0 (removed from memory)
2005-05-25 22:33:30||request_reschedule_cpus: process exited
2005-05-25 22:33:30||schedule_cpus: must schedule
2005-05-25 23:33:30||schedule_cpus: time 3600.031965
2005-05-26 00:33:30||schedule_cpus: time 3600.012373
2005-05-26 01:33:30||schedule_cpus: time 3600.078413
2005-05-26 01:33:30|climateprediction.net|Pausing result 17lu_000077101_1 (removed from memory)
2005-05-26 01:33:31|SETI@home|Restarting result 04fe05aa.10289.28994.267324.180_0 using setiathome version 4.09
2005-05-26 01:33:33||request_reschedule_cpus: process exited
2005-05-26 01:33:33||schedule_cpus: must schedule
2005-05-26 02:33:33||schedule_cpus: time 3600.027842
2005-05-26 02:33:33|climateprediction.net|Restarting result 17lu_000077101_1 using hadsm3 version 4.10
2005-05-26 02:33:33|SETI@home|Pausing result 04fe05aa.10289.28994.267324.180_0 (removed from memory)
2005-05-26 02:33:34||request_reschedule_cpus: process exited
2005-05-26 02:33:34||schedule_cpus: must schedule
2005-05-26 03:33:34||schedule_cpus: time 3600.012226
2005-05-26 04:33:34||schedule_cpus: time 3600.059076
2005-05-26 05:10:07|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:10:29|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed
2005-05-26 05:10:29|climateprediction.net|No schedulers responded
2005-05-26 05:10:29|climateprediction.net|Deferring communication with project for 1 minutes and 0 seconds
2005-05-26 05:11:30|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:11:53|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed
2005-05-26 05:11:53|climateprediction.net|No schedulers responded
2005-05-26 05:11:53|climateprediction.net|Deferring communication with project for 1 minutes and 0 seconds
2005-05-26 05:12:54|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:13:16|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed
2005-05-26 05:13:16|climateprediction.net|No schedulers responded
2005-05-26 05:13:16|climateprediction.net|Deferring communication with project for 1 minutes and 0 seconds
2005-05-26 05:14:18|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:15:21|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-05-26 05:15:28|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:16:30|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-05-26 05:28:22|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-05-26 05:29:24|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-05-26 05:33:34||schedule_cpus: time 3600.074697
2005-05-26 05:33:34|climateprediction.net|Pausing result 17lu_000077101_1 (removed from memory)
2005-05-26 05:33:34|SETI@home|Restarting result 04fe05aa.10289.28994.267324.180_0 using setiathome version 4.09
2005-05-26 06:33:34||schedule_cpus: time 3600.106119
2005-05-26 06:33:34|SETI@home|Pausing result 04fe05aa.10289.28994.267324.180_0 (removed from memory)
2005-05-26 06:33:39||request_reschedule_cpus: process exited
2005-05-26 06:33:39||schedule_cpus: must schedule
2005-05-26 07:33:39||schedule_cpus: time 3600.026260
2005-05-26 08:33:39||schedule_cpus: time 3600.033154
2005-05-26 09:33:40||schedule_cpus: time 3600.011444
2005-05-26 09:33:40|climateprediction.net|Pausing result 17lu_000077101_1 (removed from memory)
2005-05-26 09:33:40|SETI@home|Restarting result 04fe05aa.10289.28994.267324.180_0 using setiathome version 4.09
2005-05-26 10:15:54|climateprediction.net|Unrecoverable error for result 17lu_000077101_1 ( - exit code -1073741819 (0xc0000005))
2005-05-26 10:15:54||request_reschedule_cpus: process exited
2005-05-26 10:15:54|climateprediction.net|Deferring communication with project for 59 seconds
2005-05-26 10:15:54|climateprediction.net|Computation for result 17lu_000077101_1 finished
2005-05-26 10:15:54||schedule_cpus: must schedule
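
For anyone who wants to sift a long message log like this one, the lines are pipe-delimited (timestamp, project, message), so the failures can be pulled out mechanically. The following is only a sketch based on the line format visible above; the function names are hypothetical and the regular expression assumes the "Unrecoverable error for result ... exit code N" wording shown in this log.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class LogLine(NamedTuple):
    timestamp: str
    project: str  # empty string for client-wide messages
    message: str


def parse_line(line: str) -> Optional[LogLine]:
    """Split one 'timestamp|project|message' line; return None if malformed."""
    parts = line.rstrip("\n").split("|", 2)
    if len(parts) != 3:
        return None
    return LogLine(*parts)


# Matches the wording seen in the log above; other client versions may differ.
EXIT_RE = re.compile(r"Unrecoverable error for result (\S+) \(.*exit code (-?\d+)")


def unrecoverable_errors(lines: List[str]) -> List[Tuple[str, str, int]]:
    """Return (timestamp, result name, exit code) for each failed result."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        match = EXIT_RE.search(parsed.message)
        if match:
            found.append((parsed.timestamp, match.group(1), int(match.group(2))))
    return found
```

Feeding it the lines above would report a single failure, for result 17lu_000077101_1 at 10:15:54.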

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 12880 - Posted: 26 May 2005, 6:10:29 UTC

If you have a backup from just before the crash, you can move the model to a different machine and try to finish the last few minutes there.

CPDN crashed repeatedly on one of my machines on trickle 24 (end of phase1) with a lot of null bytes in the result files where data should be. Same exit code as yours.

There is something wrong with the "big trickles" that doesn't happen on all machines - maybe even just sleeping a few microseconds between all those fopen / fclose things could fix it so the file buffers can be flushed properly.
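
To illustrate the buffer-flushing idea: rather than sleeping between open and close calls, the more usual way to make sure data has left the buffers is to flush explicitly and ask the operating system to commit the file before closing it. This is a generic sketch in Python, not the model's actual code (which is not shown in this thread), and `write_durably` is a made-up name.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data and push it through both buffer layers before closing."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()             # empty the application-level buffer
        os.fsync(handle.fileno())  # ask the OS to commit its cache to disk
```

Whether missing flushes are really behind the null bytes is a guess; the sketch only shows what "flushed properly" would look like in code.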

One of those I saved just seconds before the crash and moved it, it's a full run now, no trouble from that other machine :-)


And before anyone tries to tell me that it's my machine : it's prime stable, has passed memtest, isn't OCed, doesn't have a heat problem (2 P3s Tualatin) and runs unattended for months without any flaws - so forget it, it isn't the machine.

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 12892 - Posted: 26 May 2005, 14:43:04 UTC - in response to Message 12880.  

> And before anyone tries to tell me that it's my machine : it's prime stable,
> has passed memtest, isn't OCed, doesn't have a heat problem (2 P3s Tualatin)
> and runs unattended for months without any flaws - so forget it, it isn't the
> machine.

I don't think people are associating that error number much with unstable hardware. Too many stable systems are now getting this error, especially (but not exclusively) at phase end. Why some PCs appear more susceptible to that error, I don't know. The PC I am having this problem on is stable in all tests (run over long periods), including HD tests. It has to be some relatively recent thing in BOINC, hadsm, or MS patches that only affects certain hardware or software configurations. Kind of wide open at that. But your idea about flushing buffers makes sense given that it appears during periods of intense HD activity.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12944 - Posted: 29 May 2005, 23:59:06 UTC - in response to Message 12880.  

> If you have a backup from just before the crash, you can move the model to a
> different machine and try to finish the last few minutes there.

I do have a backup, and I do have another PC that could try to complete the job. However, I don't know how to transfer the model to the other PC. How would I go about that?

Install BOINC first?
Create a new host?
Replace what data from the original host?

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 12945 - Posted: 30 May 2005, 0:48:49 UTC - in response to Message 12944.  
Last modified: 30 May 2005, 0:50:04 UTC

> I do have a backup, and I do have another PC that could try to complete the
> job. However, I wouldn't know how to transfer that to another PC to do that.
> How would I go about that?
>
> Install BOINC first?
> Create a new host?
> Replace what data from the original host?

This is how I would do it...
1. Copy the entire BOINC folder/directory structure that you backed up (before the error) to the new hard drive in the same location (C:\BOINC or wherever it was). If you have to copy it to CD/DVD to move it over, I would zip the entire folder (with subfolders) first. Otherwise, copying the directory structure to CD/DVD and then copying it back to another hard drive will make all the files read-only, which you don't want.
2. Run the installer for the same version of BOINC you were running before, and install it to the same directory you copied BOINC into.
3. It should pick up from the point of your backup.
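
The copy step above can also be scripted. The sketch below zips a BOINC directory into a single archive, unpacks it on the target machine, and clears any read-only attribute the files may have picked up on the way. The function names and paths are placeholders, not anything BOINC provides, and BOINC should be shut down before packing.

```python
import os
import shutil
import stat


def pack_boinc_dir(boinc_dir: str, archive_base: str) -> str:
    """Zip the whole directory tree; returns the path of the .zip created.

    archive_base is the archive path without '.zip' and should live
    outside boinc_dir so the archive does not try to include itself.
    """
    return shutil.make_archive(archive_base, "zip", root_dir=boinc_dir)


def unpack_boinc_dir(archive_path: str, target_dir: str) -> None:
    """Unpack the archive and make sure every restored file is writable."""
    shutil.unpack_archive(archive_path, target_dir, "zip")
    for root, _dirs, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            # Add the owner-write bit in case the copy came back read-only.
            os.chmod(path, os.stat(path).st_mode | stat.S_IWRITE)
```

For example, `pack_boinc_dir(r"C:\BOINC", r"D:\boinc_backup")` on the old machine and `unpack_boinc_dir(r"D:\boinc_backup.zip", r"C:\BOINC")` on the new one, with the paths adjusted to wherever BOINC actually lives.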

Good luck.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 12963 - Posted: 31 May 2005, 3:03:04 UTC

Thanks for the advice!

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 13399 - Posted: 13 Jun 2005, 2:31:58 UTC - in response to Message 12963.  
Last modified: 13 Jun 2005, 2:40:45 UTC

I have finally had the chance to load the saved WU to a new system, and run the last 2%. Result: the same - it crashes at 99.99%.

I reloaded the saved result again and installed BOINC 4.43. Re-ran it again, with the same result: crash at 99.99%.

It looks to me that this bug is highly reproducible. I am a software developer, and if I had a reproducible bug like this in my software, I would (probably) be able to debug and fix it.

CPDN developers: are you interested in my saved, 98% complete WU? It runs just 7 hours until the crash.

[Edited to add:] both original and new system run Win XP on Intel P4 processors (non-OC).

Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 13411 - Posted: 13 Jun 2005, 11:33:48 UTC - in response to Message 13399.  
Last modified: 13 Jun 2005, 15:42:32 UTC

> It looks to me that this bug is highly reproducible.

Yes. It seems to be a common experience that if you get it then it will probably recur, though it is always worth trying from a backup. Unfortunately, it may well happen on the next WU as well. Funny thing is, in my case it didn't begin immediately I moved above BOINC 4.19, nor does it go away if I drop back to that version. Nor did it happen the first time I ran Hadsm3 4.12. But it has now affected both my P4 machines running Win XP, so the cause is common to both. Added to that is the evidence that the problem relates to an access error during file handling.

My ignorance of these things is profound, but I suspect that it relates to my network security. I'm running off a Netgear router (using WiFi for one of the machines). I also have NAV. I've tried the obvious, but I'm at a loss to think what else could be affecting both PCs which have otherwise little in common.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 13426 - Posted: 14 Jun 2005, 2:08:11 UTC - in response to Message 13411.  

> ... I suspect that it relates to my network security.

The two systems I have used for my test are completely different. One is on a LAN, connecting to the Internet via proxy server, and using firewall and AV software.

The other one is a standalone system, without firewall or AV, connecting to the Internet directly via ADSL modem. The HD was defragmented after the 98% completed WU was loaded, and nothing was running beside BOINC and CPDN.

The only thing these systems have in common is that they are DELL Optiplex GX270 machines with Intel P4 processors, running Windows XP-SP2. (No overclocking, no hyperthreading.)

Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 13429 - Posted: 14 Jun 2005, 6:54:53 UTC - in response to Message 13426.  
Last modified: 14 Jun 2005, 6:56:22 UTC


> The two systems I have used for my test are completely different.
>
> The only thing these systems have in common are that they are DELL Optiplex
> GX270 with Intel P4 processors, running Windows XP-SP2. (No overclocking, no
> hyperthreading).

Bang goes that theory. And mine use different hardware to this and from each other. Which gets us nearer to solving the puzzle that some people get this error, others reportedly don't.

crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 13436 - Posted: 14 Jun 2005, 10:38:02 UTC
Last modified: 14 Jun 2005, 11:42:48 UTC

Can you look at the files in the dataout folders to see if there are any oddities?

????aa are normally associated with .x1.nc
????ba are normally associated with .x2.nc

Are there any files that do not follow this pattern?

For example any
????aa.p*.x2.nc files, or any
????ba.p*.x3.nc files after the crash.
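
If there are a lot of files to look through, the pattern described above can be checked with a short script. This is only a sketch of that rule (aa goes with .x1.nc, ba goes with .x2.nc); the function name is made up, and files that do not look like `????aa.p*.xN.nc` or `????ba.p*.xN.nc` at all are simply ignored.

```python
import re
from typing import Iterable, List

# Expected pairing taken from the post above: aa -> x1, ba -> x2.
EXPECTED_SUFFIX = {"aa": "x1", "ba": "x2"}

# Four arbitrary characters, then aa or ba, then ".p<anything>.xN.nc".
NAME_RE = re.compile(r"^.{4}(aa|ba)\.p.*\.(x\d+)\.nc$")


def odd_files(names: Iterable[str]) -> List[str]:
    """Return the file names whose .xN.nc suffix does not match their aa/ba part."""
    odd = []
    for name in names:
        match = NAME_RE.match(name)
        if match and EXPECTED_SUFFIX[match.group(1)] != match.group(2):
            odd.append(name)
    return odd
```

Running `odd_files(os.listdir(...))` on a dataout folder (after `import os`) would list any mismatches.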


(I seem to have 1 such file in all my runs: the ????aa.pc.8yac.x2.nc file. Presumably this is because the ????aa.pc.8yac file is not deleted after creating the ????aa.pc.8yac.x1.nc file. I have no idea if the phase transition might go more smoothly without this extraneous file.

I also have a ka.pc.8yac.x1, ka.pc.8yac.x2 ka.pc.8yac.x3 and a ka.pc.8yac.x4 file in a completed SC run.)

I think the instructions concerning the conversion of the pc.8yac files should be examined for bugs or at least to insert a delete.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 13437 - Posted: 14 Jun 2005, 12:17:18 UTC

I've lost track of which / what problem is being pursued here now, but if it is the original one, I had one of these 'must close' things a month or so back.
It was on a 4.12 model using BOINC 4.25.

I Suspended BOINC, waited, Exited, waited, copied all the .xml files in BOINC to a 'save' directory I have for when I need to reboot, ticked 'Don't notify Microsoft', shut down the computer, waited a few seconds, and then powered up again.
Windows started, a few progs appeared in the System Tray, I killed the AV and Spybot resident parts, started BOINC Manager, and in a few more seconds the two models were ticking away again.

I don't know if this helps, or just irritates.




Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 13454 - Posted: 14 Jun 2005, 23:22:58 UTC
Last modified: 14 Jun 2005, 23:23:36 UTC

I could not see anything of the sort Crandles mentioned, but found some errors in stderr_um.txt. I've just quoted the end of the file:

OPEN: File dataout/2qbhba.da40bp0 Created on Unit 22
OPEN: File dataout/2qbhba.da40bs0 Created on Unit 22
OPEN: File dataout/2qbhba.da40c10 Created on Unit 22
CLOSE: WARNING: Unit 60 Not Opened
OPEN: File dataout/2qbhba.pa41c10 Created on Unit 60
CLOSE: WARNING: Unit 63 Not Opened
OPEN: File dataout/2qbhba.pd41c10 Created on Unit 63
CLOSE: WARNING: Unit 64 Not Opened
OPEN: File dataout/2qbhba.pe41c10 Created on Unit 64
CLOSE: WARNING: Unit 65 Not Opened
OPEN: File dataout/2qbhba.pf41c10 Created on Unit 65
CLOSE: WARNING: Unit 66 Not Opened
OPEN: File dataout/2qbhba.pg41c10 Created on Unit 66
CLOSE: WARNING: Unit 67 Not Opened
OPEN: File dataout/2qbhba.ph41c10 Created on Unit 67

Again, pointing to a file access problem.

Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 13464 - Posted: 15 Jun 2005, 7:35:07 UTC

Although it looks strange, those messages are normal behaviour, Andrew. The warnings are generated by a bit of defensive coding and indicate that hadsm3um is trying to close files it hasn't opened yet.
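
One way to check that reading against a stderr_um.txt file: if the warnings come from a defensive close before each open, every "CLOSE: WARNING: Unit N Not Opened" line should be followed by an OPEN on the same unit N. The sketch below tests exactly that, using the line formats quoted earlier in the thread; the function name is hypothetical.

```python
import re
from typing import Iterable, List

CLOSE_RE = re.compile(r"^CLOSE: WARNING: Unit (\d+) Not Opened$")
OPEN_RE = re.compile(r"^OPEN: File (\S+) Created on Unit (\d+)$")


def unmatched_close_warnings(lines: Iterable[str]) -> List[int]:
    """Return units whose close warning is NOT followed by an open on that unit."""
    unmatched = []
    pending = None  # unit number of the most recent, still unexplained warning
    for raw in lines:
        line = raw.strip()
        close = CLOSE_RE.match(line)
        opened = OPEN_RE.match(line)
        if close:
            if pending is not None:
                unmatched.append(pending)  # two warnings in a row
            pending = int(close.group(1))
        elif opened:
            if pending is not None and int(opened.group(2)) != pending:
                unmatched.append(pending)  # open happened on a different unit
            pending = None
    if pending is not None:
        unmatched.append(pending)  # file ended right after a warning
    return unmatched
```

An empty result is consistent with the close-before-open explanation; it does not prove that nothing else is wrong with the file handling.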

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 13591 - Posted: 20 Jun 2005, 3:11:08 UTC - in response to Message 12875.  
Last modified: 20 Jun 2005, 3:11:41 UTC

> Checked my statistics (http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=30846)
> and found that out of 13 WUs only two had actually completed! This is really a very poor success rate...

I am giving up on CPDN now - my last WU has also crashed with a -5 error near 30% completion. 12 out of 14 crashed - this failure rate is way too high.

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 13593 - Posted: 20 Jun 2005, 7:16:48 UTC - in response to Message 13464.  

> Although it looks strange those messages are normal behaviour Andrew. The
> warnings are generated by a bit of defensive coding and indicate that hadsm3um
> is trying to close files it hasn't opened yet.



uh - sounds dangerous.

If it has a file handle -1 or something like that and detects the open state from the handle <b>before</b> it tries a close(), everything is fine.

If this is not the case and the error message comes from a close() call that fails, this can easily be the reason for those problems. As the handle might have been reassigned somewhere else for a different file, a file that still needs to be open (for the ZIP module for example) might get closed.


There really seems to be a problem with the file handling in those end states of 24/48/72 so it would be a good idea to revise this part of the code.

My guess has been too many open files but closing a file that is still needed might explain the problems too.
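
To make the two cases concrete, here is a small sketch. The `Unit` class shows the safe pattern: the handle is set to a -1 sentinel when nothing is open, and the state is checked before any close() is attempted. `demo_stale_close` shows the dangerous pattern: a remembered descriptor number is closed after the OS has handed the same number to a different file. Everything here is hypothetical illustration, not hadsm3um code, and the reuse of the number relies on the OS giving out the lowest free descriptor (which POSIX guarantees; elsewhere it is an assumption).

```python
import os

CLOSED = -1  # sentinel: no file is currently open on this unit


class Unit:
    """Hypothetical I/O unit that tracks whether it really owns an open file."""

    def __init__(self) -> None:
        self.fd = CLOSED

    def open(self, path: str) -> None:
        self.close()  # defensive close; harmless if nothing is open
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)

    def close(self) -> bool:
        """Close only if open. Returns False (a 'not opened' warning) otherwise."""
        if self.fd == CLOSED:
            return False
        os.close(self.fd)
        self.fd = CLOSED
        return True


def demo_stale_close(directory: str):
    """Return (number_was_reused, unrelated_file_got_closed)."""
    first = os.open(os.path.join(directory, "a.tmp"), os.O_WRONLY | os.O_CREAT)
    stale = first  # remembered number, never reset to CLOSED
    os.close(first)
    second = os.open(os.path.join(directory, "b.tmp"), os.O_WRONLY | os.O_CREAT)
    if second != stale:
        os.close(second)
        return False, False
    os.close(stale)  # the bug: this actually closes the second file
    try:
        os.write(second, b"data")
        return True, False
    except OSError:
        return True, True  # the file that was still needed is gone
```

With the sentinel check the warning is harmless; with a stale number, a file that is still needed by another part of the program can be closed behind its back.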