Thread 'Hadsm3 4.12 errors after latest Windows 2000 update'

DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13613 - Posted: 20 Jun 2005, 12:59:21 UTC

Recently, I think following the latest round of Microsoft security updates, I have been getting errors with hadsm3 4.12 running through BOINC 4.45. I use Windows 2000 Pro SP4 with all security and hotfix updates applied.

The following message was in stderrgui.txt:

***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x0042F342 read attempt to address 0x00000010

1: 05/25/05 13:43:02


Unfortunately these errors have ended the model I was 40% through with a client error.

Any thoughts on how I may correct this? Or is a modified version of the software required?

Thanks

Danny
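
For anyone reading a similar stderrgui.txt entry, the exception line above can be picked apart mechanically. The following is a minimal Python sketch, assuming the line always follows the exact wording shown in this post; the function names and the `limit` threshold are hypothetical choices for illustration, not part of BOINC or CPDN. The fault is a read at 0x00000010, sixteen bytes above address zero, which often points to a null pointer being used at a small offset. That alone does not say whether the root cause is a software bug or a hardware fault.

```python
import re

# Pattern for the exception line format quoted from stderrgui.txt above.
# Assumption: other builds may word this line differently.
EXC_RE = re.compile(
    r"Reason: (?P<name>.+?) \((?P<code>0x[0-9a-fA-F]+)\) at address "
    r"(?P<pc>0x[0-9a-fA-F]+) (?P<op>read|write) attempt to address "
    r"(?P<target>0x[0-9a-fA-F]+)"
)


def parse_exception(line):
    """Return the fields of the exception line as a dict, or None if no match."""
    m = EXC_RE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "code": int(m.group("code"), 16),
        "pc": int(m.group("pc"), 16),
        "op": m.group("op"),
        "target": int(m.group("target"), 16),
    }


def looks_like_null_deref(info, limit=0x10000):
    """Heuristic only: treat any target address below `limit` as near zero."""
    return info["target"] < limit


line = ("Reason: Access Violation (0xc0000005) at address 0x0042F342 "
        "read attempt to address 0x00000010")
info = parse_exception(line)
print(info["name"], hex(info["code"]), info["op"], hex(info["target"]))
print("near-null access:", looks_like_null_deref(info))
```

The heuristic is deliberately crude: it only flags that the faulting address is close to zero, which is a starting point for diagnosis rather than a conclusion.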

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 13641 - Posted: 21 Jun 2005, 7:03:02 UTC
Last modified: 21 Jun 2005, 7:03:40 UTC

BOINC 4.19 runs very smoothly for me on several boxes with Win2000 SP4 and (nearly) all patches.

You will be missing some features of 4.45, but a stable version is worth its weight in gold, especially for the long-running CPDN models.

DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13691 - Posted: 22 Jun 2005, 7:14:11 UTC - in response to Message 13641.  

> BOINC 4.19 runs very smooth for me on several boxes with Win2000 SP4 with
> (nearly) all patches.
>
> You will be missing some features of 4.45 but a stable version is worth gold
> especially for the long running CPDN models.

I have reverted to 4.25, which had been running OK for me, but the problem persists:

2005-06-19 12:48:38 [climateprediction.net] Unrecoverable error for result 3vxf_100203167_0 (Incorrect function. (0x1) - exit code 1 (0x1))
2005-06-19 12:48:38 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-06-19 12:48:47 [climateprediction.net] Unrecoverable error for result 35li_200168705_1 (Incorrect function. (0x1) - exit code 1 (0x1))
2005-06-19 12:48:47 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds

I also have 4.45 running SETI on a Win2K laptop without problems.

Is this a BOINC issue or CPDN?

Danny
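
When several results fail in quick succession, as in the log above, it can help to extract the result names and exit codes rather than read them by eye. The following is a minimal Python sketch, assuming the message format is exactly the one quoted in this post; the regular expression and the `failed_results` helper are hypothetical and are not part of BOINC.

```python
import re

# Matches the "Unrecoverable error" lines as they appear in the log above.
# Assumption: the timestamp, project tag and message wording stay the same.
LOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?P<project>[^\]]+)\] "
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(.*exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)$"
)


def failed_results(lines):
    """Return (timestamp, result name, exit code) for each failure line."""
    found = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m:
            found.append((m.group("ts"), m.group("result"), int(m.group("code"))))
    return found


log = [
    "2005-06-19 12:48:38 [climateprediction.net] Unrecoverable error for result "
    "3vxf_100203167_0 (Incorrect function. (0x1) - exit code 1 (0x1))",
    "2005-06-19 12:48:38 [climateprediction.net] Deferring communication with "
    "project for 1 minutes and 0 seconds",
    "2005-06-19 12:48:47 [climateprediction.net] Unrecoverable error for result "
    "35li_200168705_1 (Incorrect function. (0x1) - exit code 1 (0x1))",
]

for ts, result, code in failed_results(log):
    print(ts, result, "exit code", code)
```

The "Deferring communication" lines are skipped on purpose, since they carry no result name or exit code.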

crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 13692 - Posted: 22 Jun 2005, 10:49:57 UTC

You have completed a full model, so the system appeared stable. The problems started on 19th June. Since you are listed as being in the UK, I am wondering if this could be heat related?

Do you overclock at all? Also have you looked at CPU temp under full load in the current heat?

Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 13693 - Posted: 22 Jun 2005, 11:35:12 UTC

The only time I've ever seen exit code 1 is after the hadsm3_* controller process terminates and the hadsm3um_* worker process is left running in isolation, so it's worth checking if you've got an orphaned hadsm3um_* process.
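
The orphan check described above can be sketched in Python. This is a minimal illustration, assuming you already have the list of running process names (for example, copied from Task Manager); the helper `find_orphaned_workers` is hypothetical, and only the two name prefixes come from this thread.

```python
def find_orphaned_workers(process_names,
                          worker_prefix="hadsm3um_",
                          controller_prefix="hadsm3_"):
    """Return worker process names that are running with no controller present.

    A worker counts as orphaned when at least one hadsm3um_* process exists
    and no hadsm3_* controller process is running alongside it.
    """
    names = [name.lower() for name in process_names]
    workers = [n for n in names if n.startswith(worker_prefix)]
    controllers = [n for n in names if n.startswith(controller_prefix)]
    if workers and not controllers:
        return workers
    return []


# Process names as they might appear, truncated, in Task Manager.
healthy = ["boincmgr.exe", "hadsm3_4.12_win", "hadsm3um_4.12_w"]
orphaned = ["boincmgr.exe", "hadsm3um_4.12_w"]

print(find_orphaned_workers(healthy))
print(find_orphaned_workers(orphaned))
```

Note that "hadsm3um_" does not begin with "hadsm3_", so the two prefixes never match the same process.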

DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13697 - Posted: 22 Jun 2005, 13:50:54 UTC - in response to Message 13692.  

> You have completed a full model so system appeared stable. Problems started on
> 19th June. Since you are listed from UK, I am wondering if this could be heat
> related?
>
> Do you overclock at all? Also have you looked at CPU temp under full load in
> the current heat?

I do not overclock the system at all. It is an Athlon XP 3000+ with a suitable CPU fan, a power supply fan and two case fans. However, the air temperature is reasonably warm over here at the moment.

Currently the CPU appears to be stable at 70 degrees C (158 F), room temperature around 25-30 degrees C.

The model I am running at the moment, under BOINC 4.25, has not caused any problems, but is under 2% complete, so let's hope it has sorted itself out.

DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13698 - Posted: 22 Jun 2005, 13:54:01 UTC - in response to Message 13693.  

> The only time I've ever seen exit code 1 is after the hadsm3_* controller
> process terminates and the hadsm3um_* worker process is left running in
> isolation, so it's worth checking if you've got an orphaned hadsm3um_*
> process.

Thanks for this suggestion.

Task Manager shows things running OK at the moment, but I have seen one of the processes die in the past ("...is generating a log file..."), though I'm not sure which one.

The current model is still running OK under BOINC 4.25.

Danny

DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13699 - Posted: 22 Jun 2005, 14:16:50 UTC - in response to Message 13698.  

> > The only time I've ever seen exit code 1 is after the hadsm3_* controller
> > process terminates and the hadsm3um_* worker process is left running in
> > isolation, so it's worth checking if you've got an orphaned hadsm3um_*
> > process.
>
> Thanks for this possibility,
>
> task manager shows things running OK at the moment but I have seen one of the
> processes die in the past - ...is generating a log file... but I'm not sure
> which one.
>
> The current model is still running OK under BOINC 4.25.
>
> Danny

This model has just stopped working; it will probably continue OK after a restart:

2005-06-20 14:50:33 [climateprediction.net] Unrecoverable error for result 0ku8_100047303_0 ( - exit code -5 (0xfffffffb))
2005-06-20 14:50:33 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds

The task manager shows boincmgr.exe and hadsm3_4.12_win but it looks like hadsm3um_4.12_w has died.

Danny
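
The log above prints the same exit code twice, as -5 and as 0xfffffffb. These are one 32-bit value read as a signed and as an unsigned number. A minimal Python sketch of the conversion follows; the helper names are hypothetical, and the sketch only assumes that exit codes are 32 bits wide, as the eight hex digits in the log suggest.

```python
def to_signed32(value):
    """Interpret the low 32 bits of `value` as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - 0x100000000
    return value


def to_unsigned32(value):
    """Return the unsigned 32-bit representation of a (possibly negative) integer."""
    return value & 0xFFFFFFFF


print(to_signed32(0xFFFFFFFB))         # the hex form from the log, read as signed
print(hex(to_unsigned32(-5)))          # the decimal form from the log, read as unsigned
```

The same conversion leaves small positive codes such as 1 (0x1) unchanged, which is why the earlier errors in this thread show matching decimal and hex values.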

crandles
Volunteer moderator
Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 13700 - Posted: 22 Jun 2005, 14:37:29 UTC

I thought unrecoverable errors usually turned out to be, umm, unrecoverable. However, a backup of the BOINC folder taken before the error appeared can rescue such situations.

70 degrees C sounds dangerously hot for a processor to me.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 13703 - Posted: 22 Jun 2005, 15:47:15 UTC

Dust on the heatsink?
Cables blocking the airflow to the CPU?


geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 13705 - Posted: 22 Jun 2005, 16:19:59 UTC

70C is quite hot. What brand motherboard do you have? Some read higher than others for the same processor, but no matter what, that is very high.

Gareth Lock
Joined: 2 Sep 04
Posts: 51
Credit: 451,236
RAC: 0
Message 13751 - Posted: 23 Jun 2005, 5:22:51 UTC

As a system builder, I would suggest that you seriously take another look at the cooling solution you have on that CPU! 70C is way, way too high. Try running the machine with the case open and look for snagging cables, dust and detritus. I can guarantee that this temperature issue, while possibly not the cause of the problem, is greatly exacerbating it!


DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13790 - Posted: 24 Jun 2005, 7:21:08 UTC - in response to Message 13705.  

> 70C is quite hot. What brand motherboard do you have? Some read higher than
> others for the same processor, but no matter what, that is very high.

I've given it a good clean out now and it is still running at about 60-66 degrees C at a constant 100% CPU, a few degrees cooler when not running CPDN. I'm currently monitoring the CPU temperature.

I have a Gigabyte motherboard (Via KT880), if that makes a difference.

The unrecoverable error was, of course, unrecoverable. My brain must have been on holiday for that post.

Thanks everyone for all your help and suggestions.

Danny

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 13798 - Posted: 24 Jun 2005, 8:03:19 UTC - in response to Message 13790.  

> I've given it a good clean out now and it is still running about 60-66 degrees
> C at a constant 100% CPU, a few degrees cooler when not running CPDN. I'm
> currently monitoring the CPU temperature.
>
> I have a Gigabyte motherboard (Via KT880), if that makes a difference.

Gigabyte boards do read hotter than most other motherboards. 60-66C is obviously better than 70C, but still warm, even for a Gigabyte. Try Gareth's suggestion of running with the case open and see if that helps. If it does, you can then look at getting better airflow through the case.

Gareth Lock
Joined: 2 Sep 04
Posts: 51
Credit: 451,236
RAC: 0
Message 13957 - Posted: 27 Jun 2005, 19:26:58 UTC - in response to Message 13798.  

> > I've given it a good clean out now and it is still running about 60-66
> > degrees C at a constant 100% CPU, a few degrees cooler when not running
> > CPDN. I'm currently monitoring the CPU temperature.
> >
> > I have a Gigabyte motherboard (Via KT880), if that makes a difference.
>
> Gigabyte's do read hotter than most other motherboards. 60-66C is obviously
> better than 70C, but still warm, even for a Gigabyte. Try Gareth's suggestion
> of running with the case open and see if that helps. If it does, you can try
> to see if you can get better airflow through the case.

My 1900+ is running at about 54-57C at the moment. I consider that on the warm side and, indeed, it has locked up on me a number of times recently. If you can, I would consider adding a couple of chassis fans to your setup to try to keep the air inside the case moving. See my entry on keeping your machine cool (http://www.garethlock.shorturl.com/index.htm?boinc) for more of my tips.


DFellman
Joined: 3 Mar 05
Posts: 8
Credit: 683,785
RAC: 0
Message 13970 - Posted: 28 Jun 2005, 9:19:08 UTC - in response to Message 13957.  

> > > I've given it a good clean out now and it is still running about 60-66
> > > degrees C at a constant 100% CPU, a few degrees cooler when not running
> > > CPDN. I'm currently monitoring the CPU temperature.
> > >
> > > I have a Gigabyte motherboard (Via KT880), if that makes a difference.
> >
> > Gigabyte's do read hotter than most other motherboards. 60-66C is obviously
> > better than 70C, but still warm, even for a Gigabyte. Try Gareth's
> > suggestion of running with the case open and see if that helps. If it does,
> > you can try to see if you can get better airflow through the case.

Thanks,

I currently have the machine running at 61-63 C and opening the case has no effect on the temperature, so I don't think increased air flow will help.

The current CPDN model I have running appears OK, as I am about 6.5% through under BOINC 4.45.

So I guess this thread should be closed.

Thanks again to everyone who has responded and helped me with this problem.

Danny
>
> My 1900+ is running at about 54-57C at the moment, I consider that on the warm
> side and indeed, it has locked up on me recently a number of times. If you
> can, I would consider adding a couple of chassis fans to your setup to try and
> keep the air inside the case moving. See my entry here on keeping your machine
> cool (http://www.garethlock.shorturl.com/index.htm?boinc) for more of my tips.

Gareth Lock
Joined: 2 Sep 04
Posts: 51
Credit: 451,236
RAC: 0
Message 13981 - Posted: 28 Jun 2005, 23:06:00 UTC
Last modified: 28 Jun 2005, 23:09:46 UTC

If you're still coming up warm, then try going for a bigger heatsink on your processor. I have a block rated for a 3000+ on my 1900+ CPU and it still ran in the mid 60s today. Even my laptop is running hot!! If you're considering swapping the heatsink, go for an all copper model and use a decent silver-oxide paste rather than the cheap pads.

During the heatwave and sticky weather we are having in the UK at the moment, you need to take precautions against trouble like this, and take them early, especially on a normally unattended BOINC machine.

I can't testify as to the stability of v4.12 hadsm, because, at the moment, my 1900+ is still soldiering away on a previous model for v4.10.




old_user23880
Volunteer tester
Joined: 10 Oct 04
Posts: 223
Credit: 4,664
RAC: 0
Message 13982 - Posted: 28 Jun 2005, 23:52:57 UTC

Thanks for all these most useful posts and the link to Gareth's page about keeping cool. This has made me wonder whether my machine's repeated failures with BOINC CPDN (I reverted to classic) might not, as I thought, be due to the Athlon's way of doing the calculations, but instead be caused by overheating.

I also have a Gigabyte and I dare not publicly reveal what the temp was when it unexpectedly appeared on the screen. Don't think I'm capable of personally doing anything about it, but I'm keeping note of all of this for the future rebuild........

old_user23880
Volunteer tester
Joined: 10 Oct 04
Posts: 223
Credit: 4,664
RAC: 0
Message 13983 - Posted: 29 Jun 2005, 0:03:25 UTC

FWIW Danny, the recent Windows security and other updates for my 2000 Pro haven't created any problems that weren't already there.