climateprediction.net (CPDN) home page
Thread 'Crashes and 'just not running properly' problems.'


old_user88740

Joined: 19 Jul 05
Posts: 11
Credit: 88,836
RAC: 0
Message 23660 - Posted: 19 Jul 2006, 6:53:34 UTC

First of all I should point out that while I am more than happy to use my PC to support this project, I have not taken the trouble to delve into its inner depths.

For most of the past year since setting up BOINC I have been running two projects - this one and SETI - and both appeared to run without any operational intervention for nearly all that time.

About 5 weeks ago I started getting crash reports from hadcm3lbm programs - but at that time I let them reset and did not take too much interest.

However, about three or four days ago I decided to investigate further and noted that the statistics showed Climateprediction activity on only about 7 of the past 40 days, whereas SETI seems to have been running all the time.

About 5 days ago I used the Reset Project button, but that does not seem to have had much effect. My message logs are cleared each time my machine is re-booted, so I only have a short history.

Can someone suggest a course of action to either track down and remedy the problem or, worst case, to write off any of the data I might have, re-install the project and start again?
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 23661 - Posted: 19 Jul 2006, 7:34:15 UTC

The new Coupled Ocean models need a 5.* version of BOINC to work.
They upload trickles once a model year, and data every 10 model years. And there are no data files left behind when the model completes.
But if it crashes, there WILL be some residue, which must be manually deleted.

There is a continuous history of messages in stdoutdae.txt, which is in the BOINC folder.

old_user88740

Joined: 19 Jul 05
Posts: 11
Credit: 88,836
RAC: 0
Message 23664 - Posted: 19 Jul 2006, 18:15:24 UTC - in response to Message 23661.  

I have a number of hadcm3lbm sub-folders in my climateprediction.net folder. All but one are empty. I also have a sulphur_hht folder that also contains files.

Would those contain the residual files I should delete?

Thanks.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 23666 - Posted: 19 Jul 2006, 21:03:25 UTC

Yes.
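
As a rough illustration of the check discussed above, the sketch below lists which model sub-folders still contain files. It is only a sketch under assumptions: the function name `leftover_folders` and the demo folder and file names are hypothetical, and the prefixes are taken from the folder names mentioned in this thread. It only reports what it finds and deletes nothing.

```python
import tempfile
from pathlib import Path


def leftover_folders(project_dir, prefixes=("hadcm3lbm", "sulphur")):
    """Return (folder_name, file_count) for each non-empty model sub-folder.

    The prefixes are assumptions based on the folder names quoted in the thread.
    """
    found = []
    for entry in sorted(Path(project_dir).iterdir()):
        if entry.is_dir() and entry.name.startswith(prefixes):
            # Count regular files at any depth below this sub-folder.
            count = sum(1 for p in entry.rglob("*") if p.is_file())
            if count:
                found.append((entry.name, count))
    return found


# Demo on a throwaway directory tree (all names here are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "hadcm3lbm_a").mkdir()                      # empty folder
    (root / "hadcm3lbm_b").mkdir()                      # folder with residue
    (root / "hadcm3lbm_b" / "restart.dat").write_text("x")
    demo = leftover_folders(root)

print(demo)
```

To use it for real, pass the path of your own climateprediction.net project folder instead of the temporary directory. A model that is still running will also show up as non-empty, so it is safest to check only when no climate task is active.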

old_user88740

Joined: 19 Jul 05
Posts: 11
Credit: 88,836
RAC: 0
Message 23673 - Posted: 20 Jul 2006, 7:34:21 UTC - in response to Message 23666.  

All seemed to be going well: I installed Version 5.4.9 of BOINC, the climate change software downloaded, and then...

20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result hadcm3lbm_bhge_25305105_0 ( - exit code -1073741819 (0xc0000005))
20/07/2006 08:25:05|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
20/07/2006 08:25:05||Rescheduling CPU: application exited
20/07/2006 08:25:05|climateprediction.net|Computation for task hadcm3lbm_bhge_25305105_0 finished
20/07/2006 08:26:07|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
20/07/2006 08:26:07|climateprediction.net|Reason: To fetch work
20/07/2006 08:26:07|climateprediction.net|Requesting 8640 seconds of new work, and reporting 1 completed tasks
20/07/2006 08:26:17|climateprediction.net|Scheduler request succeeded
20/07/2006 08:26:17|climateprediction.net|Message from server: No work sent
20/07/2006 08:26:17|climateprediction.net|Message from server: (reached daily quota of 1 results)
20/07/2006 08:26:17|climateprediction.net|No work from project

The climate Task has disappeared (it was there very briefly). Any (polite!) suggestions on what to do next?
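
For reference, the two numbers in the "exit code" part of the log above are the same value written two ways: the negative decimal is the signed 32-bit reading of the hex code. The sketch below shows the conversion. The helper names `to_ntstatus` and `describe` are hypothetical, and the lookup table is a small illustrative subset of Windows status codes, not a complete list.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# A few well-known Windows NTSTATUS values (deliberately not exhaustive).
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe(exit_code: int) -> str:
    """Return the hex form plus a name when the code is in the small table."""
    unsigned = exit_code & 0xFFFFFFFF
    name = KNOWN_STATUS.get(unsigned, "unknown status")
    return f"{to_ntstatus(exit_code)} ({name})"


print(describe(-1073741819))
```

0xc0000005 is the Windows access violation status, meaning the program touched memory it was not allowed to. That says what happened, not why, so it does not by itself point to a particular cause.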


MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 23681 - Posted: 20 Jul 2006, 19:02:48 UTC

Hi Alan,

The topmost sticky in this forum discusses this problem ("Solutions to models crashing ...") in more detail. There are several suggestions, the main one being to see if there are any updates available for your graphics card.
I'm a volunteer and my views are my own.
News and Announcements and FAQ
old_user88740

Joined: 19 Jul 05
Posts: 11
Credit: 88,836
RAC: 0
Message 23685 - Posted: 21 Jul 2006, 9:23:33 UTC - in response to Message 23681.  

Hi Mike, Les and all our readers...

This is getting rather time consuming, but I have tried to be systematic.

A browse of the stdoutdae.txt file shows that, apart from an isolated one-off incident in May, the file is free of references to 1073741819 until 4 July. Since then it has been a daily occurrence. The reports are generally logged between 06:00 and 09:00, a time when my machine is usually 'idle'.
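
The manual browse described above could be automated along these lines. This is a minimal sketch, assuming the log lines keep the `date time|project|message` layout and the day/month/year date order seen in the excerpts quoted in this thread. The function names `find_crashes` and `crashes_by_hour` are hypothetical, and the sample lines are copied from the posts here.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# 20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result NAME ( - exit code -1073741819 (0xc0000005))
CRASH_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\|([^|]*)\|"
    r"Unrecoverable error for result (\S+) .*exit code (-?\d+)"
)


def find_crashes(lines):
    """Return (timestamp, project, result_name, exit_code) for each crash line."""
    crashes = []
    for line in lines:
        m = CRASH_RE.match(line.strip())
        if m:
            # Assumes day/month/year, as in the log excerpts in this thread.
            when = datetime.strptime(m.group(1), "%d/%m/%Y %H:%M:%S")
            crashes.append((when, m.group(2), m.group(3), int(m.group(4))))
    return crashes


def crashes_by_hour(crashes):
    """Count crashes per hour of day to show when they cluster."""
    return Counter(when.hour for when, _, _, _ in crashes)


sample = [
    "20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result hadcm3lbm_bhge_25305105_0 ( - exit code -1073741819 (0xc0000005))",
    "20/07/2006 08:25:05||Rescheduling CPU: application exited",
    "22/07/2006 01:52:21|climateprediction.net|Unrecoverable error for result hadcm3lbm_91k1_25191188_0 ( - exit code -1073741819 (0xc0000005))",
]

crashes = find_crashes(sample)
by_hour = crashes_by_hour(crashes)
for hour in sorted(by_hour):
    print(f"{hour:02d}:00  {by_hour[hour]} crash(es)")
```

To run it against the real log, pass an open file instead of `sample`, for example `find_crashes(open(path, errors="replace"))`, where `path` points at stdoutdae.txt in your BOINC folder.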

Ref the Sticky:

* Send/Don't send. By the time I see this message the BOINC icon is missing from the System Tray, so even if it is still running I am not sure how I can close it in a controlled way. Any ideas?

* Anti-virus: I use ZoneAlarm System Suite. There may be problems here of course, but nothing obvious so far.

* Games. I'm very boring and only play the odd game of solitaire. However at 6 in the morning the machine is usually just ticking over. I do have a scheduled virus scan, but that kicks in at about 01:00 and takes about an hour.

* Stability Test: Possible I suppose, but it looks as though the test will take my machine out of service for 24 hrs, and so that will be the last resort.

* Overheating: I have installed the monitor suggested elsewhere on these boards, and it would appear that the AMD processor is running at around 30 deg C. This is the current ambient temperature and well within the 50 deg upper limit quoted.

* Overclocking. This is an off-the-shelf Mesh PC with an AMD 64 3500+ processor. Unless MESH set it up to be overclocked, that should not be a problem.

* Backups - so far as I can tell nothing to back up. The folders that are created from the downloads are empty by the time I get to look at them post crash.

* Firewall Messages: Have checked the Logs, nothing obvious.

* Windows 'time sync': Not seen any messages.

* The benchmark: Log files have different messages to those reported here.

* The Memory requirement: I have 1M RAM and around 50% of that is free when idle.

* Graphics Drivers: Using the tests suggested, my All-in-Wonder 9800 series system seems to be performing OK. I also confirmed DirectX is up to date. However, I have just run a driver update routine and it does appear to have made some changes. I haven't rebooted my machine yet - decided to send this first as you may never hear from me again :-)

So apart from the drivers, and the possibility that ZoneAlarm may also be a source of problems, most of the options are covered.

Two significant points. The 'reached daily quota of 1 results' message implies I only get one bite of the cherry each day to see if I have found a fix. Is there any way this can be overridden?

What is the role of the hadcm3trans module that appears to be giving me problems? Its name suggests it is transmitting results rather than calculating or displaying stuff, so it would not be employing graphics or lots of processing. If so, it would appear the problem is with I/O, communicating with the server, etc., and firewall or network problems would be the most likely root cause.

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 23686 - Posted: 21 Jul 2006, 19:33:21 UTC
Last modified: 21 Jul 2006, 19:35:48 UTC

Hi Alan,

Nice to see someone being as systematic as you :-)

I use ZoneAlarm Security Suite myself, and get no problems with it (but I have added the c:\program files\boinc directory to the antivirus and antispyware exclusion list).

hadcm3trans is the 'controlling' task, so if anything happens to it, it's usually bad news. It stops and starts the worker process, and also does graphics rendering. If we're lucky, the driver update may fix the problems, but since there are many possible causes of problems, it's hard to say for sure.

The 'one result per day' limit can be overridden by detaching and reattaching to the project.

Do you have an account on the www.climateprediction.net forum (which requires separate registration)? Your machine in its current state is potentially quite valuable to the project, so I'd like to send you a PM (which isn't possible on this board).
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 23694 - Posted: 21 Jul 2006, 22:22:22 UTC

Just to clarify the above post: I meant the 'discussion boards' at http://www.climateprediction.net/board/index.php rather than these ones :-)
old_user88740

Joined: 19 Jul 05
Posts: 11
Credit: 88,836
RAC: 0
Message 23700 - Posted: 22 Jul 2006, 8:09:00 UTC - in response to Message 23694.  
Last modified: 22 Jul 2006, 8:09:35 UTC

Hi,
A small change overnight following the installation of the new drivers. Although the log shows the same error code

22/07/2006 01:52:21|climateprediction.net|Unrecoverable error for result hadcm3lbm_91k1_25191188_0 ( - exit code -1073741819 (0xc0000005))

it was for a different module, at a substantially different time, and unless it cleared on its own, there was no associated send/don't send Windows crash report.

I'm now registered with the other forum, so let's see what we can learn from this.

Alan


©2025 cpdn.org