Questions and Answers : Windows : Crashes and 'just not running properly' problems.
Author | Message |
---|---|
Joined: 19 Jul 05 Posts: 11 Credit: 88,836 RAC: 0 |
First of all I should point out that while I am more than happy to use my PC to support this project, I have not taken the trouble to delve into its inner depths. For most of the past year since setting up BOINC I have been running two projects, this one and SETI, and both appeared to run without any operational intervention for nearly all that time. About 5 weeks ago I started getting crash reports from hadcm3lbm programs, but at that time I let them reset and did not take too much interest. However, about three or four days ago I decided to investigate further and noted that the statistics showed Climateprediction activity on only about 7 of the past 40 days, whereas SETI seems to have been running all the time. About 5 days ago I used the Reset Project button, but that does not seem to have had much effect. My message logs are cleared each time my machine is rebooted, so I only have a short history. Can someone suggest a course of action to either track down and remedy the problem or, worst case, to write off any of the data I might have, re-install the project and start again? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The new Coupled Ocean models need a 5.* version of BOINC to work. They upload trickles once a model year, and data every 10 model years. And there are no data files left behind when the model completes. But if it crashes, there WILL be some residue, which must be manually deleted. There is a continuous history of messages in stdoutdae.txt, which is in the BOINC folder. |
Joined: 19 Jul 05 Posts: 11 Credit: 88,836 RAC: 0 |
I have a number of hadcm3lbm sub-folders in my climateprediction.net folder. All but one are empty. I also have a sulphur_hht folder that also contains files. Would those contain the residual files I should delete? Thanks. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes. |
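For anyone wanting to check which sub-folders still hold residue before deleting anything by hand, here is a minimal Python sketch. It only lists folders and removes nothing. The function name and the demo folder and file names are made up for illustration; on a real machine you would pass the path of your own climateprediction.net folder instead of the temporary directory.

```python
import tempfile
from pathlib import Path


def non_empty_subfolders(project_dir):
    """Return the names of immediate sub-folders that still contain something."""
    root = Path(project_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and any(child.iterdir())
    )


# Demonstration on a throwaway directory; these folder and file names are
# hypothetical and do not come from a real BOINC installation.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "hadcm3lbm_empty").mkdir()
    leftover = Path(tmp) / "hadcm3lbm_leftover"
    leftover.mkdir()
    (leftover / "leftover_file.dat").write_text("residue")
    found = non_empty_subfolders(tmp)

print(found)
```

Listing first and deleting manually afterwards keeps the step reversible if a folder turns out to belong to a model that is still running.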
Joined: 19 Jul 05 Posts: 11 Credit: 88,836 RAC: 0 |
All seemed to be going well: installed Version 5.4.9 of BOINC. Climate change software downloaded and then...
20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result hadcm3lbm_bhge_25305105_0 ( - exit code -1073741819 (0xc0000005))
20/07/2006 08:25:05|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
20/07/2006 08:25:05||Rescheduling CPU: application exited
20/07/2006 08:25:05|climateprediction.net|Computation for task hadcm3lbm_bhge_25305105_0 finished
20/07/2006 08:26:07|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
20/07/2006 08:26:07|climateprediction.net|Reason: To fetch work
20/07/2006 08:26:07|climateprediction.net|Requesting 8640 seconds of new work, and reporting 1 completed tasks
20/07/2006 08:26:17|climateprediction.net|Scheduler request succeeded
20/07/2006 08:26:17|climateprediction.net|Message from server: No work sent
20/07/2006 08:26:17|climateprediction.net|Message from server: (reached daily quota of 1 results)
20/07/2006 08:26:17|climateprediction.net|No work from project
The climate task has disappeared (it was there very briefly). Any (polite!) suggestions on what to do next? |
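The two numbers in the error line are the same value written two ways: -1073741819 is the signed 32-bit reading of 0xc0000005, which is the Windows status code for an access violation. A short Python sketch of the conversion follows; the helper name `as_ntstatus` and the regular expression are illustrative choices, and the log line is copied from the post above.

```python
import re

# Log line copied from the post above; names below are illustrative.
LINE = (
    "20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result "
    "hadcm3lbm_bhge_25305105_0 ( - exit code -1073741819 (0xc0000005))"
)

EXIT_RE = re.compile(r"exit code (-?\d+)")


def as_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


match = EXIT_RE.search(LINE)
code = int(match.group(1))
print(code, "->", as_ntstatus(code))
```

Masking with 0xFFFFFFFF is what turns the negative decimal back into the familiar hex form, so either number can be used when searching for advice on the crash.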
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Hi Alan, The topmost sticky in this forum ("Solutions to models crashing ...") discusses this problem in more detail. There are several suggestions, the main one being to check whether any updates are available for your graphics card. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Joined: 19 Jul 05 Posts: 11 Credit: 88,836 RAC: 0 |
Hi Mike, Les and all our readers... This is getting rather time-consuming, but I have tried to be systematic. A browse of the stdoutdae.txt file shows that, apart from an isolated one-off incident in May, the file is free of references to 1073741819 until 4 July. Since then it has been a daily occurrence. Generally the reports are logged between 06:00 and 09:00, a time when my machine is generally 'idle'. Ref the Sticky:
* Send/Don't send: By the time I see this message the BOINC icon is missing from the System Tray, so even if it is still running I am not sure how I can close it in a controlled way. Any ideas?
* Anti-virus: I use ZoneAlarm System Suite. There may be problems here of course, but nothing obvious so far.
* Games: I'm very boring and only play the odd game of solitaire. However, at 6 in the morning the machine is usually just ticking over. I do have a scheduled virus scan, but that kicks in at about 01:00 and takes about an hour.
* Stability Test: Possible I suppose, but it looks as though the test will take my machine out of service for 24 hrs, so that will be the last resort.
* Overheating: I have installed the monitor suggested elsewhere on these boards, and it would appear that the AMD processor is running at around 30 deg C, which is the current ambient temperature and well within the 50 deg upper limit quoted.
* Overclocking: This is an off-the-shelf Mesh PC with an AMD 64 3500+ processor. Unless Mesh set it up to be overclocked, that should not be a problem.
* Backups: So far as I can tell there is nothing to back up. The folders that are created from the downloads are empty by the time I get to look at them post crash.
* Firewall Messages: Have checked the logs, nothing obvious.
* Windows 'time sync': Not seen any messages.
* The benchmark: Log files have different messages to those reported here.
* The Memory requirement: I have 1M RAM and around 50% of that is free when idle.
* Graphics Drivers: Using the tests suggested, my All-in-Wonder 9800 series system seems to be performing OK. Also confirmed DirectX is up to date. However, I have just run a driver update routine and it does appear to have made some changes. I haven't rebooted my machine yet; I decided to send this first as you may never hear from me again :-)
So apart from the drivers, and the possibility that ZoneAlarm may also be a source of problems, most of the options are covered. Two significant points. First, the 'reached daily quota of 1 results' message implies I only get one bite of the cherry each day to see if I have found a fix. Is there any way this can be overridden? Second, what is the role of the hadcm3trans module that appears to be giving me problems? Its name suggests it is transmitting results rather than calculating or displaying stuff, so it would not be employing graphics or lots of processing. If so, the problem would appear to be with I/O, communicating with the server, etc., and firewall or network problems would be the most likely root of it. |
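The manual browse of stdoutdae.txt described above (looking for the exit code and noting the time of day) could be automated along these lines. This is only a sketch: it assumes the file uses the same `date time|project|message` layout as the log lines pasted earlier in this thread, which may not hold for every BOINC version, and the function name `crash_hours` is made up.

```python
from collections import Counter
from datetime import datetime

# Text to look for; taken from the error lines quoted in this thread.
CRASH_MARKER = "exit code -1073741819"


def crash_hours(lines):
    """Count crash reports per hour of day from BOINC-style log lines."""
    hours = Counter()
    for line in lines:
        if CRASH_MARKER not in line:
            continue
        stamp = line.split("|", 1)[0].strip()
        try:
            when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
        except ValueError:
            continue  # timestamp not in the assumed format, skip the line
        hours[when.hour] += 1
    return hours


# Two lines copied from the log posted earlier in the thread.
sample = [
    "20/07/2006 08:25:05|climateprediction.net|Unrecoverable error for result "
    "hadcm3lbm_bhge_25305105_0 ( - exit code -1073741819 (0xc0000005))",
    "20/07/2006 08:26:17|climateprediction.net|Message from server: No work sent",
]

for hour, count in sorted(crash_hours(sample).items()):
    print(f"{hour:02d}:00  {count}")
```

To run it against the real file, the list could be replaced by an open file handle, for example `open(path_to_stdoutdae, errors="replace")`, where `path_to_stdoutdae` is wherever the file lives in your own BOINC folder. A cluster of crashes in one or two hours would point towards something scheduled on the machine at that time.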
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Hi Alan, Nice to see someone being as systematic as you :-) I use ZoneAlarm Security Suite myself and get no problems from it (but I have added the c:\program files\boinc directory to the antivirus and antispyware exclusion lists). hadcm3trans is the 'controlling' task, so if anything happens to it, it's usually bad news. It stops and starts the worker process, and also does graphics rendering. If we're lucky, the driver update may fix the problems, but since there are many possible causes, it's hard to say for sure. The 'one result per day' limit can be overridden by detaching from and reattaching to the project. Do you have an account on the www.climateprediction.net forum (which requires separate registration)? Your machine in its current state is potentially quite valuable to the project, so I'd like to send you a PM (which isn't possible on this board). |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Just to clarify the above post: I meant the 'discussion boards' at http://www.climateprediction.net/board/index.php rather than these ones :-) |
Joined: 19 Jul 05 Posts: 11 Credit: 88,836 RAC: 0 |
Hi, A small change overnight following the installation of the new drivers. Although the log shows the same error code
22/07/2006 01:52:21|climateprediction.net|Unrecoverable error for result hadcm3lbm_91k1_25191188_0 ( - exit code -1073741819 (0xc0000005))
it was for a different module, at a substantially different time, and unless it cleared on its own, there was no associated send/don't send Windows crash report. I'm now registered with the other forum, so let's see what we can learn from this. Alan |