Visual Fortran Run-Time Error FAQ/Fix
Author | Message |
---|---|
Joined: 19 Mar 05 Posts: 8 Credit: 643,760 RAC: 424 |
I just reset the project each time it does this. I suspect other users just dump the project or shrug. This relationship is supposed to be lightweight on our part. I have read that this is caused by I/O errors, access conflicts, BOINC, time sync, etc., and some postings explain log messages that are not related. I wish you'd write a FAQ, if not a fix! |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi David, I saw your similar post yesterday at the end of a long thread about this type of error message and looked at the models you've had and what error codes and Run log text the crashed models had generated. (Click the + on each model's web page record to see the text for finished models. Sometimes plenty of text is generated, sometimes very little.) It may be called stderrout, not Run log.
You have a lot of models that crashed with error code 22 and short text indicating a data error, or longer text saying 'The device does not recognize the command'. The 'device' means your own hard disk and I think that in all these cases the model failed to write its data to your disk. Here's one example: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7868042
I'm not sure whether you aborted every model that got a Visual Fortran Runtime Error message in BOINC manager, or whether you let some of them continue and they later crashed with the device + command error. Some people's models have received the BOINC manager Fortran message, but then continued successfully. I don't think this message is necessarily a reason to abort the model. Could other members please comment on this?
As far as the code 22 crashed models are concerned, it's unfortunately a generic error code encompassing many possible crash scenarios. The best advice is for you to look at the CPDN README collections, to be found on the CPDN independent forum here near the top: http://www.climateprediction.net/board/index.php Separate registration is required there but everyone can read the forum. There's a README collection specifically about crashes and problems. I suggest looking at link #6 in that collection, written by MikeMars. It lists ways of avoiding all the common causes of model crashes including 22 errors.
Several models you crunched a while ago crashed with a 107 or -107 code. Both are closely related. They could mean you sometimes turned off the computer without fully exiting from BOINC (you need to right-click on the BOINC icon and select Exit, or in BOINC manager File > Exit). Or you could get these error codes through running the CPDN screensaver, which causes graphics and memory problems on some computers. Best to disable the screensaver.
Any of these codes could indicate that your computer is unstably overclocked, or that your computer needs a graphics card driver update, which is free of charge on the web. Post #7 by Thyme Lawn in the same README collection about crashes and problems explains how to get these updates. (I think I need to edit the links to these READMEs in my signature!)
It's not usually a good idea to abort a model unless you see that it isn't processing correctly. Many of us regularly back up the complete contents of the BOINC folder so that if a model does crash we can restore the backup and continue the model from there. There's another README collection about how to make these backups. I just use Les Bayliss's easy manual backup/restore methods which work perfectly.
The READMEs are much more complete and up-to-date than the CPDN FAQs. I hope these suggestions are useful.
Cpdn news |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
[mo.v wrote:] ... Some people's models have received the BOINC manager Fortran message, but then continued successfully. I don't think this message is necessarily a reason to abort the model. Could other members please comment on this? ...
The occasions when I have had these errors are when a non-BOINC process has taken 100% of the CPU for an extended period: a printer driver one time, a utility application another. Both times the other process was killed, the machine re-booted, and the CPDN model carried on without a problem. |
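For anyone who would rather script the backup/restore idea mo.v mentions above than copy folders by hand, here is a rough Python sketch. It is not the method from the README collections or Les Bayliss's procedure, just a plain folder copy. The function names and the paths are hypothetical, and it assumes you have fully exited BOINC first (File > Exit) so that no model files are being written during the copy.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_folder(boinc_dir: Path, backup_root: Path) -> Path:
    """Copy the complete BOINC folder into a new timestamped backup folder.

    Assumption: BOINC has been fully exited before this is called.
    """
    if not boinc_dir.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {boinc_dir}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"boinc-backup-{stamp}"
    # copytree creates the target folder and refuses to overwrite an existing one
    shutil.copytree(boinc_dir, target)
    return target


def restore_boinc_folder(backup_dir: Path, boinc_dir: Path) -> None:
    """Replace the current BOINC folder with the contents of a backup.

    Assumption: BOINC has been fully exited before this is called.
    """
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"Backup folder not found: {backup_dir}")
    if boinc_dir.exists():
        # Remove the crashed state so the restored copy is an exact match
        shutil.rmtree(boinc_dir)
    shutil.copytree(backup_dir, boinc_dir)
```

Usage would be something like `saved = backup_boinc_folder(Path("C:/path/to/BOINC"), Path("D:/backups"))` before shutting down, and `restore_boinc_folder(saved, Path("C:/path/to/BOINC"))` after a crash; both paths here are placeholders, so substitute wherever your own BOINC folder lives.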
©2024 cpdn.org