climateprediction.net (CPDN) home page
Thread 'Computation error'

old_user430291
Joined: 8 Feb 07
Posts: 4
Credit: 544
RAC: 0
Message 26973 - Posted: 21 Feb 2007, 8:34:15 UTC

One of the two tasks of the HadCM3 Coupled Model Experiment 5.15 stopped with the status 'Computation error'. What should be done in such a case? Abort, restart the task, or something else?

Luigi
ID: 26973
old_user170894
Joined: 3 Mar 06
Posts: 96
Credit: 353,185
RAC: 0
Message 26979 - Posted: 21 Feb 2007, 17:59:37 UTC - in response to Message 26973.  
Last modified: 21 Feb 2007, 18:03:32 UTC

In your log files or messages, can you find the error code (number) that accompanied the Computation Error? The error code gives more information than Computation Error and helps determine the cause and the cure.

I see 2 other models (tasks) crashed on you on Feb. 8 and Feb. 13. It seems you have a problem that needs to be fixed before you restart or try a new model.

I also see you have a P4, and it looks like you are running 2 models simultaneously with Hyper-Threading. Is that correct? Maybe you could try running just 1 model at a time instead of 2? If you now have just 1 model running, I suggest leaving it that way for now; do not try to restart the crashed model.

Somebody with more experience will come along soon and offer more advice. But if you could find and report the precise error code for Computation Error it would be helpful.

P.S. Do you have a backup of the BOINC directory?
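For anyone who wants to make such a backup, here is a minimal Python sketch that zips a BOINC directory into a timestamped archive. The function name, the archive name, and both paths are illustrative assumptions, not anything the project prescribes. It is probably safest to run it while BOINC is not running, so the files are not changing during the copy.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_boinc_dir(boinc_dir, dest_dir):
    """Zip the whole BOINC directory into dest_dir and return the archive path.

    Both arguments are supplied by the caller; dest_dir should sit outside
    boinc_dir so the archive does not end up inside the tree being archived.
    """
    src = Path(boinc_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Timestamped name (hypothetical naming scheme) so older backups are kept.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base = dest / f"boinc-backup-{stamp}"

    # make_archive appends ".zip" itself and returns the full archive path.
    archive = shutil.make_archive(str(base), "zip", root_dir=str(src))
    return Path(archive)
```

A call would look something like `backup_boinc_dir(r"C:\Program Files\BOINC", r"D:\backups")`, with both paths replaced by wherever BOINC and your backup folder actually live.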


ID: 26979
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 26981 - Posted: 21 Feb 2007, 20:37:51 UTC

Luigi,
The errors were both for -107... That often indicates an issue with graphics. For assistance see here:
Mike's post suggests ways to avoid crashes (Solutions to models crashing: -161 error, or -1073741819 (0xc0000005)):
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=4231
Les' comments for Exit Code -1 and -107... here:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=4710#23372
Thyme Lawn's Testing Graphics Compatibility & driver update:
http://bbc.cpdn.org/forum_thread.php?id=1038&nowrap=true#3977
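As an aside on the two forms of the same code mentioned above, -1073741819 and 0xc0000005: the decimal value is just the 32-bit hexadecimal value read as a signed integer. A small Python sketch of the conversion follows; the function names are made up for illustration.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def format_exit_code(exit_code: int) -> str:
    """Show an exit code in both its signed decimal and its hex form."""
    return f"{exit_code} (0x{to_unsigned32(exit_code):08x})"


print(format_exit_code(-1073741819))  # -1073741819 (0xc0000005)
```

Searching for the hex form is often more fruitful, since that is how Windows status codes are usually written.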

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 26981
old_user430291
Joined: 8 Feb 07
Posts: 4
Credit: 544
RAC: 0
Message 27121 - Posted: 28 Feb 2007, 9:27:53 UTC - in response to Message 26979.  

Hi,
I could not find the log files. I looked for something resembling a log file, but there is a lot of stuff. Could you tell me where they are, starting from the path BOINC\projects\climateprediction.net? (I suppose the log files are in there somewhere.) Only one model is running now, and I will not restart the crashed model. If it helps, I can make a zip file of the BOINC directory. Thanks for your support.

Luigi
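One way to hunt for the error code without opening every file is to scan the BOINC directory for the client's log files and pull out any line that mentions an exit code. The Python sketch below does that under two assumptions that should be checked against your own installation: that the client logs are named stdoutdae.txt and stderrdae.txt, and that the relevant lines contain wording like "exit code -107" or "exited with code -107". The function name is hypothetical.

```python
import re
from pathlib import Path

# Assumed names of the BOINC client log files; adjust if yours differ.
CANDIDATE_NAMES = ("stdoutdae.txt", "stderrdae.txt")

# Assumed wording: "exit code N", "exit status N", "exited with code N".
EXIT_CODE_RE = re.compile(
    r"exit(?:ed)?\s+(?:with\s+)?(?:code|status)\s+(-?\d+)", re.IGNORECASE
)


def find_exit_codes(boinc_dir):
    """Return (file name, exit code, line) for each matching log line."""
    root = Path(boinc_dir)
    if not root.is_dir():
        return []

    hits = []
    for path in sorted(root.rglob("*.txt")):
        if path.name.lower() not in CANDIDATE_NAMES:
            continue
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        for line in path.read_text(errors="replace").splitlines():
            match = EXIT_CODE_RE.search(line)
            if match:
                hits.append((path.name, int(match.group(1)), line.strip()))
    return hits
```

Calling `find_exit_codes` on the top-level BOINC directory and printing each tuple should show which task exited with which code, if the assumptions about file names and wording hold.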



ID: 27121
old_user430291
Joined: 8 Feb 07
Posts: 4
Credit: 544
RAC: 0
Message 27122 - Posted: 28 Feb 2007, 9:36:42 UTC - in response to Message 26981.  

Hi,
it could be. The only thing I can see is that the 'show graphics' command in the BOINC Manager is disabled for the crashed task.

Luigi



ID: 27122
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 27129 - Posted: 28 Feb 2007, 14:24:33 UTC
Last modified: 28 Feb 2007, 14:34:35 UTC

Luigi, I think that if you read the 3 links that Astro gave you, that should be sufficient to avoid more -107 errors in future.

You've had 2 crashed models. They are #3 and #4 on the list linked below. They will have disappeared from your BOINC Manager, which is correct: when a model crashes, it disappears and the server sends you a new model to replace it. You have had 2 crashed models, so you've had 2 replacements.

There are also 2 current (replacement) models listed as 'in progress' on your computer (the first 2 in the list):

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=538342

but only one of these two models is apparently running. Can you see in your BOINC Manager that one model is running? Can you see its graphics globe?

Are there 2 models in the projects section of your BOINC Manager?

On the assumption that there are two models, are you sure you haven't suspended the model that isn't running? And in your project preferences, check that you are allowing the computer to use 2 cores, not one. (I am assuming you want to run 2 models simultaneously.)




ID: 27129
old_user430291
Joined: 8 Feb 07
Posts: 4
Credit: 544
RAC: 0
Message 27131 - Posted: 28 Feb 2007, 17:10:59 UTC - in response to Message 27129.  

Hi,
the running model is the one referenced by Result ID 6403399. The other one is not running, and its status is "ready to report". There is only one model in the projects section. I can see the graphics for only one task, the running one; the other one has the command disabled. I haven't figured out how to set BOINC to use the two cores from either the advanced or the simple view.








ID: 27131
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 27135 - Posted: 28 Feb 2007, 18:59:29 UTC

"Ready to report" means that the model is finished, either completed or crashed.
A "Finished" message will be uploaded to the server about 2 hours after the other data.

The preference for selecting the number of processors is on your Account page on the server, not on your computer.

ID: 27135
