climateprediction.net (CPDN) home page
Thread 'My model just crashed saying "output file...absent". Any idea what caused that?'

Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 28499 - Posted: 7 May 2007, 8:58:45 UTC

Oh well, I was too confident and I left network communication open thinking everything was great, so now the server has cancelled my 44% completed model.

Any ideas what caused the problem? I have another model that is at 55% and still running fine on the same machine.

06/05/2007 19:27:56|climateprediction.net|Sending scheduler request: To send trickle-up message
06/05/2007 19:27:56|climateprediction.net|(not requesting new work or reporting completed tasks)
06/05/2007 19:28:01|climateprediction.net|Scheduler RPC succeeded [server version 509]
06/05/2007 21:03:17|climateprediction.net|Sending scheduler request: To send trickle-up message
06/05/2007 21:03:17|climateprediction.net|(not requesting new work or reporting completed tasks)
06/05/2007 21:03:22|climateprediction.net|Scheduler RPC succeeded [server version 509]
06/05/2007 21:56:25|climateprediction.net|Computation for task hadcm3pbb_ccem_05850025_0 finished
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_8.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_9.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_10.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_11.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_12.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_13.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_14.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_15.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:25|climateprediction.net|Output file hadcm3pbb_ccem_05850025_0_16.zip for task hadcm3pbb_ccem_05850025_0 absent
06/05/2007 21:56:26|climateprediction.net|Deferring communication for 1 min 0 sec
06/05/2007 21:56:26|climateprediction.net|Reason: Unrecoverable error for result hadcm3pbb_ccem_05850025_0 (<file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_8.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_9.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_10.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_11.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_12.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_13.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_14.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3pbb_ccem_05850025_0_15.zip</file_
07/05/2007 00:20:28|climateprediction.net|Sending scheduler request: To report completed tasks
07/05/2007 00:20:28|climateprediction.net|Reporting 1 tasks
07/05/2007 00:20:33|climateprediction.net|Scheduler RPC succeeded [server version 509]
07/05/2007 08:44:45|climateprediction.net|Sending scheduler request: To send trickle-up message
07/05/2007 08:44:45|climateprediction.net|(not requesting new work or reporting completed tasks)
07/05/2007 08:44:50|climateprediction.net|Scheduler RPC succeeded [server version 509]

I have a backup from 40% so I could restore but the server would not be able to recognise it.

Cheers

Digby
ID: 28499
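
Editor's note: the log excerpt above can be scanned mechanically to list which output files were reported absent and which error codes came back. The following is a minimal sketch, assuming the message log has been saved as plain text in the `date time|project|message` layout shown in the post. The function name `summarise_log` and the regular expressions are illustrative and are not part of BOINC.

```python
import re

# Patterns modelled on the log lines quoted in the post (assumed layout).
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")
XFER_ERROR_RE = re.compile(
    r"<file_name>([^<]+)</file_name>\s*<error_code>(-?\d+)</error_code>"
)


def summarise_log(lines):
    """Return (absent, errors).

    absent maps task name -> list of output files reported absent.
    errors maps file name -> integer error code from <file_xfer_error>.
    """
    absent = {}
    errors = {}
    for line in lines:
        # Each line is expected to look like: date time|project|message
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # skip anything that is not a log line
        message = parts[2]
        match = ABSENT_RE.search(message)
        if match:
            absent.setdefault(match.group(2), []).append(match.group(1))
        for name, code in XFER_ERROR_RE.findall(message):
            errors[name] = int(code)
    return absent, errors


# Three lines taken from the log in the post (the last one shortened).
sample = [
    "06/05/2007 21:56:25|climateprediction.net|Output file "
    "hadcm3pbb_ccem_05850025_0_8.zip for task hadcm3pbb_ccem_05850025_0 absent",
    "06/05/2007 21:56:25|climateprediction.net|Output file "
    "hadcm3pbb_ccem_05850025_0_9.zip for task hadcm3pbb_ccem_05850025_0 absent",
    "06/05/2007 21:56:26|climateprediction.net|Reason: Unrecoverable error for "
    "result hadcm3pbb_ccem_05850025_0 (<file_xfer_error> "
    "<file_name>hadcm3pbb_ccem_05850025_0_8.zip</file_name> "
    "<error_code>-161</error_code></file_xfer_error>",
]

absent, errors = summarise_log(sample)
print(absent)
print(errors)
```

A summary like this only shows which files the client could not find; the underlying reason for the crash is recorded on the model's result page on the project site.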
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 28503 - Posted: 7 May 2007, 9:52:00 UTC

If you look at the error list on the page for that model, you'll see this:
Model crashed: umshell1.f: ATM_DYN : NEGATIVE THETA DETECTED.

In other words, that combination of starting values is non-viable for producing a long run.
Which is bad luck for you, but good news for the researchers, as they now know another set that doesn't work.

ID: 28503
Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 28513 - Posted: 7 May 2007, 14:56:01 UTC - in response to Message 28503.  

OK, I had a look and, now that you have explained it, I can see the message you are referring to. My model was actually showing a Global Mean Air Temp of 17.4 deg in 1991, which was a little on the high side. So I presume the model just said 'this is not realistic because we know it was not 17.4 deg in 1991 so I am going to stop'.

This is my second model that has aborted because the parameters were not viable, and I am hopeful that the project can capture these parameter settings and learn which ones work and which don't.

Is there anything else I should do before clicking 'Allow New Projects' and downloading another one?

Thanks

ID: 28513
astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 28517 - Posted: 7 May 2007, 17:55:14 UTC

Should be good to go. (Upload Server is "full"; don't know how that might affect downloads.)

Better luck with the new Model. If it's any consolation, the "impossible situation" failures give good information to the Project team, and many of us have them. (I had eight Negative Pressure failures across five machines, all but one with 5.15 series Models. One from the new series...)

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 28517
Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 28531 - Posted: 8 May 2007, 8:16:12 UTC - in response to Message 28517.  

OK Thanks.
ID: 28531
mo.v
Volunteer moderator

Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 28533 - Posted: 8 May 2007, 10:02:07 UTC

Keep making backups. Even if the model has a fatal crash (as long as it isn't something wrong within the model itself), you can restore the backup and continue with the same model. The server will recognise your model.
Cpdn news
ID: 28533
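
Editor's note: the backup advice above can be sketched as a small script that copies a data folder into a new timestamped folder. This is only an illustration under stated assumptions: the helper name `backup_data_dir` is made up, the paths in the demo are temporary stand-ins rather than real BOINC locations, and it assumes the BOINC client has been shut down completely before copying so the model files are in a consistent state.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_data_dir(data_dir, backup_root, now=None):
    """Copy data_dir into a new timestamped folder under backup_root.

    Returns the path of the new backup folder. The caller is assumed to
    have stopped the BOINC client first (an assumption, not checked here).
    """
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {data_dir}")
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{data_dir.name}-{stamp}"
    # copytree creates the target (and missing parents) and fails if the
    # target already exists, so an older backup is never overwritten.
    shutil.copytree(data_dir, target)
    return target


if __name__ == "__main__":
    # Demo on a throwaway folder; replace with your own data folder path.
    with tempfile.TemporaryDirectory() as tmp:
        fake_data = Path(tmp) / "BOINC"
        fake_data.mkdir()
        (fake_data / "client_state.xml").write_text("<client_state/>")
        saved = backup_data_dir(fake_data, Path(tmp) / "backups")
        print("backup written to", saved)
```

Restoring would be the reverse copy, again with the client stopped. As noted in the post, a restore does not help when the fault lies within the model itself.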
Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 28538 - Posted: 8 May 2007, 21:25:04 UTC - in response to Message 28533.  
Last modified: 8 May 2007, 21:25:51 UTC

Not this time Mo...because Les replied earlier and said that the model crashed because the starting values were non-viable...

[quote]If you look at the error list on the page for that model, you'll see this:
Model crashed: umshell1.f: ATM_DYN : NEGATIVE THETA DETECTED.

In other words, that combination of starting values is non-viable for producing a long run.
Which is bad luck for you, but good news for the researchers, as they now know another set that doesn't work.[/quote]

So this morning I downloaded another model. It is different this time: 'HadCM3 Coupled Model Experiment Optimised File I/O 5.40'. It actually takes up 100MB of memory...

Also my other model is trying to upload a ZIP file but the server will not take it. Do you know when they will sort the server out...this is looking a bit amateurish...I am not used to working with science projects and I suppose I'll be told that the budget is tight.

Thanks anyway

Digby
ID: 28538
mo.v
Volunteer moderator

Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 28539 - Posted: 8 May 2007, 21:32:54 UTC
Last modified: 8 May 2007, 21:42:53 UTC

Yes, I realised that - it's what I meant when I said 'as long as it isn't something wrong within the model itself'. I meant that backups might prove useful in the future.

It's a nuisance when models produce these values that are impossible in the real world. But I suppose that's what research is like. Not every result will easily fit into a clear pattern.

Best of luck with your new model.

Yes, there have been server problems since Friday. Don't let the zip files keep trying to upload - just suspend network activity for another day or two. When the server problem is fixed we'll announce it in the News thread. Get to it through my signature, or it's the top thread in the Number crunching section of this forum.

Data storage is a problem for more than one BOINC project.

Edit - Les was lightning-quick again!
Cpdn news
ID: 28539
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 28540 - Posted: 8 May 2007, 21:33:00 UTC


The server issue is being discussed on the News thread on the Number crunching board.

The new server arrived, but a lot of other people were also in the computer room working on their equipment, so it had to wait a while. Right now the software is slowly building the RAID 5 system on the new server.
I don't know if this is amateurish or not, but life isn't always plain sailing.

ID: 28540
