Message boards : Number crunching : Model crashed: REPLANCA: Current time precedes start time of data
Author | Message |
---|---|
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Is this a new problem? My computer's time is correct, if that matters. UK Met Office HADAM3P European Region v6.09 Stderr: <core_client_version>6.10.45</core_client_version> <![CDATA[ <stderr_txt> Model crashed: REPLANCA: Current time precedes start time of data tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_c1fl_1997_1_008565930_0_1.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_c1fl_1997_1_008565930_0_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> [etc.] |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,005,674 RAC: 21,647 |
The REPLANCA error has certainly been discussed in the past. It is almost certainly a model problem rather than anything to do with your computer. I can't remember whether it was OS-dependent or not. I see that I have just downloaded some of these tasks, and given that I am also a Windows-free zone, I do not feel optimistic. I can't remember whether it was here or on the other, now defunct, board that I saw it discussed. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
Thanks, both. Passed on to project staff. Since these models are marked '1997' rather than '2013', I don't know whether they're part of the flood analysis or something else entirely. Will report back when more information becomes available. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,005,674 RAC: 21,647 |
Thanks Ian. I see that some other tasks in some of the work units have failed with the REPLANCA error since I last looked, including some running on Windows boxes, so at least on this occasion it is not OS-dependent. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I had four 1997 EUR models downloaded on 17 March. All crashed at exactly the same moment (1m21s) with REPLANCA. As soon as they'd started I tried to open the graphics globe of one of them to see whether there was anything of interest. The attempt to open the graphics window failed (the window was just an outline filled with black) but it had a terrible effect on BOINC Manager. All visible progress by other normal tasks in the Tasks pane froze and then the whole Tasks pane (or maybe the whole of BM, I can't remember) greyed out. After the four tasks had crashed, BM returned to normal and the graphics window showed a blue globe with zero crunching recorded. REPLANCA gives BOINC a bit of a fright. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,005,674 RAC: 21,647 |
Which leads to the question: should I just delete the two 1997 models I have downloaded now, before they start? This would give my computer a chance of picking up anything else going. Alternatively, I could briefly suspend the 2014 models I have running to confirm that the 1997 ones crash out. I did the latter and they duly crashed. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
For anyone reading this with 1997 EUR models created on 17 March 2014 it would be worth doing the same as Dave: force them to run immediately by suspending other tasks. As they will almost certainly crash after a few seconds you will then be more likely to receive work when the next good batch of models appears. |
Send message Joined: 22 Mar 06 Posts: 144 Credit: 24,695,428 RAC: 0 |
I've just had 8 of these crash on 1290283. I see they have been passed on to other PCs, presumably to crash again. My machine runs Win7 64-bit, so it looks like a model error. Shame, as things were going quite well. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
At least these models aren't consuming much time or electricity. When we see crash reasons for climate models spelled out in capital letters (REPLANCA, not Replanca or replanca) I now assume this follows some Met Office convention (they wrote the code) and means the cause is probably intrinsic to the model. Not that all model defects produce an explanation in upper case. They don't. One of my models waiting to be crunched, from a completely different batch, crashed on another computer with the stderr line "Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO". I've no idea what this means, but I'm assuming that the capital letters indicate a model with a different intrinsic defect. |
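For anyone who wants to sort crashed tasks by the upper-case name in the stderr line, here is a minimal Python sketch. It assumes the line always has the shape "Model crashed: NAME: message", which is inferred only from the two examples quoted in this thread; the function name `parse_crash_line` is made up for illustration and is not part of BOINC or the CPDN code.

```python
import re
from typing import Optional, Tuple

# Assumed line shape, inferred from the examples in this thread:
#   "Model crashed: <UPPER-CASE NAME>: <free-text message>"
CRASH_RE = re.compile(r"Model crashed:\s*([A-Z][A-Z0-9_]*):\s*(.+)")


def parse_crash_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, message) for a 'Model crashed' line, else None."""
    match = CRASH_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    examples = [
        "Model crashed: REPLANCA: Current time precedes start time of data",
        "Model crashed: READHIST: End of file in READ from history file "
        "for namelist NLIHISTO",
        "Called boinc_finish",
    ]
    for text in examples:
        print(parse_crash_line(text))
```

Lines that do not match, such as "Called boinc_finish", give `None`, so they can simply be skipped when counting crash reasons per batch.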
Send message Joined: 21 Oct 10 Posts: 53 Credit: 2,101,753 RAC: 3,985 |
I got the same issue on a Win7 machine with 2 WUs (this and that). Too bad, since the computer where it happened only runs 2 CPDN WUs and has no Internet access; I'm forced to use a USB key and an old portable version of BOINC to keep it running. And considering how difficult it is to get WUs lately, pfff... |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
REPLANCA is the name of a Fortran function (or should I say FORTRAN) that deals with times. The kind of thing that sends it into a spin is running a 360 days per year model as if it's 365 days per year. I don't suppose that's the case here but there has been some model configuration error somewhere. Unless the project team find a way of neatly pulling that batch then we just have to do it the messy way. |
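To illustrate the 360-versus-365 point, here is a small Python sketch. It is not the Met Office code and makes no claim about how REPLANCA actually stores or compares times; the reference year 1900, the date 1 December 1996 and both function names are hypothetical choices for the example. It only shows that the same calendar date becomes a smaller day count in a 360-day calendar, so a "current time" computed one way can appear to precede a "start time of data" computed the other way.

```python
from datetime import date


def days_360(year: int, month: int, day: int, ref_year: int = 1900) -> int:
    """Days since 1 Jan ref_year in a 360-day calendar (12 months of 30 days)."""
    return (year - ref_year) * 360 + (month - 1) * 30 + (day - 1)


def days_gregorian(year: int, month: int, day: int, ref_year: int = 1900) -> int:
    """Days since 1 Jan ref_year in the ordinary (Gregorian) calendar."""
    return (date(year, month, day) - date(ref_year, 1, 1)).days


if __name__ == "__main__":
    # Hypothetical case: model and data file both mean "1 Dec 1996",
    # but each counts days with a different calendar.
    current = days_360(1996, 12, 1)           # model clock, 360-day year
    data_start = days_gregorian(1996, 12, 1)  # data file, Gregorian year
    print(current, data_start)
    if current < data_start:
        print("Current time precedes start time of data")
```

With these assumptions the two counts differ by roughly five days for every year since the reference date, which is more than enough to trip a simple "is the current time before the data starts?" check.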
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Hi folks, I'm using this old REPLANCA thread to report a pnw25 model from batch 757 that gave the following error on 2 machines; the 3rd attempt will be on one of my Win7 machines. Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xadae.pipe_dummy Leaving CPDN_Main::Monitor... 04:33:59 (37580): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>wah2_pnw25_puy7_206809_28_757_011648678_1_r1093481325_28.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> <file_xfer_error> <file_name>wah2_pnw25_puy7_206809_28_757_011648678_1_r1093481325_restart.zip</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]> Is this a problem with the batch or a normal model crash? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
That's a known problem with that batch. Some list ran out before the others. Ooops. But not for all of the batch. I emailed them on the 14th when I had it. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Should I abort then or let it try? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
You may as well Abort it. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,005,674 RAC: 21,647 |
I have one from that batch that is on its first try so will let it run and see what happens especially as there doesn't seem to be any new work at the moment. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
For batches 754 and 757 I've had more successes than failures (all failures being REPLANCA errors): batch 754 = 7/9 and batch 757 = 3/5. So I am continuing to run them. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,005,674 RAC: 21,647 |
One of my 757s has failed once, as has my 754, so I will check whether those failures were REPLANCA errors. Edit: All the second/third-run ones are seg faults, so I will let them run. |
©2024 cpdn.org