CPDN monitor got quit request (resurrected)
Author | Message |
---|---|
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
This is an old issue and not generally fatal. Nonetheless, one hopes it is on the To Do List to be fixed. Note the H:M:S time before & after re-start. How did it finesse three minutes into 5+ hours? (This is an issue that I thought WAS fixed.)
[Edit: This is from the second set of 4.12 WU, on 4.19, P4 2.8 HT {running parallel with a Sulfur Model}, SuSE 9.0.]
1v12_300107757 - PH 1 TS 000067 - 02/12/1810 09:30 - H:M:S=0000:02:59 AVG= 2.68 DLT= 0.97
CPDN Monitor got quit request...
Detaching shared memory...
2005-04-01 14:00:30 [climateprediction.net] Result 1v12_300107757_0 exited with zero status but no 'finished' file
2005-04-01 14:00:30 [climateprediction.net] If this happens repeatedly you may need to reset the project.
Starting model in /home/jim/CPDNboinc/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc...
Created shared memory region key = 24775
Env Used=LD_LIBRARY_PATH=/home/jim/CPDNboinc/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc:/usr/local/lib:/usr/lib:/lib
Copying files for startup...
In pre_initialise_phase (part 1 of 3)
In initialise_phase (part 2 of 3)
In startup_phase (part 3 of 3)
2005-04-01 14:00:30 [climateprediction.net] Restarting result 1v12_300107757_0 using hadsm3 version 4.12
Starting model ID 1v12_300107757 Phase 1
Stack size=4096.00 MB
Waiting for model startup, this may take a minute...
1v12_300107757 - PH 1 TS 000001 - 01/12/1810 00:30 - H:M:S=0005:18:30 AVG=19110.14 DLT= 0.00
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest. |
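For anyone comparing their own logs: the figures in the post above are consistent with AVG= being elapsed seconds divided by the timestep count. That is an inference from the logged values, not something stated by the project. The sketch below parses the two status lines quoted in the post and recomputes that ratio. The names `parse_status` and `seconds_per_timestep` are made up for illustration and are not part of the CPDN or BOINC software.

```python
import re

# Matches a status line as quoted in the post, for example:
# "1v12_300107757 - PH 1 TS 000067 - 02/12/1810 09:30 - H:M:S=0000:02:59 AVG= 2.68 DLT= 0.97"
STATUS_RE = re.compile(
    r"(?P<model>\S+)\s+-\s+PH\s+(?P<phase>\d+)\s+TS\s+(?P<ts>\d+)\s+-\s+"
    r"(?P<date>\S+\s+\S+)\s+-\s+"
    r"H:M:S=(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)\s+"
    r"AVG=\s*(?P<avg>[\d.]+)\s+DLT=\s*(?P<dlt>[\d.]+)"
)


def parse_status(line):
    """Return the fields of one status line as a dict, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    elapsed_s = (
        int(match.group("h")) * 3600
        + int(match.group("m")) * 60
        + int(match.group("s"))
    )
    return {
        "model": match.group("model"),
        "ts": int(match.group("ts")),
        "elapsed_s": elapsed_s,
        "avg": float(match.group("avg")),
        "dlt": float(match.group("dlt")),
    }


def seconds_per_timestep(status):
    """Assumed meaning of AVG=: elapsed seconds divided by timestep count."""
    if status["ts"] == 0:
        return None
    return status["elapsed_s"] / status["ts"]


# The two status lines from the post, before and after the restart.
BEFORE = "1v12_300107757 - PH 1 TS 000067 - 02/12/1810 09:30 - H:M:S=0000:02:59 AVG= 2.68 DLT= 0.97"
AFTER = "1v12_300107757 - PH 1 TS 000001 - 01/12/1810 00:30 - H:M:S=0005:18:30 AVG=19110.14 DLT= 0.00"

for label, line in (("before", BEFORE), ("after", AFTER)):
    status = parse_status(line)
    print(label, status["elapsed_s"], status["ts"], status["avg"], seconds_per_timestep(status))
```

If that reading is right, the huge AVG= after the restart is simply the 5+ hour elapsed time divided by a timestep count that went back to 1. It says nothing about where the 5+ hours came from.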
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Still not fatal, but three of the last five 4.13 Models started this weekend upchucked, all on TS 60. All on P4s, on SuSE 9.0 or 9.1 (the good part is that elapsed time is now reset to zero, so AVG= has meaning):
2m69_200143283 - PH 1 TS 000060 - 02/12/1810 06:00 - H:M:S=0000:02:43 AVG= 2.73 DLT= 0.91
CPDN Monitor got quit request...
Detaching shared memory...
2005-05-01 14:55:52 [climateprediction.net] Result 2m69_200143283_0 exited with zero status but no 'finished' file
2005-05-01 14:55:52 [climateprediction.net] If this happens repeatedly you may need to reset the project.
2005-05-01 14:55:52 [climateprediction.net] Restarting result 2m69_200143283_0 using hadsm3 version 4.13
Starting model in /home/jim/CPDNboinc/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc...
Created shared memory region key = 24485
Env Used=LD_LIBRARY_PATH=/home/jim/CPDNboinc/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc:/usr/local/lib:/usr/lib:/lib
Copying files for startup...
In pre_initialise_phase (part 1 of 3)
In initialise_phase (part 2 of 3)
In startup_phase (part 3 of 3)
Starting model ID 2m69_200143283 Phase 1
Stack size=4096.00 MB
Waiting for model startup, this may take a minute...
2m69_200143283 - PH 1 TS 000001 - 01/12/1810 00:30 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Hmmm, what is it about Model start-ups? This time, the existing Run croaked, 47 TS into the new Run. Zeroed the existing Run's time, so AVG= garbage now. (P4 3.0, SuSE 9.0)
2ft3_300134952 - PH 1 TS 030470 - 05/09/1812 19:00 - H:M:S=0025:08:08 AVG= 2.97 DLT=11.00
CPDN Monitor got quit request...
Detaching shared memory...
2005-05-02 16:05:32 [climateprediction.net] Result 2ft3_300134952_0 exited with zero status but no 'finished' file
2005-05-02 16:05:32 [climateprediction.net] If this happens repeatedly you may need to reset the project.
Starting model in /home/jim/CPDNboincSM/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc...
Created shared memory region key = 25150
Env Used=LD_LIBRARY_PATH=/home/jim/CPDNboincSM/projects/climateapps2.oucs.ox.ac.uk_cpdnboinc:/usr/local/lib:/usr/lib:/lib
2005-05-02 16:05:32 [climateprediction.net] Restarting result 2ft3_300134952_0 using hadsm3 version 4.13
Starting model ID 2ft3_300134952 Phase 1
Stack size=4096.00 MB
Waiting for model startup, this may take a minute...
2ft3_300134952 - PH 1 TS 030385 - 04/09/1812 00:30 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00
2mhf_200143689 - PH 1 TS 000050 - 02/12/1810 01:00 - H:M:S=0000:02:38 AVG= 3.18 DLT=10.37
2mhf_200143689 - PH 1 TS 000051 - 02/12/1810 01:30 - H:M:S=0000:02:40 AVG= 3.15 DLT= 2.00
2mhf_200143689 - PH 1 TS 000052 - 02/12/1810 02:00 - H:M:S=0000:02:41 AVG= 3.11 DLT= 1.00
2mhf_200143689 - PH 1 TS 000053 - 02/12/1810 02:30 - H:M:S=0000:02:42 AVG= 3.07 DLT= 1.00
2mhf_200143689 - PH 1 TS 000054 - 02/12/1810 03:00 - H:M:S=0000:02:44 AVG= 3.05 DLT= 2.00
2mhf_200143689 - PH 1 TS 000055 - 02/12/1810 03:30 - H:M:S=0000:02:45 AVG= 3.01 DLT= 1.00
2ft3_300134952 - PH 1 TS 030386 - 04/09/1812 01:00 - H:M:S=0000:00:12 AVG= 0.00 DLT=12.01
2ft3_300134952 - PH 1 TS 030387 - 04/09/1812 01:30 - H:M:S=0000:00:13 AVG= 0.00 DLT= 1.85
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest. |
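When two models write to the same console, as in the log above, it helps to track each model ID separately and flag any point where its timestep counter goes backwards. The following is a rough sketch under that assumption. `find_rollbacks` is a hypothetical helper, not a CPDN or BOINC tool, and the sample lines are a subset of those quoted in the post.

```python
import re

# Pull the model ID, timestep, and elapsed time out of a status line.
LINE_RE = re.compile(
    r"(?P<model>\w+)\s+-\s+PH\s+\d+\s+TS\s+(?P<ts>\d+)\s+-.*?"
    r"H:M:S=(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)"
)


def find_rollbacks(lines):
    """Report every place where a model's timestep counter decreases."""
    last_seen = {}  # model ID -> (timestep, elapsed seconds)
    events = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # BOINC messages and startup chatter are skipped
        model = match.group("model")
        ts = int(match.group("ts"))
        elapsed_s = (
            int(match.group("h")) * 3600
            + int(match.group("m")) * 60
            + int(match.group("s"))
        )
        if model in last_seen:
            prev_ts, prev_elapsed = last_seen[model]
            if ts < prev_ts:
                events.append({
                    "model": model,
                    "from_ts": prev_ts,
                    "to_ts": ts,
                    "lost_ts": prev_ts - ts,
                    "elapsed_before_s": prev_elapsed,
                    "elapsed_after_s": elapsed_s,
                })
        last_seen[model] = (ts, elapsed_s)
    return events


SAMPLE = [
    "2ft3_300134952 - PH 1 TS 030470 - 05/09/1812 19:00 - H:M:S=0025:08:08 AVG= 2.97 DLT=11.00",
    "CPDN Monitor got quit request...",
    "2ft3_300134952 - PH 1 TS 030385 - 04/09/1812 00:30 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00",
    "2mhf_200143689 - PH 1 TS 000050 - 02/12/1810 01:00 - H:M:S=0000:02:38 AVG= 3.18 DLT=10.37",
    "2mhf_200143689 - PH 1 TS 000051 - 02/12/1810 01:30 - H:M:S=0000:02:40 AVG= 3.15 DLT= 2.00",
    "2ft3_300134952 - PH 1 TS 030386 - 04/09/1812 01:00 - H:M:S=0000:00:12 AVG= 0.00 DLT=12.01",
]

for event in find_rollbacks(SAMPLE):
    print(event)
```

On these lines the only decrease is for 2ft3_300134952, which resumes at TS 030385 after reaching TS 030470, with its elapsed time reset to zero. Whether TS 030385 corresponds to a saved checkpoint is a guess; the log itself does not say.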
©2024 cpdn.org