Thread 'Waiting for model to start... forever'

old_user10205

Joined: 2 Sep 04
Posts: 4
Credit: 4,367
RAC: 0
Message 2779 - Posted: 3 Sep 2004, 0:31:04 UTC
Last modified: 3 Sep 2004, 6:06:45 UTC

I downloaded the new BOINC client (4.05) and set up the climate project. The downloads and installation appeared to go well, but when the model starts, it hangs.

"Waiting for model startup, this may take a minute...
06x7_100033997 - PH 1 TS 000001 - 00/00/0000 00:00 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00"

No further activity or messages.
All seems to have halted.

Does anyone know what's going on?
Should I kill this project and wait a few months for the coders to debug things?

Further information:
Below is the log after I waited most of the day.
Something is messed up with this project:
___________________________________________________________

Model crashed...retrying...restart level 0
Preparing for restart...
Rewinding a model-day...
Starting model ID 1qlb_000101949 Phase 1
Stack size=4096.00 MB
Waiting for model startup, this may take a minute...
1qlb_000101949 - PH 1 TS 000001 - 00/00/0000 00:00 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00
Model timeout at 180.00 seconds
Model crashed...retrying...restart level 1
Preparing for restart...
Rewinding a model-month...
Error: Restart files for dataout/restart.month not found
Giving up, this result exceeded crash count for available restart files.
adding: ncatts.cpdc (deflated 72%)
adding: climate.cont (deflated 79%)
adding: climate.cpdc (deflated 79%)
adding: climate.doub (deflated 79%)
adding: climate.spin (deflated 79%)
adding: 1qlb_000101949.xml (deflated 65%)
adding: ncatts.cpdc (deflated 72%)
adding: ncatts.cpdc (deflated 72%)
adding: ncatts.cpdc (deflated 72%)
adding: stderr_um.txt (deflated 75%)
adding: yabsd.out (deflated 93%)
adding: restart.day (deflated 43%)
2004-09-02 20:19:00 [climateprediction.net] Unrecoverable error for result 1qlb_000101949_0 (process exited with code 251 (0xfb))
2004-09-02 20:19:00 [climateprediction.net] Unrecoverable error for result 1qlb_000101949_0 (process exited with code 251 (0xfb))
2004-09-02 20:19:00 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-02 20:19:00 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-02 20:19:00 [climateprediction.net] Computation for result 1qlb_000101949 finished
2004-09-02 20:19:00 [climateprediction.net] Started upload of 1qlb_000101949_0_1.zip
2004-09-02 20:19:00 [climateprediction.net] Started upload of 1qlb_000101949_0_2.zip
2004-09-02 20:19:02 [climateprediction.net] Finished upload of 1qlb_000101949_0_1.zip
2004-09-02 20:19:02 [climateprediction.net] Throughput 1341 bytes/sec
2004-09-02 20:19:02 [climateprediction.net] Started upload of 1qlb_000101949_0_3.zip
2004-09-02 20:19:03 [climateprediction.net] Finished upload of 1qlb_000101949_0_2.zip
2004-09-02 20:19:03 [climateprediction.net] Throughput 17717 bytes/sec
2004-09-02 20:19:03 [climateprediction.net] Started upload of 1qlb_000101949_0_4.zip
2004-09-02 20:19:04 [climateprediction.net] Finished upload of 1qlb_000101949_0_4.zip
2004-09-02 20:19:04 [climateprediction.net] Throughput 2033 bytes/sec
2004-09-02 20:19:04 [climateprediction.net] Started upload of 1qlb_000101949_0_5.zip
2004-09-02 20:19:07 [climateprediction.net] Finished upload of 1qlb_000101949_0_5.zip
2004-09-02 20:19:07 [climateprediction.net] Throughput 27202 bytes/sec
2004-09-02 20:19:07 [climateprediction.net] Finished upload of 1qlb_000101949_0_3.zip
2004-09-02 20:19:07 [climateprediction.net] Throughput 870 bytes/sec
2004-09-02 20:20:01 [---] Insufficient work; requesting more
2004-09-02 20:20:01 [climateprediction.net] Requesting 259191 seconds of work
2004-09-02 20:20:01 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2004-09-02 20:20:06 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2004-09-02 20:20:06 [climateprediction.net] Message from server: No work available (daily quota exceeded)
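
For anyone comparing logs across several machines, the failed results and their exit codes can be pulled out of a saved copy of the client output. This is only a rough sketch: the regular expression is written against the "Unrecoverable error" lines exactly as they appear in the log above, and the helper name failed_results is made up for illustration. Other client versions may word the line differently.

```python
import re

# Pattern for the "Unrecoverable error" line as it appears in the log above.
# Assumption: the wording is the same in the log you are scanning.
EXIT_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(process exited with code (?P<dec>\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def failed_results(log_text):
    """Return {result_name: exit_code} for every failed result in log_text."""
    failures = {}
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match is None:
            continue
        code = int(match.group("dec"))
        # The client prints the code in decimal and in hex; they should agree.
        if code != int(match.group("hex"), 16):
            raise ValueError(f"inconsistent exit code in line: {line!r}")
        # Repeated lines for the same result collapse into one entry.
        failures[match.group("result")] = code
    return failures


sample = """\
2004-09-02 20:19:00 [climateprediction.net] Unrecoverable error for result 1qlb_000101949_0 (process exited with code 251 (0xfb))
2004-09-02 20:19:00 [climateprediction.net] Unrecoverable error for result 1qlb_000101949_0 (process exited with code 251 (0xfb))
2004-09-02 20:19:00 [climateprediction.net] Computation for result 1qlb_000101949 finished
"""

print(failed_results(sample))
```

The duplicated error line in the log is counted once because the result name is used as the dictionary key.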
old_user4563

Joined: 31 Aug 04
Posts: 3
Credit: 55,768
RAC: 0
Message 2797 - Posted: 3 Sep 2004, 5:18:28 UTC

I believe I'm having the same problem. I've loaded BOINC v4.x on three Linux machines: two running Red Hat 9 and one running Yoper 2.1. When I start BOINC it displays that message, but top shows the process as a zombie and the system as idle.
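
A zombie shows up in top with state Z. The same check can be scripted by reading the Linux /proc filesystem, which may be handy when looking at several machines. The sketch below is Linux-only and assumes /proc is mounted; the function names parse_stat and find_zombies are illustrative, not part of BOINC.

```python
import os


def parse_stat(stat_text):
    """Split one /proc/<pid>/stat record into (pid, command, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first "(" and the last ")".
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    command = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, command, state


def find_zombies(proc_root="/proc"):
    """Return a list of (pid, command) for processes in state Z."""
    zombies = []
    if not os.path.isdir(proc_root):
        return zombies
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process vanished or record was unreadable
        if state == "Z":
            zombies.append((pid, command))
    return zombies


if __name__ == "__main__":
    for pid, command in find_zombies():
        print(f"zombie: pid={pid} command={command}")
```

A zombie has already exited and is only waiting for its parent to collect its exit status, which fits the system being idle while the client still shows the startup message.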
Rainer Emrich

Joined: 26 Aug 04
Posts: 3
Credit: 6,367,024
RAC: 0
Message 2852 - Posted: 3 Sep 2004, 12:31:47 UTC

I have had the same problem since today. I tried to attach the project to some additional machines, which have the same configuration as a machine that has been running CPDN since yesterday morning.

In the workunit directory there is a file stderr_um.txt.

stderr_um.txt:
forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
forrtl: info: Fortran error message number is 17.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.

This file is empty on the machine that's working.
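
Because the message catalog ifcore_msg.cat could not be opened, the runtime printed only error numbers. To compare machines, the numbers can be tallied from each stderr_um.txt. This is a minimal sketch: the regular expression matches the "Fortran error message number is N." lines quoted above, the function name fortran_error_counts is made up, and the script does not try to interpret what the numbers mean.

```python
import os
import re
from collections import Counter

# Matches the numeric error lines exactly as quoted from stderr_um.txt above.
ERROR_RE = re.compile(r"forrtl: info: Fortran error message number is (\d+)\.")


def fortran_error_counts(stderr_text):
    """Count how often each Fortran runtime error number appears."""
    return Counter(int(number) for number in ERROR_RE.findall(stderr_text))


sample = """\
forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
forrtl: info: Fortran error message number is 63.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
forrtl: info: Fortran error message number is 17.
forrtl: warning: Could not open message catalog: ifcore_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
"""

for number, count in sorted(fortran_error_counts(sample).items()):
    print(f"error {number}: {count} occurrence(s)")

# The runtime suggests checking NLSPATH, so show what this environment has.
print("NLSPATH =", os.environ.get("NLSPATH", "(not set)"))
```

An empty stderr_um.txt, as on the working machine, gives an empty tally.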


I hope someone on the team is watching the forum, because this is a little strange.

Cheers

Rainer
Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 2856 - Posted: 3 Sep 2004, 12:40:35 UTC
Last modified: 3 Sep 2004, 12:40:59 UTC

Your problems might be related to a Visual Fortran error that's been afflicting the Windows build recently. It seems that some workunits have gone out with a duff file.

Check out this thread on the phpBB forum: http://www.climateprediction.net/board/viewtopic.php?t=2296&p=20006#20006

And thanks to sjokela for doing the investigative work and UK_Nick for providing a link to the file that gives a workaround for the problem :)
