Fedora 7 Makes Me Mad
Author | Message |
---|---|
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
I had working graphics on Fedora 7. I did some updates a couple months ago via yum, and now starting graphics causes the hadcm3 executable to die! Since I have no idea what package caused this, does someone know what log file or error file I can look at to determine what to do next? Or has anyone experienced this problem? |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
There are various log files that different components of the climate model can generate; look for files starting with stdout_ or stderr_ and ending with a .txt file extension. They are usually in the work unit's own folder. I'm a volunteer and my views are my own. News and Announcements and FAQ |
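As a rough illustration of the advice above, here is a minimal Python sketch that lists the stdout_*.txt and stderr_*.txt files in a work unit folder along with their sizes, so an empty error log is easy to spot. The function name `find_model_logs` is made up for this example, and the folder you pass in is whatever your own work unit directory happens to be.

```python
from pathlib import Path


def find_model_logs(workunit_dir):
    """Return sorted (file name, size in bytes) pairs for the model's log files.

    Only files named stdout_*.txt or stderr_*.txt directly inside
    workunit_dir are considered, matching the naming described in the post.
    """
    folder = Path(workunit_dir)
    logs = []
    for pattern in ("stdout_*.txt", "stderr_*.txt"):
        for path in folder.glob(pattern):
            if path.is_file():
                logs.append((path.name, path.stat().st_size))
    return sorted(logs)
```

Calling `find_model_logs(".")` from inside the work unit folder returns the list; a size of 0 means the file exists but nothing was written to it.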
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"There are various log files that different components of the climate model can generate, look for files starting with stdout_ or stderr_ and ending with a .txt file extension. Usually in the work unit's own folder."
stderr_um.txt file was zero bytes. This was the only thing that looked like an error:
[starfox@localhost hadcm3inct_cmuo_1920_160_35869820]$ tail stdout_um4.txt
2035 points were -ve and the scaling factor has been reset to 1
QT_POS : Mass weighted QT summed over level 17 was negative.
WARNING: QT not conserved
2042 points were -ve and the scaling factor has been reset to 1
QT_POS : Mass weighted QT summed over level 17 was negative.
WARNING: QT not conserved
2154 points were -ve and the scaling factor has been reset to 1
QT_POS : Mass weighted QT summed over level 17 was negative.
WARNING: QT not conserved
2050 points were -ve and the scaling factor has been reset to 1 |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
I don't think that's related; that looks like general output from the climate model rather than an actual crash dump. Are you capturing stdout (i.e., run_client > capture_std_out.txt & or something like that)? That might show more info. Could you provide a link to one of the crashed models so we can see if anything relevant is shown on the website? (Your computers are hidden, so we can't find anything.) -- Edit: Does the same happen on the new version of Slab (in Beta)? (climateapps1.oucx.ox.ac.uk/beta) |
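For anyone who prefers a script to shell redirection, the stdout capture suggested above could be sketched in Python roughly as follows. This is only an illustration: `run_and_capture` is a made-up helper name, and the command you pass in (for example the path to your own client start script) is an assumption about your setup, not something taken from the thread.

```python
import subprocess


def run_and_capture(command, log_path):
    """Run command and write both its stdout and stderr to log_path.

    command is a list such as ["./run_client"] (placeholder path).
    Returns the exit code of the process once it has finished.
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # merge error output into the same file
        )
    return completed.returncode
```

Unlike the backgrounded shell version, this call blocks until the process exits, so it suits a short test run better than a long-running client.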
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
"I don't think that's related, that looks like a general output from the climate model rather than an actual crash dump."
There's nothing indicating a crash in the BOINC log files. Actually, BOINC thinks it's still running after starting graphics. The parent process doesn't die, just the child hadcm3 process. If I exit BOINC, it all terminates OK. But here is a sample of what happens when I have to kill BOINC after graphics hangs (just ignore the SETI tasks):
2007-11-02 11:30:36 [---] Resuming computation
2007-11-02 11:30:36 [climateprediction.net] [task_debug] task_state=EXECUTING for hadcm3inct_cmuo_1920_160_35869820_1 from unsuspend
2007-11-02 11:30:36 [SETI@home] [task_debug] task_state=EXECUTING for 21mr07ai.8748.8661.11.6.5_1 from unsuspend
2007-11-02 11:30:36 [SETI@home] [task_debug] task_state=SUSPENDED for 21mr07ai.8748.8661.11.6.5_1 from suspend
2007-11-02 11:30:52 [---] Resuming network activity
2007-11-02 11:30:53 [SETI@home] [file_xfer] Started upload of file 21mr07ai.19725.9888.3.6.67_1_0
2007-11-02 11:30:56 [SETI@home] [file_xfer] Finished upload of file 21mr07ai.19725.9888.3.6.67_1_0
2007-11-02 11:30:56 [SETI@home] [file_xfer] Throughput 33362 bytes/sec
2007-11-02 11:30:57 [SETI@home] [task_debug] result state=FILES_UPLOADED for 21mr07ai.19725.9888.3.6.67_1 from CS::update_results
2007-11-02 11:31:01 [---] Suspending network activity - time of day
Resuming CPDN!
hadcm3inct_cmuo_1920_160_35869820 - PH 1 TS 3141937 A - 19/02/2042 00:30 - H:M:S=1917:56:18 AVG= 2.20 DLT= 1.00
2007-11-02 11:32:38 [climateprediction.net] [task_debug] result hadcm3inct_cmuo_1920_160_35869820_1 checkpointed
2007-11-02 11:34:31 [---] Suspending computation - user is active
2007-11-02 11:34:31 [climateprediction.net] [task_debug] task_state=SUSPENDED for hadcm3inct_cmuo_1920_160_35869820_1 from suspend
2007-11-02 11:34:42 [---] Exit requested by user
2007-11-02 11:34:47 [climateprediction.net] [task_debug] task_state=ABORTED for hadcm3inct_cmuo_1920_160_35869820_1 from kill_task
2007-11-02 11:34:47 [SETI@home] [task_debug] task_state=ABORTED for 21mr07ai.8748.8661.11.6.5_1 from kill_task
2007-11-02 11:45:45 [---] Starting BOINC client version 5.8.16 for i686-pc-linux-gnu
2007-11-02 11:45:45 [---] log flags: task, file_xfer, sched_ops, task_debug, unparsed_xml, benchmark_debug
2007-11-02 11:45:45 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8d zlib/1.2.3
2007-11-02 11:45:45 [---] Data directory: /usr/local/boinc
2007-11-02 11:45:45 [---] [task_debug] result state=FILES_UPLOADED for 21mr07ai.19725.9888.3.6.67_1 from RESULT::parse_state
2007-11-02 11:45:45 [---] Processor: 2 AuthenticAMD AMD Opteron(tm) Processor 248 HE [Family 15 Model 37 Stepping 1][fpu vme de pse tsc ms
2007-11-02 11:45:45 [---] Memory: 1.96 GB physical, 2.00 GB virtual
2007-11-02 11:45:45 [---] Disk: 9.39 GB total, 5.61 GB free
2007-11-02 11:45:45 [climateprediction.net] URL: http://climateprediction.net/; Computer ID: 531684; location: home; project prefs: defaul
2007-11-02 11:45:45 [rosetta@home] URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 555988; location: home; project prefs: default
2007-11-02 11:45:45 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 3025937; location: home; project prefs: default
2007-11-02 11:45:45 [---] General prefs: from climateprediction.net (last modified 2007-10-15 22:14:10)
2007-11-02 11:45:45 [---] Host location: home
2007-11-02 11:45:45 [---] General prefs: no separate prefs for home; using your defaults
2007-11-02 11:45:45 [---] Suspending network activity - time of day
2007-11-02 12:01:33 [---] [task_debug] ACTIVE_TASK::start(): forked process: pid 11031
2007-11-02 12:01:33 [climateprediction.net] [task_debug] task_state=EXECUTING for hadcm3inct_cmuo_1920_160_35869820_1 from start
2007-11-02 12:01:33 [climateprediction.net] Restarting task hadcm3inct_cmuo_1920_160_35869820_1 using hadcm3i version 541
2007-11-02 12:01:33 [---] [task_debug] ACTIVE_TASK::start(): forked process: pid 11032
2007-11-02 12:01:33 [SETI@home] [task_debug] task_state=EXECUTING for 21mr07ai.8748.8661.11.6.5_1 from start
2007-11-02 12:01:33 [SETI@home] Restarting task 21mr07ai.8748.8661.11.6.5_1 using setiathome_enhanced version 527
Beginning work on result hadcm3inct_cmuo_1920_160_35869820_1...
Starting model in /usr/local/boinc/projects/climateprediction.net...
Created shared memory region key = 177650 of size 655060 bytes (version 602) .so shmem return code = 0
Starting model ID hadcm3inct_cmuo_1920_160_35869820 Phase 1
Getting pthread attributes - retval=0
Setting pthread size (100663296 bytes) - retval=0
Executing program hadcm3transum_5.41_i686-pc-linux-gnu 177650
Program launched with process id # 11038
Climate model starting - use graphics to monitor progress.
Or visit the website to see the graphs for this run.
hadcm3inct_cmuo_1920_160_35869820 - PH 1 TS 3141937 A - 19/02/2042 00:30 - H:M:S=1917:56:18 AVG= 2.20 DLT= 0.00
2007-11-02 12:06:38 [SETI@home] [task_debug] result 21mr07ai.8748.8661.11.6.5_1 checkpointed
2007-11-02 12:11:41 [SETI@home] [task_debug] result 21mr07ai.8748.8661.11.6.5_1 checkpointed
2007-11-02 12:16:40 [SETI@home] [task_debug] result 21mr07ai.8748.8661.11.6.5_1 checkpointed
hadcm3inct_cmuo_1920_160_35869820 - PH 1 TS 3142369 A - 25/02/2042 00:30 - H:M:S=1918:12:19 AVG= 2.20 DLT= 1.00
2007-11-02 12:18:03 [climateprediction.net] [task_debug] result hadcm3inct_cmuo_1920_160_35869820_1 checkpointed
Here's the WU: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6049348
And my specific task: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6567767
If I don't display graphics, it seems to run OK so far. I have not downloaded another slab model since finishing this one a few weeks ago: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6826318 |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
Also, both the slab model and this coupled model (as of a couple months ago) displayed graphics properly. |
Send message Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
Not sure, but perhaps the updates a couple months ago broke some interaction with BOINC (at least the version you have). Perhaps try another yum update, and update to the latest version of BOINC? |
Send message Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370 |
OMG, I got graphics to work! You wouldn't believe what it was. It was file permissions! I did a chmod -R 0775 on the whole project folder and graphics work now! Silly me. Perhaps the system updates had nothing to do with it. PS I wouldn't have found it except that I tried to download a slab model for testing. It errored out with code 22 -- file permissions! Bad news is I lost the new slab model at 0%, but the good news is the coupled model is working with graphics now. The CM should be done by Xmas. |
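Before running a blanket chmod -R, it can help to see which entries actually have restrictive permissions. The Python sketch below walks a folder and reports files and directories whose mode lacks a given set of bits. The thread does not say which bits were missing, so the defaults here (owner read/write for files, owner read/write/execute for directories) are assumptions, and `find_restrictive_entries` is a made-up name.

```python
import os
import stat


def find_restrictive_entries(root, file_bits=0o600, dir_bits=0o700):
    """Return sorted (path, octal mode) pairs for entries missing required bits.

    file_bits and dir_bits are assumed defaults, not values from the thread.
    lstat is used so a broken symlink does not raise an error.
    """
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        entries = [(name, dir_bits) for name in dirnames]
        entries += [(name, file_bits) for name in filenames]
        for name, required in entries:
            path = os.path.join(dirpath, name)
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & required != required:
                flagged.append((path, oct(mode)))
    return sorted(flagged)
```

Pointing it at the project folder (the log in this thread shows /usr/local/boinc/projects/climateprediction.net) gives a list to review. Note that a non-root user cannot descend into directories they are not allowed to read, so those subtrees will show up only as the directory itself.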
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Lucky you discovered that. It's a nuisance running a model without the graphics even though most of us only check up on them from time to time. Cpdn news |
©2024 cpdn.org