climateprediction.net (CPDN) home page
Thread 'many errors on my computer'

NewtonianRefractor

Joined: 22 May 08
Posts: 49
Credit: 2,335,997
RAC: 0
Message 43772 - Posted: 11 Feb 2012, 19:35:55 UTC

My machine hostid=1170809 seems to error on every HADAM3P model it tries, yet it processes the Coupled Model Full Resolution Ocean without any problem.

Is there some issue with the machine?
Greg van Paassen

Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 43774 - Posted: 11 Feb 2012, 21:37:42 UTC

Hi NR,

try setting some debug options in cc_config.xml. That might help solve the problem.
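
As a rough illustration of what "debug options" could look like, here is a minimal sketch that writes a cc_config.xml with a few extra log flags switched on. The helper name write_debug_config and the choice of flags are assumptions for illustration; which flags are actually useful for this problem is not stated in the thread. The file belongs in the BOINC data directory, and the client needs a restart to pick it up.

```shell
#!/bin/sh
# Sketch only: write a cc_config.xml that turns on extra client logging.
# write_debug_config is a hypothetical helper, not a BOINC command.
write_debug_config() {
    data_dir="$1"                 # BOINC data directory (assumed, e.g. ~/BOINC)
    cfg="$data_dir/cc_config.xml"
    # Do not clobber a config that is already there.
    if [ -e "$cfg" ]; then
        echo "refusing to overwrite existing $cfg" >&2
        return 1
    fi
    cat > "$cfg" <<'EOF'
<cc_config>
  <log_flags>
    <task_debug>1</task_debug>
    <file_xfer_debug>1</file_xfer_debug>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
</cc_config>
EOF
}
```

Usage would be something like `write_debug_config "$HOME/BOINC"`, assuming BOINC was unpacked into the home directory.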

Running boincmgr from a terminal window would allow you to see any boinc error messages. (Tip from GeoPhi.)

Apart from that, try detaching from the project, deleting the HadAM3_xxx binaries, and re-attaching.

HTH.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43778 - Posted: 12 Feb 2012, 0:23:53 UTC - in response to Message 43772.  

Dear Mr. Refractor, based on the error message, it looks like you're running too old a Linux distribution. See this series of posts from the beta site.

Upgrade and prosper.
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 43779 - Posted: 12 Feb 2012, 3:35:51 UTC
Last modified: 12 Feb 2012, 3:40:09 UTC

There is a sticky in the Linux forum about this. Even though hadcm3n appears to complete successfully, the errors seen in stderr on the result page for those models are indicative of trouble zipping up the decadal upload files. I'm not sure that everything is getting up to the server, even though it says it completed successfully. hadam3p has immediate crashes on RedHat/CentOS 5 as you've seen.
NewtonianRefractor

Joined: 22 May 08
Posts: 49
Credit: 2,335,997
RAC: 0
Message 43789 - Posted: 13 Feb 2012, 22:42:44 UTC
Last modified: 13 Feb 2012, 22:48:00 UTC

Sorry for the late reply.

Unfortunately, I don't have admin rights on this machine. It's running Scientific Linux 5.4 (release date November 4, 2009) and is managed by the IT department.

Does this mean that the machine is not producing useful scientific results for this project? If so, then unfortunately I will just have to detach the machine and attach it to some other project.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 43790 - Posted: 13 Feb 2012, 23:03:51 UTC - in response to Message 43789.  

Sometime last year, the department for which our 2 project people work upgraded their Linux compiler.
The new one uses a newer version of one of the library files than is used by some Linux distros, as per the sticky post to which George referred.

So, no, you're NOT producing useful results for the regional models, only for the coupled ocean models, which aren't being offered at the moment.


NewtonianRefractor

Joined: 22 May 08
Posts: 49
Credit: 2,335,997
RAC: 0
Message 43791 - Posted: 13 Feb 2012, 23:26:10 UTC - in response to Message 43790.  

Sometime last year, the department for which our 2 project people work upgraded their Linux compiler.
The new one uses a newer version of one of the library files than is used by some Linux distros, as per the sticky post to which George referred.

So, no, you're NOT producing useful results for the regional models, only for the coupled ocean models, which aren't being offered at the moment.



So the hadcm3n models return valid data even with the errors in stderr?
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 43792 - Posted: 13 Feb 2012, 23:44:39 UTC - in response to Message 43791.  

The one that I looked at said:
Over
Success
Done
so I didn't look at the error codes.

This is getting further into the Linux version of the models than I understand, so I'll pass on a message and see if someone else knows.


Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43793 - Posted: 14 Feb 2012, 1:11:33 UTC
Last modified: 14 Feb 2012, 2:09:41 UTC

It looks like RHEL, CentOS and Scientific are pretty conservative. So while they aren't old distributions, their kernels and associated libraries are already 18 to 24 months old at release. If your machine isn't running too restrictive an SELinux policy, you could try obtaining a copy of libstdc++.so.6 and see if it will run with BOINC after an LD_PRELOAD statement.

Extract libstdc++.so.6.0.14 from this archive (click on the title link) and put it in a directory in your home folder, like /home/user-name/library-for-boinc. Rename it to libstdc++.so.6. Then, in a terminal, run: LD_PRELOAD=/home/$USER/library-for-boinc/libstdc++.so.6 boinc

Edit: added rename instruction.
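
Before preloading anything, it may help to compare which GLIBCXX symbol versions the system library and the extracted copy provide. The following is a minimal sketch under some assumptions: list_glibcxx is a hypothetical helper name, the system path /usr/lib/libstdc++.so.6 is a guess (64-bit systems often use /usr/lib64), and GNU grep is assumed for the -a and -o options. It does not tell you which version the model binaries need, only what each library offers.

```shell
#!/bin/sh
# Sketch only: list the GLIBCXX version tags embedded in a libstdc++ file.
# list_glibcxx is a hypothetical helper, not part of BOINC or the distro.
list_glibcxx() {
    # -a treats the binary as text, -o prints only the matching tags
    grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u
}

# Compare the system copy with the extracted one, if both are readable.
# Both paths are assumptions; adjust them to your machine.
sys_lib=/usr/lib/libstdc++.so.6
new_lib="$HOME/library-for-boinc/libstdc++.so.6"
if [ -r "$sys_lib" ] && [ -r "$new_lib" ]; then
    echo "system library:";    list_glibcxx "$sys_lib"
    echo "extracted library:"; list_glibcxx "$new_lib"
fi
```

If the extracted copy lists version tags that the system copy lacks, that is at least consistent with the "newer library file" explanation given earlier in the thread.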
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43802 - Posted: 14 Feb 2012, 17:20:52 UTC

Okay, I've set up a Scientific 5.4 virtual machine here. I couldn't open the rpm that I linked (maybe has something to do with that compression issue geophi mentioned), so I used this slightly older version for CentOS, which extracted fine with the graphical tool.

I was able to start BOINC with the LD_PRELOAD statement, but downloads have been hit and miss this morning, so I haven't finished downloading a hadam3p yet. Also my floating point and integer benchmarks are crazy, so I either need to install the VirtualBox additions, or the libstdc++.so.6 isn't going to work. I will post more information later.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43805 - Posted: 15 Feb 2012, 14:15:48 UTC

Still waiting for this hadam3p to download due to the server issues. In the meantime, I fixed my benchmark results by adding divider=10 to the guest's kernel parameters.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43807 - Posted: 15 Feb 2012, 17:40:47 UTC
Last modified: 15 Feb 2012, 18:11:11 UTC

It works! Even under SELinux set to enforcing.

The last rpm I linked did not work, so use this one instead (it will extract with the Scientific 5.4 gui tool.)

To summarize: I installed BOINC by running the shell script for version 6.10.58 in my home directory. I think this is the way the OP is set up, since he's not an admin. I extracted libstdc++.so.6.0.10 from the rpm to my desktop. Then in a terminal (replace "user" with your user name):

mkdir library-for-boinc

cp Desktop/libstdc++.so.6.0.10 library-for-boinc/libstdc++.so.6

cd BOINC

LD_PRELOAD=/home/user/library-for-boinc/libstdc++.so.6 nice -n 19 ./boinc &

./boincmgr &

You can adjust the nice setting to your liking, or leave it out altogether. The terminal needs to be left running with BOINC set up this way. I think this can also work with BOINC installed as a service, but the boinc init script would need to have the LD_PRELOAD statement added to the line that starts boinc, and the libstdc++.so.6 has to go someplace where the boinc user can access it (/etc/boinc-client would be good).
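
For anyone who prefers a small launcher over retyping the line, here is a minimal sketch of the same steps wrapped in a shell function. The name start_boinc_with_preload is hypothetical, and the paths in the usage line are just the ones from this post. It is not the init-script change mentioned above, only a convenience for the home-directory setup.

```shell
#!/bin/sh
# Sketch only: start the BOINC client with a newer libstdc++ preloaded.
# start_boinc_with_preload is a hypothetical helper, not a BOINC command.
start_boinc_with_preload() {
    lib="$1"        # the libstdc++.so.6 copied out of the rpm
    boinc_dir="$2"  # directory that contains the boinc binary

    if [ ! -r "$lib" ]; then
        echo "missing library: $lib" >&2
        return 1
    fi
    if [ ! -x "$boinc_dir/boinc" ]; then
        echo "no boinc client in: $boinc_dir" >&2
        return 1
    fi

    # The subshell keeps the cd local; LD_PRELOAD applies to this command only.
    ( cd "$boinc_dir" && LD_PRELOAD="$lib" nice -n 19 ./boinc ) &
}
```

Usage, assuming the layout above: `start_boinc_with_preload "$HOME/library-for-boinc/libstdc++.so.6" "$HOME/BOINC"`. Because LD_PRELOAD is set only for the one command, other applications keep using the installed library.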

Edit: when starting out fresh on this version of Scientific Linux, you need this to get boincmgr working:

su (enter root password)

yum install libXcomposite
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43812 - Posted: 16 Feb 2012, 5:39:29 UTC

Got a trickle (and 100 posts.)
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43837 - Posted: 21 Feb 2012, 21:47:51 UTC

At the risk of turning this thread into a monologue, I'm reporting the successful completion of task 13967737. So in conclusion, one can run CPDN models on RHEL 5.1 to 5.5 (and its CentOS and Scientific derivatives) without changing the installed version of libstdc++.so.6 (other applications will still use it happily.)
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43843 - Posted: 22 Feb 2012, 18:04:16 UTC

One more thing: contrary to my earlier instructions about leaving the terminal open, you can actually close it and even log out of your desktop session and boinc will run fine in the background.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 43844 - Posted: 22 Feb 2012, 18:18:33 UTC - in response to Message 43843.  

Don't think that last bit works if you are running BOINC from somewhere in your ~ directory.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43845 - Posted: 22 Feb 2012, 19:55:18 UTC

No illusions--it really worked. I had my BOINC directory in /home/user and closing the terminal seamlessly moved the parent process to 1. Note you have to start it with the ampersand at the end. I'll have to try this on Ubuntu sometime soon.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 43846 - Posted: 22 Feb 2012, 20:58:21 UTC

I just tried it again and we're both right. You have to type 'exit' in the terminal, which is something I've gotten into the habit of doing for some reason.
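
An alternative that does not depend on how the terminal is closed is the standard nohup utility, which makes the started command ignore the hangup signal. This is a sketch of that different technique, not what was done in the posts above; start_detached is a hypothetical helper name, and the log file is an addition so that client messages are not lost once the terminal is gone.

```shell
#!/bin/sh
# Sketch only: run a command so that it survives the terminal closing,
# with its output kept in a log file. start_detached is a hypothetical helper.
start_detached() {
    log="$1"    # file that receives stdout and stderr
    shift       # the remaining arguments are the command to run
    nohup "$@" > "$log" 2>&1 &
}
```

From inside the BOINC directory this could be used as `start_detached boinc.log env LD_PRELOAD=/home/user/library-for-boinc/libstdc++.so.6 nice -n 19 ./boinc`, and the messages can then be read later with `tail -f boinc.log`.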
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 43852 - Posted: 23 Feb 2012, 9:26:27 UTC - in response to Message 43846.  
Last modified: 23 Feb 2012, 9:27:10 UTC

On Ubuntu, it works after adding the ampersand at the end. It just means I don't get to look at the errors in the terminal, which I may at some point try to sort out. They might hold clues as to why graphics don't work with BOINC7 for me.
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 43855 - Posted: 23 Feb 2012, 14:11:47 UTC

Thanks Belfry. That worked for me on my CentOS 5.7 PC.

I did have to run "/bin/sh" first as it didn't recognize the LD_PRELOAD command from the terminal window with the default shell my user has (tcsh). Was the default shell for your user sh, or bash, or something else?
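
The VAR=value prefix in front of a command is Bourne-shell syntax, which tcsh does not accept. One way around switching shells is to let the external env program set the variable, since that form can be typed the same way in tcsh, sh, or bash. A minimal sketch, using a made-up variable name DEMO_PRELOAD for the harmless demonstration and the path from the earlier post:

```shell
# env is an external program, so this form does not rely on the shell's
# own assignment syntax. Demonstration with a made-up variable name:
env DEMO_PRELOAD=/home/user/library-for-boinc/libstdc++.so.6 \
    sh -c 'echo "child sees: $DEMO_PRELOAD"'

# The real invocation would then look like this (not run here):
# env LD_PRELOAD=/home/user/library-for-boinc/libstdc++.so.6 nice -n 19 ./boinc &
```

In tcsh, `setenv LD_PRELOAD /home/user/library-for-boinc/libstdc++.so.6` would also work, but it then applies to every later command started from that shell, whereas the env form limits it to the one command.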