climateprediction.net (CPDN) home page
Thread 'Misconfiguration e-mail'

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 44716 - Posted: 15 Aug 2012, 22:37:52 UTC - in response to Message 44715.  

For starters, this is your list of models. As you can see, everything is failing.
The more recent have been for REPLANCA errors, which affected everyone at the time.

As for the other failures, I'm not sure. They seem to be a long way into the model, and the usual cause may not apply.
But this thread is about "the usual suspect". And, as Apple keep tightening the security measures with each OS upgrade, it has to be re-applied each time.
I believe that a fix may be in the testing stage for our models.

Apparently the very latest version of BOINC for Macs has a cure for some problems. I'm not sure what they're up to, as the BOINC site is down for building maintenance, but I think it's 7.0.31 or 32.

Updating to this may help.


Backups: Here
ID: 44716
Peter Hugk
Joined: 1 Oct 11
Posts: 4
Credit: 888,758
RAC: 0
Message 44744 - Posted: 18 Aug 2012, 19:08:00 UTC - in response to Message 44716.  

I'm now on BOINC Manager 7.0.31, and I detached this project and re-attached it. All current jobs were downloaded again and started afresh. At around 12 %, three of the models crashed again. So it looks like there is another problem besides the "standard" one.
ID: 44744
Androidd
Joined: 4 Dec 08
Posts: 27
Credit: 651,211
RAC: 0
Message 44745 - Posted: 18 Aug 2012, 19:10:36 UTC - in response to Message 44674.  

Ok I got the libraries installed per the sticky and have run all updates to make sure there wasn't anything that might have been missed.

Sorry for the multiple posts; I had a noob moment :/


Hope this got everything back up and running

Thanks
ID: 44745
old_user635728
Joined: 9 Oct 10
Posts: 1
Credit: 446,045
RAC: 0
Message 44748 - Posted: 23 Aug 2012, 5:23:09 UTC


Dear tempo
Your computer (host # 1202949) described below appears to have a misconfigured BOINC
installation and is crashing models. Would you please have a look at it?

If you need assistance, please post in this thread on our BOINC forums and we will suggest a way to fix the problem. You may post in any language:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=6880

Please include this link so that we may more easily find your computer:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1202949

When you have applied the fix please post to say so. Until the problem is fixed no more work will be sent to your computer.


Sincerely,
The climateprediction.net team


Our database entry for this computer is below. It is for your information only and you do not need to paste it into your comment on the forum.
ID: 1202949
Created: 24 Feb 2012 15:04:47 UTC
Venue:
Total credit: 0
Average credit: 0
Average update time: 21 Aug 2012 20:48:14 UTC
IP address: xxx.xxx.xxx.xxx (same the last 393 times)
Domain name: zeppelin
Local Time = UTC +0 hours
Number of CPUs: 64
CPU: GenuineIntel Intel(R) Xeon(R) CPU X7550 @ 2.00GHz [Family 6 Model 46 Stepping 6]
FP ops/sec: 2429473851.03699
Int ops/sec: 12495720345.4846
memory bandwidth: 1000000000
Operating System: Linux 2.6.32-279.2.1.el6.x86_64
Memory: 1033790.69 MB
Cache: 18432 KB
Swap Space: 194503.78 MB
Total Disk Space: 121.05 GB
Free Disk Space: 108.51 GB
Avg network bandwidth (upstream): 43180.086151 bytes/sec
Avg network bandwidth (downstream): 517548.264177 bytes/sec
Average turnaround: 0 days
Number of RPCs: 416
Last RPC: 22 Aug 2012 15:00:18 UTC
% of time client on: 98.6515 %
% of time host connected: -100 %
% of time user active: 94.5788 %
# of results today: 0


OK, I'm thinking that this issue may be due to missing 32-bit libraries. Can you confirm?
ID: 44748
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 44749 - Posted: 23 Aug 2012, 7:24:29 UTC - in response to Message 44748.  
Last modified: 23 Aug 2012, 7:31:24 UTC

OK, I'm thinking that this issue may be due to missing 32-bit libraries. Can you confirm?

The error message in stderr for your tasks is:
error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory

This is happening on 6 of your computers (1202949, 1202950, 1202951, 1202953, 1202957 and 1202959) and the solution is linked from this sticky.

It's likely that error is being generated before the applications hit the point where they access the 32-bit libraries (links to that solution in this sticky).

Three of your computers (1202952, 1202955 and 1231875) seem to be running tasks successfully but are failing to run the post-processing phase because libz.so.1 is missing:
Unable to load library hadam3p_eu_se_6.09_i686-pc-linux-gnu.so
dlopen error: libz.so.1: cannot open shared object file: No such file or directory
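
For anyone with several hosts to check, here is a minimal sketch of how the two stderr messages quoted above could be scanned for the name of the missing library. The function name `missing_libraries` and the regular expression are my own illustration, not part of BOINC or the CPDN applications, and the sketch only recognises the two message shapes shown in this post.

```python
import re

# Matches both the dynamic-loader message and the dlopen message quoted above.
_MISSING_LIB = re.compile(
    r"(?:error while loading shared libraries|dlopen error): "
    r"(?P<lib>\S+): cannot open shared object file"
)


def missing_libraries(stderr_text):
    """Return the sorted, de-duplicated library names reported as missing."""
    return sorted({m.group("lib") for m in _MISSING_LIB.finditer(stderr_text)})


# The sample text is copied from the stderr lines quoted in this post.
sample = """\
error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
Unable to load library hadam3p_eu_se_6.09_i686-pc-linux-gnu.so
dlopen error: libz.so.1: cannot open shared object file: No such file or directory
"""
print(missing_libraries(sample))
```

Pasting the stderr text of a failed task into `missing_libraries` gives the list of libraries to look up in the sticky; it says nothing about which package provides them on a given distribution.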

"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 44749
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 44753 - Posted: 27 Aug 2012, 15:32:47 UTC

Windows/AMD laptop: 1177286.
ID: 44753
Peter Hugk
Joined: 1 Oct 11
Posts: 4
Credit: 888,758
RAC: 0
Message 44762 - Posted: 30 Aug 2012, 17:24:52 UTC - in response to Message 44744.  

Meanwhile, I tested running 6 models without shutting down my computer. That worked without an error. After that I returned to shutting down the computer late in the evening, and the "193" error happened again, crashing the models.
Is there any advice you can give me to overcome that problem?
I do not want to leave my computer on all the time.
ID: 44762
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 44763 - Posted: 30 Aug 2012, 18:12:26 UTC

Peter,
Do you first suspend CPDN tasks? Many files are open and simply cutting power typically doesn't allow time (sufficient residual power) to close all files; that results in a crash on restart.
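
As a rough sketch of the "suspend first" step, the snippet below builds a `boinccmd --project URL suspend` call that could be run before powering off. The helper names are hypothetical, and the project URL is deliberately left as a parameter because it must match whatever your own BOINC client lists for CPDN. This assumes `boinccmd` is installed and on the PATH, and it is not a guarantee that every model file is closed by the time the command returns.

```python
import subprocess


def suspend_command(project_url):
    """Build the boinccmd call that suspends all tasks of one project."""
    return ["boinccmd", "--project", project_url, "suspend"]


def suspend_before_shutdown(project_url, run=subprocess.run):
    """Run the suspend command and report whether boinccmd exited cleanly.

    `run` is injectable so the function can be exercised without BOINC.
    """
    completed = run(suspend_command(project_url), check=False)
    return completed.returncode == 0
```

Calling `suspend_before_shutdown(url)` from a shutdown script, then waiting a little before the power goes off, is one way to give the models time to stop; the matching `resume` operation of `boinccmd` restarts them after boot.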
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 44763
Adi
Joined: 25 Feb 05
Posts: 4
Credit: 13,605,157
RAC: 0
Message 44765 - Posted: 30 Aug 2012, 19:20:24 UTC

Dear Adi
Your computer (host # 419870) described below appears to have a misconfigured BOINC
installation and is crashing models. Would you please have a look at it?

If you need assistance, please post in this thread on our BOINC forums and we will suggest a way to fix the problem. You may post in any language:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=6880

Please include this link so that we may more easily find your computer:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=419870

When you have applied the fix please post to say so. Until the problem is fixed no more work will be sent to your computer.



Aaaa... Help? :)


I see that every task this computer has received since 23 Feb 2009 (the oldest one I can see on your site) has some errors.

Maybe it wasn't OK from the beginning? A reset, detach, reattach might help?

A review of my computers shows that many/all of them have errors for some types of CP applications, for example:
UK Met Office HADAM3P European Region v6.09
UK Met Office HADAM3P Southern Africa v6.09
UK Met Office HADAM3P ...

The newest computers have 0 credit for CP, but have credits for other BOINC projects, without errors. All these new ones run CentOS 6 x86_64, and have enough CPU power, RAM and HDD space.

So please review ALL my computers, or at least the ones active in last 30 days, and give me some advice.

Thank you
ID: 44765
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 44766 - Posted: 30 Aug 2012, 20:01:12 UTC

Adi,

All of your tasks are failing with errors like the following (visible by clicking on the '+' on the stderr line of this page):

hadam3p_saf_6.09_i686-pc-linux-gnu: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by hadam3p_saf_6.09_i686-pc-linux-gnu)


That's indicating the model requires a more recent version of that library than you have on your system. You might find this post helpful.
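
To make the version check concrete, here is a small sketch that pulls the required `GLIBCXX_` tag out of the error line above and looks for the same tag among the version strings embedded in a libstdc++ file. The function names are illustrative only, and scanning the raw bytes for `GLIBCXX_x.y.z` strings is a shortcut I am assuming is good enough for a quick check; it is not how the dynamic loader itself resolves versions.

```python
import re


def required_glibcxx(stderr_line):
    """Extract the GLIBCXX version named in a "version ... not found" message."""
    m = re.search(r"version `GLIBCXX_([0-9.]+)' not found", stderr_line)
    return tuple(int(p) for p in m.group(1).split(".")) if m else None


def provided_glibcxx(library_bytes):
    """Collect the GLIBCXX_x.y[.z] version tags embedded in a libstdc++ image."""
    tags = re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", library_bytes)
    return {tuple(int(p) for p in t.decode().split(".")) for t in tags}


def library_is_new_enough(stderr_line, library_bytes):
    """True when the tag the model asks for is present in the library."""
    needed = required_glibcxx(stderr_line)
    return needed is not None and needed in provided_glibcxx(library_bytes)
```

In practice `library_bytes` would come from reading the file named in the error (here `/usr/lib/libstdc++.so.6`) in binary mode; if `library_is_new_enough` returns False, the installed library predates the one the model was built against.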
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 44766
Adi
Joined: 25 Feb 05
Posts: 4
Credit: 13,605,157
RAC: 0
Message 44767 - Posted: 30 Aug 2012, 20:55:45 UTC - in response to Message 44766.  

Thank you for your quick reply.

I'll try to use the solution posted in the forum you pointed at.
I'll post the results.
ID: 44767
Jay Levenson
Joined: 12 Dec 07
Posts: 1
Credit: 1,363,669
RAC: 0
Message 44768 - Posted: 30 Aug 2012, 21:16:19 UTC - in response to Message 39186.  

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=944884
ID: 44768
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 44769 - Posted: 30 Aug 2012, 21:32:51 UTC
Last modified: 30 Aug 2012, 21:47:20 UTC

@Jay Levenson

It looks like a problem with Mac hosts when upgrading BOINC versions. See this sticky in the Mac forum of this site for a solution (detach the host, then reattach to CPDN).
ID: 44769
HoopRat
Joined: 19 May 06
Posts: 1
Credit: 2,222,678
RAC: 485
Message 44771 - Posted: 31 Aug 2012, 1:05:01 UTC

Received the note below... interesting since work is running on my computer. Hmmm!

Anyway, I am posting as per instructions, because I'd like to help in any way I can.

<MV>

---



Dear HoopRat
Your computer (host # 1214304) described below appears to have a misconfigured BOINC
installation and is crashing models. Would you please have a look at it?

If you need assistance, please post in this thread on our BOINC forums and we will suggest a way to fix the problem. You may post in any language:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=6880

Please include this link so that we may more easily find your computer:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1214304

When you have applied the fix please post to say so. Until the problem is fixed no more work will be sent to your computer.


Sincerely,
The climateprediction.net team

ID: 44771
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 44775 - Posted: 31 Aug 2012, 5:56:38 UTC - in response to Message 44771.  

Hooprat,

That Linux PC is crashing many models. The stderr messages on the crashed models have a line about "execv". This is a symptom of a PC with a 64-bit distribution of Linux not having 32-bit compatibility libraries installed. See this sticky in the Linux forum for links on how to install the compatibility libraries.
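
A quick way to see why the compatibility libraries matter is to check the word size of the application file itself. The sketch below reads the ELF header class byte (offset 4, where 1 means a 32-bit object and 2 means 64-bit); the function names are mine, and the idea of comparing it against the host's architecture is an illustration of the symptom rather than a BOINC feature.

```python
def elf_class(header):
    """Return 32 or 64 for an ELF image, or None if the bytes are not ELF."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.
    return {1: 32, 2: 64}.get(header[4])


def needs_compat_libraries(app_header, host_is_64bit):
    """True when a 32-bit application is about to run on a 64-bit host."""
    return host_is_64bit and elf_class(app_header) == 32
```

Feeding it the first few bytes of one of the `i686-pc-linux-gnu` application files from the project directory, together with whether the host reports `x86_64`, shows whether the host is in the situation described above.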
ID: 44775
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 44776 - Posted: 31 Aug 2012, 7:28:14 UTC - in response to Message 44771.  

HoopRat,

The other error reported on your tasks (e.g. this one, errors visible by clicking on the '+' on the stderr line) is:
sched_setscheduler: Operation not permitted

That's indicating that BOINC can't set the project application's scheduling priority to batch (idle) which is very strange as all users should be able to lower the priority.
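
For anyone who wants to reproduce this outside BOINC, here is a small sketch that asks the kernel for the batch policy directly. It uses Python's `os.sched_setscheduler` with `os.SCHED_BATCH`, which exist only on platforms that provide them (Linux in practice). The function name is hypothetical, and I am assuming batch is the policy behind the message, as described above. Note that it really does change the policy of the process that runs it.

```python
import os


def can_switch_to_batch():
    """Try to move this process to SCHED_BATCH.

    Returns True on success, False if the OS refuses the request,
    and None when the platform does not expose the call at all.
    """
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_BATCH"):
        return None
    try:
        # pid 0 means "the calling process"; batch scheduling uses priority 0.
        os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
        return True
    except OSError:
        # Includes PermissionError, the Python form of "Operation not permitted".
        return False
```

Running it as the same user account that runs the BOINC client is the interesting case: a False result there would point at a restriction on that account or its environment rather than at the model.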
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 44776
Mike
Joined: 10 May 07
Posts: 2
Credit: 1,586,935
RAC: 0
Message 44777 - Posted: 31 Aug 2012, 9:14:59 UTC

Dear Mike
Your computer (host # 1230696) described below appears to have a misconfigured BOINC
installation and is crashing models. Would you please have a look at it?

If you need assistance, please post in this thread on our BOINC forums and we will suggest a way to fix the problem. You may post in any language:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=6880

Please include this link so that we may more easily find your computer:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1230696

When you have applied the fix please post to say so. Until the problem is fixed no more work will be sent to your computer.

I've tried installing 32-bit libraries; was that the issue?
ID: 44777
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 44779 - Posted: 31 Aug 2012, 9:37:08 UTC - in response to Message 44777.  
Last modified: 31 Aug 2012, 9:39:17 UTC

I've tried installing 32-bit libraries; was that the issue?

32-bit libraries might be a problem, Mike, but the stderr messages for your failed tasks show that the project applications are failing to find the libstdc++.so.6 library first.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 44779
Mike
Joined: 10 May 07
Posts: 2
Credit: 1,586,935
RAC: 0
Message 44780 - Posted: 31 Aug 2012, 9:51:34 UTC - in response to Message 44779.  

Ok, I've installed the libstdc++.so.6.0.13 library, I hope it'll work.
ID: 44780
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 44781 - Posted: 31 Aug 2012, 10:17:43 UTC - in response to Message 44780.  
Last modified: 31 Aug 2012, 12:29:22 UTC

Passed up to the project team for re-enabling of work fetch, Mike.

Edit: Andy has now done that.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 44781

©2024 cpdn.org