March is a Lost Month
Author | Message |
---|---|
Joined: 26 Aug 04 Posts: 13 Credit: 458,996 RAC: 0 |
Since March 5th, every model run I've downloaded (7 and counting) has failed before the end of phase 1, usually around credit 1134.21, with a Client Error. This is on 2 different Linux hosts, each running model 4.11. Before the jump from 4.04 to 4.10 to 4.11, almost all my model runs finished normally; now I haven't had one finish in almost a month. Is this a misconfig, or are the models just buggy? |
Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11. |
Joined: 26 Aug 04 Posts: 13 Credit: 458,996 RAC: 0 |
Quoted: "This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11." I figured as much. Looking at my machines, I have 2 more model runs that just started up on 4.11; I'm sure they will die soon enough. Does anyone know how to move a work unit from one machine to another? I have a slow machine that has almost a year's worth of 4.04 models yet to do on it, which I would like to move to faster machines that are starving for work. But I don't want to tar up the whole directory, since that would replace the host's identity. I'd just like to copy the proper client_state.xml and whatever else is needed so the faster machines will pick up the copied model as if it had originally been given to them. I've tried copying selected files before, but BOINC doesn't seem to like my changes and discards them. |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,193,804 RAC: 2,852 |
Quoted: "This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11." I am running 4.13 and most of them crash too. I have not gotten to phase 2 in I do not remember how long. |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,193,804 RAC: 2,852 |
Quoted: "This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11." Quoted: "I am running 4.13 and most of them crash too. I have not gotten to phase 2 in I do not remember how long." P.S.: It mainly complains "No Heartbeat in 31 seconds..." |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,193,804 RAC: 2,852 |
Quoted: "This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11." Quoted: "I am running 4.13 and most of them crash too. I have not gotten to phase 2 in I do not remember how long." This BOINC is really frustrating. It has been running 3 instances of climateprediction on my machine for the last several days, yet there have been no trickles at all in about 5 days. Furthermore, this machine has 2 hyperthreaded 3.06GHz Xeon processors, so it should be running four applications most of the time. It seldom does, although once in a while it runs five, which it should not. So the BOINC client is, IMAO, defective for one thing. And I guess the 4.13 application, or its data, is bad too, since I never get out of Phase 1 anymore. I used to. |
Joined: 17 Aug 04 Posts: 753 Credit: 9,804,700 RAC: 0 |
Quoted: "It has been running 3 instances of climateprediction on my machine for the last several days, yet no trickles at all in about 5 days." Nobody has had any trickles credited since 18 April. This is a server fault, and should not affect uploads or result in data being lost. So unless there are messages indicating a problem communicating with the server, you can ignore this particular problem for now. |
Joined: 5 Aug 04 Posts: 1120 Credit: 17,193,804 RAC: 2,852 |
Quoted: "This is a problem with the Linux CPDN client. See other recent posts to this forum for more complaints about stability of 4.11." Quoted: "I am running 4.13 and most of them crash too. I have not gotten to phase 2 in I do not remember how long." Quoted: "P.S.: Mainly complains No Heartbeat in 31 seconds..." I am still getting mostly failures; here is a typical one: Server state: Over; Outcome: Client error; Client state: Computing; Exit status: 251 (0xfb); Host ID: 45631; Report deadline: 16 Mar 2006 17:41:39 UTC; CPU time: 370877.60; stderr out: 4.19 process exited with code 251 (0xfb) 1 0 No heartbeat from core client for 31 sec - exiting No heartbeat from core client for 31 sec - exiting. I am running a Dual Hyperthreaded Xeon system with 4 GBytes RAM that is up 24/7. Does everyone experience this, or should I do something? If so, what? |
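
For anyone comparing failures like the one quoted above, here is a minimal Python sketch that pulls the exit status, CPU time, and heartbeat messages out of such a report. It is only an illustration: the `REPORT` string is copied from the post, while the function name `summarize_report` and the field patterns are assumptions made for this example, not part of BOINC or the CPDN client.

```python
import re

# Field values copied from the task report quoted in the post above.
REPORT = (
    "Server state Over Outcome Client error Client state Computing "
    "Exit status 251 (0xfb) Host ID 45631 "
    "Report deadline 16 Mar 2006 17:41:39 UTC CPU time 370877.60 "
    "stderr out 4.19 process exited with code 251 (0xfb) 1 0 "
    "No heartbeat from core client for 31 sec - exiting "
    "No heartbeat from core client for 31 sec - exiting"
)


def summarize_report(text):
    """Hypothetical helper: extract a few fields from a flattened task report."""
    exit_match = re.search(r"Exit status (\d+) \(0x([0-9a-fA-F]+)\)", text)
    cpu_match = re.search(r"CPU time (\d+(?:\.\d+)?)", text)
    if exit_match is None or cpu_match is None:
        raise ValueError("report text is missing an expected field")

    exit_code = int(exit_match.group(1))
    exit_hex = int(exit_match.group(2), 16)
    cpu_seconds = float(cpu_match.group(1))

    return {
        "exit_code": exit_code,
        # Sanity check that the decimal and hex forms agree (251 == 0xfb).
        "hex_matches": exit_code == exit_hex,
        # CPU time is reported in seconds; 86400 seconds per day.
        "cpu_days": cpu_seconds / 86400,
        # Number of "no heartbeat" lines in the stderr portion.
        "heartbeat_exits": text.count("No heartbeat from core client"),
    }


summary = summarize_report(REPORT)
print(summary)
```

Run on the report above, this puts the CPU time at roughly 4.3 days for that one failed run, and it counts two "No heartbeat" lines in the stderr output.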
©2024 cpdn.org