climateprediction.net home page
'crashing' all projects

Profile KWSN - Sir Grawlfang

Joined: 16 Sep 04
Posts: 2
Credit: 765,509
RAC: 0
Message 11686 - Posted: 7 Apr 2005, 11:32:52 UTC

Hi,

A couple of times today I've experienced the following problem:

2521_300120880 - PH 1 TS 000050 - 02/12/1810 01:00 - H:M:S=0000:03:27 AVG= 4.16 DLT=14.96
2521_300120880 - PH 1 TS 000051 - 02/12/1810 01:30 - H:M:S=0000:03:29 AVG= 4.11 DLT= 2.00
2521_300120880 - PH 1 TS 000052 - 02/12/1810 02:00 - H:M:S=0000:03:31 AVG= 4.06 DLT= 1.50
2521_300120880 - PH 1 TS 000053 - 02/12/1810 02:30 - H:M:S=0000:03:33 AVG= 4.02 DLT= 2.00
CPDN Monitor got quit request...
Detaching shared memory...

When I cancel and then restart BOINC (4.19), it seems to have lost the work files for all of my projects and starts reloading from scratch:

2005-04-07 12:43:11 [---] General prefs: from ProteinPredictorAtHome (last modified 2005-02-18 23:38:27)
2005-04-07 12:43:11 [---] General prefs: using separate prefs for home
2005-04-07 12:43:11 [ProteinPredictorAtHome] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [ProteinPredictorAtHome] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [ProteinPredictorAtHome] Unrecoverable error for result t0227E_1_91584_0 (One or more missing files)
2005-04-07 12:43:11 [ProteinPredictorAtHome] Unrecoverable error for result t0227E_1_91584_0 (One or more missing files)
2005-04-07 12:43:11 [Einstein@Home] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [Einstein@Home] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [Einstein@Home] Unrecoverable error for result H1_1408.4__1408.8_0.1_T25_Test02_1 (One or more missing files)
2005-04-07 12:43:11 [Einstein@Home] Unrecoverable error for result H1_1408.4__1408.8_0.1_T25_Test02_1 (One or more missing files)
2005-04-07 12:43:11 [climateprediction.net] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [climateprediction.net] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [climateprediction.net] Unrecoverable error for result 1buu_000082663_1 (One or more missing files)
2005-04-07 12:43:11 [climateprediction.net] Unrecoverable error for result 1buu_000082663_1 (One or more missing files)

This resets all of my projects (that's einstein, predictor and climateprediction).
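For anyone wanting to see at a glance which results were lost, here is a minimal sketch that pulls the "Unrecoverable error" lines out of a log like the one above and groups them by project. It assumes the line layout is exactly as pasted (timestamp, project in square brackets, message); `summarize_errors` and `SAMPLE_LOG` are names made up for this example and are not part of BOINC.

```python
import re
from collections import defaultdict

# Matches lines of the form shown in the log above (layout assumed from the paste):
# 2005-04-07 12:43:11 [Project] Unrecoverable error for result NAME (reason)
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\((?P<reason>[^)]*)\)"
)

def summarize_errors(lines):
    """Return {project: sorted list of unique result names that failed}."""
    failed = defaultdict(set)
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            # A set drops the repeated lines seen in the pasted log.
            failed[match.group("project")].add(match.group("result"))
    return {project: sorted(results) for project, results in failed.items()}

# A few lines copied from the log in this post.
SAMPLE_LOG = """\
2005-04-07 12:43:11 [ProteinPredictorAtHome] ACTIVE_TASKS::restart_tasks(); missing files
2005-04-07 12:43:11 [ProteinPredictorAtHome] Unrecoverable error for result t0227E_1_91584_0 (One or more missing files)
2005-04-07 12:43:11 [ProteinPredictorAtHome] Unrecoverable error for result t0227E_1_91584_0 (One or more missing files)
2005-04-07 12:43:11 [Einstein@Home] Unrecoverable error for result H1_1408.4__1408.8_0.1_T25_Test02_1 (One or more missing files)
2005-04-07 12:43:11 [climateprediction.net] Unrecoverable error for result 1buu_000082663_1 (One or more missing files)
"""

if __name__ == "__main__":
    for project, results in summarize_errors(SAMPLE_LOG.splitlines()).items():
        print(project, results)
```

To run it against a real log, pass the lines of the saved client output to `summarize_errors` instead of `SAMPLE_LOG`.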

Has anyone else experienced such a problem with this combination of BOINC projects (or any other)?
BOINC was working just fine up until I attached to climateprediction (and predictor) last night.

My system, for info, is Slackware Linux 10, running a 2.4.26 kernel.

regards,
Mark



Profile old_user248

Joined: 6 Aug 04
Posts: 65
Credit: 1,605,224
RAC: 0
Message 11720 - Posted: 9 Apr 2005, 8:18:09 UTC

Are you still having this problem?

From what I understand from posts in the other forum, the Hadsm3*_4.11 model had some stability problems, and a new build, 4.13, became available earlier this week.

Do you have "Leave applications in memory while preempted?" on the profiles page set to "yes"? This can have some effect on the stability of BOINC in the M$ environment; I'm not 100% sure of the effect on Linux.

I take it that you have already checked the hardware stability aspects: is any overclock still 100% stable, are the hard drives OK, have you run the Prime95 torture test, etc.? CPDN seems to stress PCs more than most of the other projects, and because of the longer timeframe and the interdependency of calculations it is very susceptible to calculation errors.

I am wondering about the "CPDN Monitor got quit request..." statement. I don't know much about CPDN / BOINC errors, so I don't know where this originates; perhaps someone else does and can offer further guidance.

Dave