Multiple errors with new BOINC 4.09 - Lost Work Units

old_user3885
Joined: 31 Aug 04
Posts: 1
Credit: 810,979
RAC: 0
Message 4960 - Posted: 2 Oct 2004, 21:07:51 UTC

Hello. I am running Windows XP SP2 on a Pentium 4 2.4 GHz with 512 MB of RAM and Norton Antivirus. I am running BOINC 4.09 with SETI and ClimatePrediction at 50% each, set to run only when I am not using the system.

I repeatedly see an error message on my system when I get up in the morning. I have DSL, so I am online all the time. I also run Norton Antivirus 2004 overnight.

The following error appears on screen, in front of the BOINC window, which is still running:

hadsm3_4.04_windows_intelx86.exe. Application error - The instruction at "0x7c910f29" referenced memory at "0x00000000". Memory could not be "read".

Also, all of the work I had done in Climate Prediction has been lost, and the model has reset to the beginning.

Here is my log:

climateprediction.net - 2004-10-02 03:56:20 - Restarting result 3101_000162689_1 using hadsm3 version 4.04
SETI@home - 2004-10-02 03:56:20 - Pausing result 01mr04ab.2990.9842.254826.68_2 (removed from memory)
climateprediction.net - 2004-10-02 04:56:20 - Pausing result 3101_000162689_1 (removed from memory)
SETI@home - 2004-10-02 04:56:24 - Restarting result 01mr04ab.2990.9842.254826.68_2 using setiathome version 4.03
SETI@home - 2004-10-02 05:56:24 - Pausing result 01mr04ab.2990.9842.254826.68_2 (removed from memory)
--- - 2004-10-02 09:10:03 - Suspending computation and network activity - user is active
climateprediction.net - 2004-10-02 09:10:09 - Unrecoverable error for result 3101_000162689_1 ( - exit code -1073741819 (0xc0000005))
climateprediction.net - 2004-10-02 09:10:09 - Deferring communication with project for 1 minutes and 0 seconds
--- - 2004-10-02 09:11:10 - Insufficient work; requesting more
climateprediction.net - 2004-10-02 09:11:10 - Requesting 132266 seconds of work
climateprediction.net - 2004-10-02 09:11:10 - Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
climateprediction.net - 2004-10-02 09:11:26 - Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
--- - 2004-10-02 09:25:31 - Resuming computation and network activity
climateprediction.net - 2004-10-02 09:25:31 - Computation for result èÏ¢ finished


What can I do? Is there a fix for this???

Kelly
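
A note on the exit code in the log above: BOINC reports it as a signed 32-bit integer. Reinterpreting -1073741819 as unsigned gives 0xC0000005, the Windows NTSTATUS value for an access violation, which is consistent with the "memory could not be read" dialog. A minimal Python sketch of that conversion (the function name is purely illustrative and not part of BOINC):

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"

# Windows NTSTATUS value for an access violation (invalid memory read/write).
STATUS_ACCESS_VIOLATION = 0xC0000005

code = -1073741819  # exit code reported in the log above
print(exit_code_to_hex(code))
print((code & 0xFFFFFFFF) == STATUS_ACCESS_VIOLATION)
```

The code tells you that the model process touched invalid memory, but not why it did so.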

ID: 4960

old_user2491
Joined: 28 Aug 04
Posts: 117
Credit: 21,096
RAC: 0
Message 4966 - Posted: 2 Oct 2004, 22:35:50 UTC

All I can say is that I never run Norton while another program is running; that is just asking for problems...


ID: 4966

old_user18746
Joined: 17 Sep 04
Posts: 25
Credit: 196,284
RAC: 0
Message 4986 - Posted: 3 Oct 2004, 10:29:26 UTC
Last modified: 3 Oct 2004, 17:13:37 UTC

I have exactly the same problem, and I'm not using Norton at all, so I don't think it's related to that. My configuration is as follows:

Compaq Evo 620c notebook with Intel Centrino 1.4 GHz & 768MB RAM
Windows XP SP1
BOINC client version: 4.09
Running ClimatePrediction (45.45%), LHC@home (27.27%), SETI@home (27.27%)
Do work while computer is running on batteries? No
Do work while computer is in use? Yes
Do work only between the hours of: No restriction
Leave applications in memory while preempted? No

This problem started after I joined SETI & LHC. Before that I was running only ClimatePrediction, and encountered no problems at all. The model ran just fine up to 20% until I joined the other two projects. Perhaps this may be a hint...

Best regards,


Ertugrul.
ID: 4986

old_user18746
Joined: 17 Sep 04
Posts: 25
Credit: 196,284
RAC: 0
Message 5001 - Posted: 3 Oct 2004, 17:27:30 UTC
Last modified: 10 Oct 2004, 15:34:35 UTC

One more detail: I had upgraded from 4.05 to 4.09.

To determine the exact cause of this problem, I uninstalled BOINC, deleted the directory, and reinstalled 4.09 from scratch. After that I attached to CP and got some work. Now I'm running only CP (0.1% so far), and I will wait until my CP WU is finished, if it finishes at all without a problem, of course... After that I will attach to the other projects to see whether the same problem occurs.

I will keep the forum updated about my progress.

Cheers,


Ertugrul.
ID: 5001

old_user355
Joined: 7 Aug 04
Posts: 187
Credit: 44,163
RAC: 0
Message 5016 - Posted: 3 Oct 2004, 23:46:46 UTC - in response to Message 4966.  

> All I can say is that, I never run Norton when another program is running,
> that is just asking for problems.....

I run Norton while BOINC is running. Never had a problem.

ID: 5016

old_user13493
Joined: 6 Sep 04
Posts: 1
Credit: 190,217
RAC: 0
Message 5020 - Posted: 4 Oct 2004, 3:39:10 UTC

I am having a similar problem that happens while I sleep. I wake up to find a "memory could not be read" error. This has caused my model to crash several times. I have yet to get a model past 20% :/

I am running:
P4 2.4 GHz, 1 GB DDR RAM
Windows XP SP2
BOINC Client Version: 4.09
25% SETI, 25% Climateprediction.net, 25% LHC, 25% Pirates

I have also noticed this error on my laptop, so I think there may be a bug when running as a screensaver.
Some Words Of Wisdom: If someone borrows $20 from you and you don't see them again, it was probably worth it.
ID: 5020

Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 5032 - Posted: 4 Oct 2004, 11:22:34 UTC
Last modified: 4 Oct 2004, 11:31:11 UTC

I'd guess that the problem (and solution) lies in the "pausing ... (removed from memory)" messages in Kelly's post. BOINC isn't very forgiving in the way it preempts projects at the moment, and projects can be swapped out in an unstable state. In the case of CPDN this can result in a model crash when the project next gets scheduled.

The only way to get round the problem is to change "Leave applications in memory while preempted?" in your general preferences to "yes". I believe that this option is not available for all projects. If you save general preferences on a project without the option, the setting will revert to its default of "no".
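
To check whether your own log fits this pattern, one rough approach is to look for results that were paused with "(removed from memory)" and later reported an unrecoverable error. The Python sketch below assumes the message wording shown in Kelly's log; the function name and the sample list are illustrative only, and a match shows a correlation, not proof of the cause.

```python
import re

# Message patterns taken from the log excerpt earlier in this thread.
PAUSED = re.compile(r"Pausing result (\S+) \(removed from memory\)")
FAILED = re.compile(r"Unrecoverable error for result (\S+)")

def crashed_after_removal(log_lines):
    """Return results that failed after having been removed from memory."""
    removed = set()
    crashed = []
    for line in log_lines:
        m = PAUSED.search(line)
        if m:
            removed.add(m.group(1))
            continue
        m = FAILED.search(line)
        if m and m.group(1) in removed:
            crashed.append(m.group(1))
    return crashed

sample = [
    "climateprediction.net - 2004-10-02 04:56:20 - Pausing result 3101_000162689_1 (removed from memory)",
    "SETI@home - 2004-10-02 05:56:24 - Pausing result 01mr04ab.2990.9842.254826.68_2 (removed from memory)",
    "climateprediction.net - 2004-10-02 09:10:09 - Unrecoverable error for result 3101_000162689_1 ( - exit code -1073741819 (0xc0000005))",
]
print(crashed_after_removal(sample))
```

To run it against a real log, pass the lines of your saved message log in place of `sample`.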
ID: 5032

old_user18746
Joined: 17 Sep 04
Posts: 25
Credit: 196,284
RAC: 0
Message 5050 - Posted: 4 Oct 2004, 16:40:09 UTC - in response to Message 5032.  

> I'd guess that the problem (and solution) lies in the "pausing ... (removed
> from memory)" messages in Kelly's post. BOINC isn't very forgiving in the way
> it preempts projects at the moment, and projects can be swapped out in an
> unstable state. In the case of CPDN this can result in a model crash when the
> project next gets scheduled.
>

Something happened today that supports the above hypothesis: CP was running on its own, without any other projects, and was nearly 1% through the WU when I disconnected my notebook from the mains to attend a meeting. My client was set not to leave applications in memory while preempted and not to run on batteries, so it tried to suspend the running WU and remove it from memory, and guess what: CP crashed again with the same error message!

> The only way to get round the problem is to change "Leave applications in
> memory while preempted?" in your general preferences to "yes".

No other projects, installed from scratch, so this must be it! Now I'm running another WU with "Leave applications in memory while preempted?" set to "yes". Let's see what happens...

> I believe that this option is not available for all projects. If you save
> general preferences on a project without the option the setting will revert to
> its default of "no".
>

Quite reasonable... SAH and CP both have the option. I cannot confirm it for LHC; it's temporarily shut down.

Cheers,


Ertugrul.
ID: 5050

old_user18746
Joined: 17 Sep 04
Posts: 25
Credit: 196,284
RAC: 0
Message 5184 - Posted: 10 Oct 2004, 15:33:40 UTC - in response to Message 5050.  

Unfortunately, I got the same error today when the WU was 15% through, lost all the work, and had to start another CP WU all over again! It had happened numerous times before, too, especially during shutdown, but somehow CP managed to recover its state and resume from where it left off. But today it crashed and lost all of the work just after resuming from standby.

It seems that BOINC and/or CP cannot handle memory allocation/deallocation properly on laptops, especially when restarting, shutting down, resuming from standby, and removing the WU from memory when preemption is off.

I strongly suggest that you guys at BOINC and/or CP work on this problem ASAP, or else it might deter people from using CP on BOINC, including me! I see a lot of posts about this problem, and if I had dedicated my CPU cycles to other projects, I would have earned many many more credits from them in the meantime...
ID: 5184

old_user18746
Joined: 17 Sep 04
Posts: 25
Credit: 196,284
RAC: 0
Message 5471 - Posted: 19 Oct 2004, 6:38:15 UTC - in response to Message 5184.  

I upgraded to BOINC v4.13 last week, and although I sometimes get the same error during shutdown, I'm 12% through the WU and I haven't lost any work since I upgraded, surviving numerous reboots, shutdowns and standbys.

I think this issue has been addressed in 4.13. Or should I wait until my WU is finished to be 100% sure? :-)

I will keep the forum updated about my progress.

Cheers,


Ertugrul.
ID: 5471