Just lost a work unit that was over half done...

Questions and Answers : Windows : Just lost a work unit that was over half done...

old_user6698
Joined: 31 Aug 04
Posts: 3
Credit: 9,726
RAC: 0
Message 5911 - Posted: 5 Nov 2004, 9:25:47 UTC

I never did correct that problem that they sent out an email about a couple months ago. My computer was still processing the same work unit I had downloaded in August.

climateprediction.net - 2004-11-04 21:47:56 - Unrecoverable error for result 01u2_500027363_0 ( - exit code -5 (0xfffffffb))
climateprediction.net - 2004-11-04 21:47:56 - Deferring communication with project for 1 minutes and 0 seconds
climateprediction.net - 2004-11-04 21:47:56 - Computation for result 01u2_500027363 finished
LHC@home - 2004-11-04 21:47:56 - Restarting result v64lhc1000proeleven-51s12_14560.03_1_sixvf_42542_2 using sixtrack version 4.47
climateprediction.net - 2004-11-04 21:47:56 - Started upload of 01u2_500027363_0_1.zip
climateprediction.net - 2004-11-04 21:47:56 - Started upload of 01u2_500027363_0_2.zip
climateprediction.net - 2004-11-04 21:48:47 - Finished upload of 01u2_500027363_0_2.zip
climateprediction.net - 2004-11-04 21:48:47 - Throughput 13959 bytes/sec
climateprediction.net - 2004-11-04 21:48:47 - Started upload of 01u2_500027363_0_3.zip
climateprediction.net - 2004-11-04 21:48:49 - Finished upload of 01u2_500027363_0_1.zip
climateprediction.net - 2004-11-04 21:48:49 - Throughput 10544 bytes/sec
climateprediction.net - 2004-11-04 21:48:49 - Started upload of 01u2_500027363_0_4.zip
--- - 2004-11-04 21:48:57 - Insufficient work; requesting more
climateprediction.net - 2004-11-04 21:48:57 - Requesting 138791 seconds of work
climateprediction.net - 2004-11-04 21:48:57 - Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
climateprediction.net - 2004-11-04 21:49:01 - Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
climateprediction.net - 2004-11-04 21:49:53 - Finished upload of 01u2_500027363_0_3.zip
climateprediction.net - 2004-11-04 21:49:53 - Throughput 11832 bytes/sec
climateprediction.net - 2004-11-04 21:49:53 - Started upload of 01u2_500027363_0_5.zip
climateprediction.net - 2004-11-04 21:50:18 - Finished upload of 01u2_500027363_0_5.zip
climateprediction.net - 2004-11-04 21:50:18 - Throughput 11195 bytes/sec
climateprediction.net - 2004-11-04 21:50:18 - Started download of hadsm3_4.04_windows_intelx86.exe
climateprediction.net - 2004-11-04 21:50:39 - Finished upload of 01u2_500027363_0_4.zip
climateprediction.net - 2004-11-04 21:50:39 - Throughput 14089 bytes/sec

And then it downloaded a whole new work unit. Did this happen to anyone else? Was it because I had not yet gone through those steps they emailed me?

BOINC 4.13 / WinXP SP2 / AMD Athlon XP 2100+
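For anyone scanning a similar message log, here is a minimal Python sketch that picks out the "Unrecoverable error" lines. It assumes the log has exactly the `project - timestamp - message` layout pasted above; the function name `find_unrecoverable_errors` is made up for illustration and is not part of BOINC.

```python
import re

# Assumed layout, taken from the log pasted above:
#   <project> - <YYYY-MM-DD HH:MM:SS> - <message>
LINE_RE = re.compile(
    r"^(?P<project>.+?) - (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - (?P<msg>.*)$"
)

# Matches e.g. "Unrecoverable error for result X ( - exit code -5 (0xfffffffb))"
ERROR_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def find_unrecoverable_errors(log_text):
    """Return one dict per 'Unrecoverable error' line found in log_text."""
    errors = []
    for raw_line in log_text.splitlines():
        line_match = LINE_RE.match(raw_line.strip())
        if line_match is None:
            continue  # not a log line in the assumed layout
        error_match = ERROR_RE.search(line_match.group("msg"))
        if error_match is None:
            continue  # ordinary message (upload, download, scheduler request)
        errors.append(
            {
                "project": line_match.group("project"),
                "timestamp": line_match.group("ts"),
                "result": error_match.group("result"),
                "exit_code": int(error_match.group("code")),
                "exit_code_hex": error_match.group("hex"),
            }
        )
    return errors
```

Run over the log above, this should report a single failed result, `01u2_500027363_0`, with exit code -5.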
ID: 5911
Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 5912 - Posted: 5 Nov 2004, 9:52:57 UTC
Last modified: 5 Nov 2004, 9:54:09 UTC

Exit code -5 is a catch-all and may indicate a machine or OS problem. It's definitely not related to the problem you were emailed about, which concerned a spurious error on upload of some WUs when one of the files is reported as too large. So don't beat yourself up about that.
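As an aside on the log line itself: `-5` and `0xfffffffb` are the same value, just printed two ways. The hex form is the 32-bit two's complement representation of -5. A small Python sketch of the conversion (the helper names are illustrative only):

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed two's complement integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the top bit is set, the number is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


def to_unsigned32(value):
    """Return the unsigned 32-bit representation of a (possibly negative) integer."""
    return value & 0xFFFFFFFF


print(to_signed32(0xFFFFFFFB))   # -5
print(hex(to_unsigned32(-5)))    # 0xfffffffb
```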

If your computer is otherwise stable then you can probably put this one down to bad luck. 7 seconds a timestep is slow for an Athlon 2100 though, and you may be running the program with the screensaver and/or graphics window open, which tends to slow things down and gives something else to go wrong. The usual advice is not to use the screensaver, and to open the graphics only when you want to look at them. It's also good practice to suspend the application before closing BOINC, as closing BOINC while the model is still running sometimes causes problems.

Is there anything else that may be causing CPDN to run slowly? Are you running a lot of CPU or memory intensive programs, for example, that may be making your PC struggle a bit? The relevance of all this is that these things may not only give a clue as to what caused the program to fail, but also the longer you take over a WU, the greater the chance of something going wrong in the course of a particular WU.
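To put rough numbers on the "longer run, greater risk" point, here is a back-of-the-envelope Python sketch. It assumes each hour of crunching carries the same small, independent chance of a fatal error, which is a simplification and not something CPDN has measured. The hourly rate used below is invented purely for illustration; only the 7 s and 3.1 s timestep figures and the 400 hrs at 40% figure come from posts in this thread.

```python
def failure_probability(hours, hourly_failure_rate):
    """Chance of at least one fatal error over `hours`, assuming a constant,
    independent per-hour failure rate (a simplifying assumption)."""
    return 1.0 - (1.0 - hourly_failure_rate) ** hours


def relative_runtime(slow_seconds_per_timestep, fast_seconds_per_timestep):
    """How many times longer the same WU takes at the slower timestep rate."""
    return slow_seconds_per_timestep / fast_seconds_per_timestep


# 400 hrs at 40% done (reported in this thread) implies about 1000 hrs per WU.
baseline_hours = 400 / 0.40

# 7 s per timestep versus 3.1 s per timestep, both quoted in this thread.
slowdown = relative_runtime(7.0, 3.1)

# Hypothetical hourly failure rate, chosen only to make the comparison visible.
assumed_rate = 0.0005

for label, hours in [("faster machine", baseline_hours),
                     ("slower machine", baseline_hours * slowdown)]:
    risk = failure_probability(hours, assumed_rate)
    print(f"{label}: {hours:.0f} h, failure risk {risk:.1%}")
```

Whatever rate is assumed, the risk under this model grows with total runtime, which is the argument for closing the graphics window and screensaver.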

It's disheartening to have a WU fail at this stage. You could make a practice of keeping backups of the BOINC folder, but I'm afraid there's nothing to be done about the old WU now.
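For anyone who wants to try routine backups, a minimal Python sketch follows. The function name and both paths are hypothetical, so substitute your own BOINC folder, and exit BOINC completely before copying so the files are not changing underneath the copy. Note that later replies in this thread report that restoring a backup does not help once the failure has already been reported to the server.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_boinc_folder(boinc_dir, backup_root):
    """Copy the whole BOINC folder into a new timestamped folder under backup_root.

    Both arguments are hypothetical example paths supplied by the caller.
    Exit BOINC before calling this.
    """
    source = Path(boinc_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {source}")

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = Path(backup_root) / f"{source.name}-{stamp}"

    # copytree creates the destination (and missing parents) and fails if it
    # already exists, so an earlier backup is never silently overwritten.
    shutil.copytree(source, destination)
    return destination
```

Usage would look something like `backup_boinc_folder(r"C:\Program Files\BOINC", r"D:\boinc-backups")`, with both paths adjusted to your own machine.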
ID: 5912
old_user1154
Joined: 25 Aug 04
Posts: 6
Credit: 205,686
RAC: 0
Message 5914 - Posted: 5 Nov 2004, 12:44:50 UTC - in response to Message 5912.  

I had exactly the same problem last weekend.

It failed at about 40% complete (400 hrs of work gone).
Standard P IV 2 GHz, XP SP2.
So it's not machine or OS related.
ID: 5914
old_user2616
Joined: 29 Aug 04
Posts: 4
Credit: 219,510
RAC: 0
Message 5941 - Posted: 7 Nov 2004, 1:40:02 UTC - in response to Message 5912.  

> Exit code -5 is a catch all and may indicate a machine or OS problem. It's
> definitely not related to the problem you were emailed about, which concerned
> a spurious error on upload of some WUs when one of the files is reported as
> too large. So don't beat yourself up about that.
>
> If your computer is otherwise stable then you can probably put this one down
> to bad luck. 7 seconds a timestep is slow for an Athlon 2100 though, and you
> may be running the program with the screensaver and/or graphics window open,
> which tends to slow things down and gives something else to go wrong. The
> usual advice is not to use the screensaver, and open the graphics only when
> you want to look at it. It's also good practice to suspend the application
> before closing BOINC, as this sometimes gives problems.
>
> Is there anything else that may be causing CPDN to run slowly? Are you running
> a lot of CPU or memory intensive programs, for example, that may be making
> your PC struggle a bit? The relevance of all this is that these things may not
> only give a clue as to what caused the program to fail, but also the longer
> you take over a WU, the greater the chance of something going wrong in the
> course of a particular WU.
>
> It's disheartening to have a WU fail at this stage. You could make a practice
> of keeping backups of the BOINC folder, but I'm afraid there's nothing to be
> done about the old WU now.
>

Well, I have exactly the same problem, except that every WU fails that way. I have already tried several times. My PC runs absolutely stable and doesn't do much other work that uses a lot of CPU. My AMD XP 2600 runs the WU at about 3.1 s per timestep, which I believe is normal. Sorry, but I can't put that down to bad luck; I'd put it down to a bad application. Anyhow, BOINC is still in beta, so we can't really complain. Your answer to the problem sounds more like "try this and that". I am tired of losing WUs, and others are too. I think if they don't find a way to cut the huge WUs into smaller ones (which I read before is pretty difficult), they will lose a lot of people. My PC is running 24/7 and I'm beginning to ask myself what for. I'll get tons of trickles and never finish a work unit. Backing up the BOINC folder doesn't do any good. I tried that, and when I restore it the damn thing ignores everything and downloads a new WU. Hopefully someone comes up with a real solution to the problem!
ID: 5941
Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 5945 - Posted: 7 Nov 2004, 9:28:55 UTC

The situation is very similar to CPDN classic, where some machines consistently run the application without any problems for months on end, and some don't.

It's understandable if people in the latter camp just give up and do something different. There are rarely simple answers where PCs are concerned; they are very complex pieces of hardware, and that's before you get into questions about the BIOS and OS. Nevertheless, experience suggests that consistent problems with CPDN really do come down to problems with the PC. Climate modelling is about as demanding a task as any PC is asked to undertake, but it is successfully undertaken on a wide variety of machines.

That a computer runs other software successfully isn't, though, a sufficient indication that it will cope with CPDN. The advice given by UK_Nick here (http://www.climateprediction.net/board/viewtopic.php?t=2126) is as valid as when it was first compiled.

Not every problem can be laid at BOINC's door.
ID: 5945
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 5946 - Posted: 7 Nov 2004, 15:16:14 UTC - in response to Message 5941.  

It does seem that AMDs are more likely to crash, for whatever reason. It doesn't mean your computer is "bad", just that something in the way this huge software project is put together, across its million lines of code, doesn't agree with a particular user's computer. Macs, on the other hand, seem to be OK, since they're a bit more homogeneous, whereas every damn PC out there is different!

ID: 5946
old_user169
Joined: 5 Aug 04
Posts: 39
Credit: 87,633
RAC: 0
Message 5948 - Posted: 7 Nov 2004, 16:09:04 UTC - in response to Message 5946.  

> .... Mac's on the other hand, seem to be OK, since they're a bit more homogeneous; as opposed to every damn PC out there is different!


Avoids getting bored ;-)
ID: 5948
Jord
Joined: 5 Aug 04
Posts: 250
Credit: 93,274
RAC: 0
Message 6007 - Posted: 10 Nov 2004, 11:51:16 UTC - in response to Message 5912.  

> You could make a practice
> of keeping backups of the BOINC folder, but I'm afraid there's nothing to be
> done about the old WU now.
>
Backing up the BOINC folder won't work because the unit you've been working on has already told the server that it ran into problems, so the server has unregistered this unit for you. As said, when you restore the backup, CPDN (and any other BOINC project whose WU crashed out) will happily ignore this unit and either download a new one or continue crunching the new one.
--------------------------------
Jord™
ID: 6007

©2024 cpdn.org