App crashed when opening "Show graphics"
Author | Message |
---|---|
Joined: 31 Jan 05 Posts: 5 Credit: 85,146 RAC: 0 |
Which shouldn't be a problem; I'd just restart the whole thing. But when I did that, the project ended with 40 hours left out of more than 800. The app then started downloading new zip files, and the project says it is starting a new project, which it hasn't actually done yet since the progress won't leave 0%. Can I finish the old project, or should I just let it continue with the new one? Rather annoying, since there was only about 6% left to compute. |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
Do you have a backup of the BOINC folder from before the crash? If so, copying it back should work. Visit BOINC WIKI for help, and join BOINC Synergy for all the news in one place. |
Joined: 31 Jan 05 Posts: 5 Credit: 85,146 RAC: 0 |
Hrrmmff... no, I don't, but I will start doing that from today. Still, the folders and files in the dataout folder of the interrupted project seem OK. Since they're divided into a number of files, shouldn't it be possible to restart from the latest file that seems OK? And if so, where can I find info on doing that? |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
> "Hrrmmff... no, I don't, but I will start doing that from today. Still, the folders and files in the dataout folder of the interrupted project seem OK. Since they're divided into a number of files, shouldn't it be possible to restart from the latest file that seems OK? And if so, where can I find info on doing that?" Sorry, it isn't possible. Most files you see are averages and don't contain the detail. The restart files contain most of the detail and may be OK, but unfortunately there is also info in the client_state.xml file which is needed and is now lost. BTW, the old unit is still of some use to the scientists. |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
> "unfortunately there is also info in the client_state.xml file which is also needed and is now lost." There is an outside chance that it's not too late to recover the job. If you can find the string 45dv_000215544 in 8 different sections of your client_state.xml or client_state_prev.xml file, take a copy of the relevant file, post another message, and I'll try to help you out. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Joined: 31 Jan 05 Posts: 5 Credit: 85,146 RAC: 0 |
OK, I understand... and as you said, it's lost, since I couldn't find the string. Thank you for your time. I have now scheduled a nightly backup of the BOINC folder. ;-) Cheers |
Joined: 23 Feb 05 Posts: 55 Credit: 240,119 RAC: 0 |
According to the Workunit page, this model has received "Over" status and has been given to a new user to compute: 516610 99588 31 Jan 2005 11:32:42 UTC 9 Mar 2005 9:33:23 UTC Over Client error Computing 2756732.51 6332.67; 618533 27407 12 Mar 2005 1:44:21 UTC 12 Mar 2005 2:45:36 UTC Over Client error Computing 0.00 0.00. Your 760 hours of CPU time are rendered useless as soon as another user finishes this particular model run. I believe that this could have been avoided if the BOINC client was reliable! Making a backup once a day is, for now, the best practice to make up for the client's limited ability to run the CPDN models. Another possibility is to make copies of the client_state(_prev).xml files as soon as you encounter some kind of problem. Then copy client_state_prev.xml over the original client_state.xml file and restart the machine. If you are lucky, the model will continue when the client restarts. If it does not, revert to the backup or consider your work lost! N.B. If I read between the lines of the Project Statistics, then I estimate that roughly half of the assigned credit is wasted on incomplete models. It is my assumption that a great deal of these wasted resources is due to these client errors. A bit off topic, maybe, but I wonder what the equivalent value is in cups of tea which were heated up and then left untouched to cool down to ambient conditions again? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
> "Your 760 hours of CPU time are rendered useless as soon as another user finishes this particular model run." Not necessarily. The scientists like to compare the same model run on different processors and operating systems to check for consistency, and two thirds of a run to compare is better than none. > "I believe that this could have been avoided if the BOINC client was reliable!" Surprisingly, most of the problems lie with people's computers. Lots have said, "My computer is brand new, so it must be OK," or similar. But when they have performed the checks and tests recommended to them, it often turns out to be a heat problem, which can be caused by MANY things. And lots of people have NO trouble completing models, on different processors and operating systems. Myself included. As for wasted credit, credit doesn't count here. Only trickles and completed models do. What gets wasted a lot are parameter sets, and a lot of this is due to people running their machines automatically, without checking to see if the program is producing results. > "A bit off topic, maybe, but I wonder what the equivalent value is in cups of tea which were heated up and then left untouched to cool down to ambient conditions again?" I agree with this. Good example. Mine is that it's like a huge jigsaw puzzle with lots of the pieces missing, probably permanently. Les |
Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0 |
> "N.B. If I read between the lines of the Project Statistics, then I estimate that roughly half of the assigned credit is wasted on incomplete models. It is my assumption that a great deal of these wasted resources is due to these client errors." I agree with what Les has said. My estimate is that 76% of model years eventually end up in completed runs. For the classic client it was only 72%, so BOINC is more stable than the non-BOINC version. I think we will see further improvement in stability as better techniques get tried out in alpha testing of the sulphur cycle model (and pre-alpha testing of the coupled model?) and are then brought back to the public release. There should also be some improvement as people get more used to the intensive work of CP. However, I do not see a vast improvement on the 76% being possible, as a lot of this relates to the stability of the computers being used. |
Joined: 23 Feb 05 Posts: 55 Credit: 240,119 RAC: 0 |
> "My estimate is that 76% of model years eventually end up in completed runs. For the classic client it was only 72%, so BOINC is more stable than the non-BOINC version." Very unscientific to draw conclusions based on estimates. If you think that the BOINC client is so stable, then support your claim with calculations on the BOINC stats. As a gesture I'm willing to move my estimate from roughly half to 60+%, but 76% seems to me wishful thinking. But then, even if it was 76%: if only 76% of airplanes made it to the other side of the Atlantic, you would not see me flying across! Even if you told me that the airplanes were perfectly safe. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
crandles is talking about the pre-BOINC CP results as compared with BOINC CP results. This was discussed last year, in the classic forum, which is still dead. Les |
Joined: 31 Jan 05 Posts: 5 Credit: 85,146 RAC: 0 |
Just a thought... Why in Saskatchewan doesn't the client save the client_state.xml file under a unique name with date and time, instead of overwriting the two versions, current and previous? Then you could go back to the file from before the crash and continue even without backing up your data (which you of course should do anyway). I mean, the XML files are 16 KB each, which is nothing compared to the amount of space the project's files take up, so you wouldn't even notice it. Couldn't this quick and dirty solution increase the number of succeeding WUs? |
Joined: 23 Feb 05 Posts: 55 Credit: 240,119 RAC: 0 |
Very good point, Lunkster. Of the 760 hours you completed, maybe only the last few bits are corrupt. Restarting the client from a point earlier than where it stopped should overwrite the corrupt data and would therefore have no impact on the model data. Furthermore, it would preserve your efforts. Was your machine in any way exposed to more heat than usual? I doubt it, but to be sure I would like to hear that from you. Regards, Eric. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Lunkster, I think the client_state files are part of BOINC, so you would need to ask them. Also, I think they have a suggestions / bug report forum. V4.25 has 2 links to their site built in. But it is an idea. Les |
Joined: 31 Jan 05 Posts: 5 Credit: 85,146 RAC: 0 |
I have no indications of overheating. What happened was that I had a few programs running, and when (out of curiosity) I wanted to open the globe to see the patterns of the model with "Show graphics", the client froze. After a while, when I was wondering what had happened, the client started up by saying the WU had ended and reported what you can see in the project's history. So to me it looks like something conflicted inside the client, but that's my unprofessional hypothesis. ;) |
Joined: 23 Feb 05 Posts: 55 Credit: 240,119 RAC: 0 |
> "I have no indications of overheating. What happened was that I had a few programs running, and when (out of curiosity) I wanted to open the globe to see the patterns of the model with "Show graphics", the client froze. After a while, when I was wondering what had happened, the client started up by saying the WU had ended and reported what you can see in the project's history. So to me it looks like something conflicted inside the client, but that's my unprofessional hypothesis. ;)" Prof or unprof, it seems to me you got the picture right. B.t.w., can I show my respect for your perseverance? I think I wouldn't have started running a new model if the same had happened to me. Actually, I plan to delay the next model run until the release of a fixed client. My first model ended after 30 hours due to my own wrongdoing *, but if the present one ends uncompleted, I'm done. I looked into the source code overview yesterday in relation to clientstate. My second impression is that things are not as simple as depicted earlier. The state file might be updated quite often, and read from other state files or the like. Best would be to have a procedure in place which, on error, goes back a few timesteps and tries again from there, instead of the present "abort on error". This procedure is already applied to models that run out of bounds or experience model crashes. * Using Task Manager, I killed one of the two hadsm... tasks because the "Graphics Window" was taking more CPU resources than it had before. The result was that both the graphics and the client ended, and so on. |
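
For anyone who wants to script the steps discussed in this thread (checking whether a state file still mentions the work unit, keeping dated copies of the state files, and copying client_state_prev.xml over client_state.xml), here is a minimal Python sketch. It is not an official BOINC tool. Only the two file names come from the thread; the function names, the `.broken` suffix, and the archive layout are made up for illustration, and the BOINC folder path is something you have to supply yourself. It assumes the BOINC client is stopped before anything is restored.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# File names as mentioned in the thread; everything else below is hypothetical.
STATE_FILES = ("client_state.xml", "client_state_prev.xml")


def count_mentions(state_file: Path, workunit: str) -> int:
    """Return how many times the work unit name occurs in one state file."""
    text = state_file.read_text(encoding="utf-8", errors="replace")
    return text.count(workunit)


def snapshot_state(boinc_dir: Path, archive_dir: Path,
                   now: Optional[datetime] = None) -> List[Path]:
    """Copy each existing state file into archive_dir under a dated name."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    archive_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for name in STATE_FILES:
        source = boinc_dir / name
        if source.exists():
            # e.g. client_state.20050312-010203.xml
            target = archive_dir / f"{source.stem}.{stamp}{source.suffix}"
            shutil.copy2(source, target)
            copies.append(target)
    return copies


def restore_previous_state(boinc_dir: Path) -> None:
    """Copy client_state_prev.xml over client_state.xml (client must be stopped)."""
    current = boinc_dir / "client_state.xml"
    previous = boinc_dir / "client_state_prev.xml"
    if not previous.exists():
        raise FileNotFoundError(previous)
    if current.exists():
        # Keep the suspect file instead of silently overwriting it.
        shutil.copy2(current, boinc_dir / "client_state.xml.broken")
    shutil.copy2(previous, current)
```

A cautious order of use would be `snapshot_state(...)` first, then `count_mentions(...)` on both files to see which one still refers to the work unit, and only then `restore_previous_state(...)`. Whether the model actually resumes afterwards is, as the posts above say, a matter of luck; a full backup of the BOINC folder remains the safer route.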