Questions and Answers : Windows : Job crashing without resuming
Author | Message |
---|---|
Send message Joined: 27 Mar 06 Posts: 4 Credit: 72,449 RAC: 0 |
Currently I use BOINC 5.2.13 running two CPDN jobs, and it's very unstable on my Opteron 275 workstation (WinXP Pro). Sometimes it crashes every few hours with the same problem: a Visual Fortran runtime error. After the first few crashes it resumed the jobs, but yesterday one of the jobs (hadcm3lb_5gz2_05043258) resumed while the other (hadcm3lb_5gz3_05043259) simply disappeared. Its directory is untouched, but the XML file referring to it within BOINC\projects\climateprediction.net\ disappeared too. When BOINC connected to the CPDN server it requested a new job (hadcm3lb_592q_05033022) and tried to run it. I aborted the new job and want to resume the previous one. So what should I do to resume it? I completed two CPDN runs using the good old .NET client and everything was almost perfect. But your brand-new-shiny-state-of-the-art BOINC seems to be a piece of crap like everything else from Berkeley :( |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
If you want hadcm3lb_5gz3_05043259 back, then start praying for a miracle, because it's crashed, and gone for good. See the result page for hadcm3lb_5gz3_05043259. |
Send message Joined: 27 Mar 06 Posts: 4 Credit: 72,449 RAC: 0 |
Very sad. The error code seems to correspond to this: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=4231 Anyway, for the future, would it help to prevent such situations if I made daily automated backups of the BOINC directory and restored it from backup after similar crashes? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, backups are a great idea. At one point last month I was making a backup every 12 hours, until I got past whatever the problem was. For a while, I didn't know if my middle name was "Backup", or "Paranoid". The only problem with an automated backup is that BOINC MUST be exited before the backup, or else the files will get out of sync. And there's at least one 'lockfile'. |
Send message Joined: 27 Mar 06 Posts: 4 Credit: 72,449 RAC: 0 |
Thanks a lot. I've already found the lockfiles and automated the backup process. All we need are two scheduled tasks: 'boinccmd.exe --quit' before the backup time and 'boinc.exe' after it. But it shows the greatest problem of BOINC: casual people with no serious motivation (and no wish to deal with permanent bugs) would quit the project after a couple of similar crashes. Actually I had trouble simply trying to run it, because one of the BOINC processes locked all the downloaded data files and didn't want to release them in order to start computing; it took about one hour to solve this problem and make BOINC work. So I wouldn't be surprised to hear that CPDN loses participants because of BOINC. |
Send message Joined: 13 Sep 04 Posts: 228 Credit: 354,979 RAC: 0 |
"But it shows the greatest problem of BOINC: casual people with no serious motivation (and no wish to deal with permanent bugs) would quit the project after a couple of similar crashes. Actually I had trouble simply trying to run it, because one of the BOINC processes locked all the downloaded data files and didn't want to release them in order to start computing; it took about one hour to solve this problem and make BOINC work. So I wouldn't be surprised to hear that CPDN loses participants because of BOINC." Never heard of such a problem; how did you solve it? |
Send message Joined: 27 Mar 06 Posts: 4 Credit: 72,449 RAC: 0 |
"Never heard of such a problem; how did you solve it?" I didn't find any proper way to solve it. Fortunately it was solved by the last and most stupid thing I could do: I uninstalled BOINC and installed it again. And then everything started. |
Send message Joined: 11 Jun 05 Posts: 67 Credit: 1,222,916 RAC: 0 |
How do you back up successfully? My CPDN model has just crashed. [Second time. The first one got up to 168 hours. This one has just crashed after 125 hours... huh?!? I might give up: it's a waste of electricity if the program is so unstable that it loses all data when it crashes.] I re-booted the PC and the model has gone from the BOINC Manager. Do you just copy the ClimatePrediction.net folder from the D:\Programs\Boinc\projects\climateprediction.net directory? Simple as that? Neil. "Yes, backups are a great idea. At one point last month I was making a backup every 12 hours, until I got past whatever the problem was. For a while, I didn't know if my middle name was 'Backup', or 'Paranoid'." |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"Do you just copy the ClimatePrediction.net folder from the D:\Programs\Boinc\projects\climateprediction.net directory?" Yes. But only after you menu/Exit from BOINC. Otherwise the many files get out of sync, and the backup is useless. After a restore, do a re-boot, to clear out of memory any leftovers from the crash. Your models are failing for a variety of reasons, including: exit code -1073741819, which is a graphics problem, and a couple with an exit code of 1. (I forget what that is.) It may help to read the sticky right at the top of this "Windows" help board. For the graphics, an update of the driver from the card maker's website sometimes fixes it. But for graphics-heavy programs, you either need more RAM, or to menu/Suspend BOINC before using them. |
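
The backup routine described in this thread (exit BOINC, copy the directory, start BOINC again) could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested CPDN procedure: the function name `backup_boinc`, the `settle_seconds` pause, and the timestamped folder naming are hypothetical, and the stop and start commands are simply whatever you pass in (the thread mentions `boinccmd.exe --quit` and `boinc.exe`).

```python
import shutil
import subprocess
import time
from datetime import datetime
from pathlib import Path


def backup_boinc(boinc_dir, backup_root, stop_cmd, start_cmd,
                 run=subprocess.run, launch=subprocess.Popen,
                 settle_seconds=30):
    """Stop BOINC, copy its directory to a timestamped folder, restart BOINC.

    All names and defaults here are illustrative assumptions.
    """
    boinc_dir = Path(boinc_dir)
    # Hypothetical naming scheme: one folder per backup, e.g. boinc-20060328-030000
    target = Path(backup_root) / datetime.now().strftime("boinc-%Y%m%d-%H%M%S")

    # 1. Ask the client to exit so nothing is writing files during the copy.
    run(stop_cmd, check=True)
    # Assumed pause to let the client and the model finish shutting down.
    time.sleep(settle_seconds)

    try:
        # 2. Copy the BOINC directory as it is on disk after the exit.
        shutil.copytree(boinc_dir, target)
    finally:
        # 3. Start the client again even if the copy failed.
        launch(start_cmd)
    return target
```

A call might look like `backup_boinc(r"D:\Programs\Boinc", r"E:\Backups", [r"D:\Programs\Boinc\boinccmd.exe", "--quit"], [r"D:\Programs\Boinc\boinc.exe"])`, where both paths are placeholders to adjust for your own install. Keeping each backup in its own dated folder means an older copy is still available if the most recent one was taken after the model had already gone wrong.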