Message boards : Cafe CPDN : Computation error
Author | Message |
---|---|
Joined: 15 Jun 07 Posts: 3 Credit: 985,430 RAC: 967 |
Most of my BOINC projects take a few minutes to a few hours, but CP takes days to run. So I was bummed that after two days (blocking other projects, because "Switch between tasks every N minutes" isn't working) my two CP tasks aborted simultaneously with "Computation error" after a restart, despite checkpointing. I'm sorry to have to abandon CP; I don't want it wasting more cycles. These were my first two tasks since rejoining CP after a few years away; I don't remember why I left, perhaps for the same reason. 16 GB iMac, macOS 10.13.6. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
BOINC has to be Suspended, and then Exited, BEFORE any computer restart. And the only model type that's currently available for the Mac is very touchy anyway. Also, see "Why Macs are on the way out at cpdn" at the top of the Macintosh section. Thanks for trying. This isn't an easy project to handle. |
Joined: 15 Jun 07 Posts: 3 Credit: 985,430 RAC: 967 |
I meant a restart of BOINC; the Mac wasn't rebooted. But I would think (naively?) that the last CP checkpoint should *always* survive *any* kind of restart. Thanks for your efforts; I'm a volunteer board mod elsewhere. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
That's another of the many problems with this project: the large number of files that are open. I don't recall anyone having looked into it, but "checkpointing" may not be the same as "all files saved". Stopping or shutting down parts of it may just happen to occur while some of the files are still waiting to be saved. Back when there were still graphics, with some info about the model's state on them, I used to wait until the countdown timer (to the next checkpoint) showed zero, and then a few seconds more, before I Suspended that model. And each model was Suspended individually, before Suspending BOINC, and then Exiting BOINC. I don't know how much overkill this was, but it worked, and it didn't take long. The new modelling programs seem to need lots of TLC on certain types of OS, and certain versions of the OS. For example, Windows 10 may be the cause of a lot of the failures with the South American models (sas25), many of which fail at about 3 minutes. But on my Linux Mint computers, running the latest version of WINE with a Windows version of BOINC, I don't have that problem. |
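
The shutdown order described in the last post (suspend each model individually, then suspend BOINC, then exit BOINC) could be scripted roughly as follows. This is only a sketch, not anything the project provides: it assumes the `boinccmd` command-line tool that ships with BOINC is on the PATH, the project URL is a placeholder that should be checked against the output of `boinccmd --get_project_status`, and the task names are hypothetical. It does not wait for a checkpoint, so that part is still a manual judgment call.

```python
import subprocess
from typing import List

# Assumption: placeholder project URL. Use whatever URL
# `boinccmd --get_project_status` reports for the project on your machine.
PROJECT_URL = "https://climateprediction.net/"


def shutdown_commands(task_names: List[str],
                      project_url: str = PROJECT_URL) -> List[List[str]]:
    """Build the boinccmd calls in the order described in the thread."""
    # 1. Suspend each model (task) individually.
    commands = [["boinccmd", "--task", project_url, name, "suspend"]
                for name in task_names]
    # 2. Suspend computation for the whole BOINC client.
    commands.append(["boinccmd", "--set_run_mode", "never"])
    # 3. Tell the client to exit.
    commands.append(["boinccmd", "--quit"])
    return commands


def run_shutdown(task_names: List[str], dry_run: bool = True) -> None:
    """Print the commands (dry run) or actually execute them in order."""
    for cmd in shutdown_commands(task_names):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical task names; real ones are listed by `boinccmd --get_tasks`.
    run_shutdown(["example_task_0", "example_task_1"])
```

With `dry_run=True` (the default here) the script only prints the commands, so the order can be checked before anything is actually suspended.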
©2024 cpdn.org