All work units received since 1-Aug-18 get a "Computation error"
Author | Message |
---|---|
Joined: 31 Aug 04 Posts: 2 Credit: 4,520,346 RAC: 0 |
The work units run for varying lengths of time, so I'm burning through a lot of CPU time without getting any credits since the project came back online on 1-Aug-18. Here's a typical work unit: https://www.cpdn.org/cpdnboinc/workunit.php?wuid=11584208. And here's another: https://www.cpdn.org/cpdnboinc/workunit.php?wuid=11606997. The work units also appear to fail on other computers. Is this a known issue, and is there something I can do about it? Thank you. |
Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
I started looking at that PC's failures, from the ones around early August until now. At first I thought that maybe you had incredibly bad luck with the SAM25 models, which have a pretty high failure rate overall. But then I saw that you had PNW, CAM and CAF failures as well. All the SAM and CAM failures were signal 11, while the PNW ones weren't. Did anything change on your PC around August 1st? Besides basic maintenance, such as blowing out the air ducts with compressed air when the PC is shut down and ensuring that there is some space between the vents and the surface it sits on, you could exclude the BOINC program files and data folders from antivirus scanning. On the failures that occurred after quite some time and some returned trickles, the stderr log has lots of "suspends". CPDN tasks are more prone to failure when there are lots of suspends. In the computing preferences in BOINC Manager, you should un-tick "Suspend when computer is in use" and "Suspend when non-BOINC usage is above xx percent" and tick "Leave non-GPU tasks in memory when suspended". Finally, you might want to set the climateprediction.net project to no new tasks, remove it from BOINC Manager, then re-add it in order to make sure there are no corrupted files in the projects/climateprediction.net directory. |
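For anyone who prefers a file to the BOINC Manager dialog, the three preference changes above can be sketched as a local override file. This is only a sketch under assumptions: the tag names (`run_if_user_active`, `suspend_cpu_usage`, `leave_apps_in_memory`) and the file name `global_prefs_override.xml` in the BOINC data directory are my understanding of how the BOINC client stores these settings, and the function name `build_prefs_override` is made up for this example. Check the BOINC documentation before relying on it.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(run_if_user_active=True,
                         suspend_cpu_usage=0.0,
                         leave_apps_in_memory=True):
    """Return the text of a preferences override as an XML string.

    Tag names are assumed to match the BOINC client's
    global_prefs_override.xml; they are not taken from this thread.
    """
    root = ET.Element("global_preferences")
    # 1 = keep computing while the computer is in use
    # (the "Suspend when computer is in use" box un-ticked).
    ET.SubElement(root, "run_if_user_active").text = (
        "1" if run_if_user_active else "0")
    # 0 = no limit on non-BOINC CPU usage
    # (the "Suspend when non-BOINC usage is above xx percent" box un-ticked).
    ET.SubElement(root, "suspend_cpu_usage").text = f"{suspend_cpu_usage:g}"
    # 1 = keep suspended tasks in memory
    # (the "Leave non-GPU tasks in memory when suspended" box ticked).
    ET.SubElement(root, "leave_apps_in_memory").text = (
        "1" if leave_apps_in_memory else "0")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML; saving it into the BOINC data directory is left
    # to the reader, since that path differs between installations.
    print(build_prefs_override())
```

If the file is used, the client would need to re-read its local preferences (or be restarted) before the settings take effect. Making the same changes in BOINC Manager should have the same result.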
Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
"I started looking at that PC's failures, from the ones around early August until now. At first I thought that maybe you had incredibly bad luck with the SAM25 models, which have a pretty high failure rate overall. But then I saw that you had PNW, CAM and CAF failures as well. All the SAM and CAM failures were signal 11, while the PNW ones weren't." I am not sure of the difference between a "signal 11" failure and anything else. But I had a pnw25 fail recently without signal 11. It may have been due to lack of space on my ramdisk; I was not around at the time to check it. https://www.cpdn.org/cpdnboinc/result.php?resultid=21263860 It seems that otherwise each of RogerM's failures can be explained by bad luck (very bad luck). It is very strange. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It may also be overclocking. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Do you Suspend BOINC, then Exit BOINC before shutting down the computer? Do you allow Windows to apply updates while climate models are running? |
Joined: 1 Sep 04 Posts: 161 Credit: 81,512,201 RAC: 928 |
I am having the same problem (signal 11) on one of my computers. I know there were a lot of segmentation violations in the past, but as I remember, most of those were on Linux machines. Mine, however, is a Windows 10 laptop. I am not overclocking, the CPU temperature is reasonable, and I am not installing Windows updates or suspending work. https://www.cpdn.org/cpdnboinc/show_host_detail.php?hostid=1317652 |
Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
"I am having the same problem (signal 11) on one of my computers. I know there were a lot of segmentation violations in the past, but as I remember, most of those were on Linux machines. Mine, however, is a Windows 10 laptop." Looks like all your recent failures were SAM25 models from batch 742. All of those had at least one failure on another PC before you downloaded them. This batch has a high failure rate relative to a lot of other batches. But I appear to have been lucky so far, with 4 completions and 3 more running with at least one trickle, and no failures from that batch. You're running 2 EU25s now, so hopefully you'll have more luck with them. |
Joined: 7 Dec 07 Posts: 1 Credit: 13,223,647 RAC: 967 |
I am having the same issues. Most of the recent ones are WAH, and they keep having errors. No overclocking, and over half the time I am not even on the computer while it is working. |
Joined: 7 Aug 04 Posts: 2183 Credit: 64,822,615 RAC: 5,275 |
"I am having the same issues. Most of the recent ones are WAH, and they keep having errors. No overclocking, and over half the time I am not even on the computer while it is working." Most of the errors on that computer are on SAM25 models with signal 11 errors. The SAM25 models are very sensitive and quite a few computers are having those problems. Hopefully you'll pick up some of the different WAH2 regions from now on. |