Message boards : Number crunching : finish file present too long
Author | Message |
---|---|
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
The other common factor - for the tasks that have a full MS debugger log - is that the final cause of failure is something like an "Unhandled Exception Record". It's not easy to decode the addresses in those records back to the originating module across the multiple sources, and I'm certainly no expert. But it might be another line of attack. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
Dave, it's because the timeout on the finish_file was too short in those earlier BOINC versions. I think Richard mentioned in an earlier post that it has since been raised to 10 minutes, which seems to solve it for busy systems. And it's the busy systems that are constantly suspending/resuming that seem to have the problem. We were wondering whether to impose some kind of limit on the BOINC version in the task XML, but personally I am reluctant to make the system any more complicated than it already is. Richard, I had a look at the debug output and I know what's going on. In this scenario, it appears the monitor code is being asked by the client to do some tidying up that it has already done, and it then fails somewhere. I've created an issue to look into it but it's not the highest priority. --- CPDN Visiting Scientist |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
Yes, PR 3019 on 12 Feb 2019. That comes in the timeline between v7.14 and v7.16 |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
An afterthought which occurred to me during my lunchtime walk. v7.14.2 is the last version which was compiled and distributed for 32-bit versions of Windows. People running those would be stuck without an upgrade route. But I've looked, and all the tasks mentioned in this thread are running under 64-bit versions of Windows. A 64-bit version of Windows will run 32-bit versions of BOINC - and my little Celeron is an example of a machine which could only run 32-bit versions. I bought it for debugging that fault. BOINC doesn't seem to report which bitness is in use (unless it's buried deeper in the scheduler contacts), so this is probably a dead-end sidetrack. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
Interesting point. I have to use an older version of BOINC because they abandoned 32-bit builds a while ago, which I need as WaH is still 32-bit. I've checked and I compile & link against BOINC v7.20.2. This is the latest version that still includes the Visual Studio 32-bit project files. It does have the code fix from the PR you mentioned. So the WaH side of things is OK? |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,700,823 RAC: 9,977 |
So far as I know. The error that kept my Celeron (and similar low-power devices) locked into 32-bit mode was a crashing bug in a very old 64-bit version of the external SSL library that was being distributed with the early 64-bit versions of BOINC. Conceptually, it would be very easy to download the v7.14.2 release sources, apply the trivial relaxed time limit patch, and make a v7.14.3 client available to anyone who comes up with a sensible reason for using it. But the sort of volunteers that return tasks with these errors are probably not, shall we say, enthusiastic debuggers. |
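
For anyone following the thread who wants to see why a longer limit helps busy hosts, here is a rough sketch of the kind of check being discussed. It is not the BOINC client code: the names `TaskState` and `finish_file_present_too_long` are made up for illustration, the 10-second "short" limit is purely an assumed example value, and the 600-second limit is just the "10 minutes" recalled above.

```python
from dataclasses import dataclass

# Illustrative values only, not taken from the BOINC sources.
SHORT_TIMEOUT_S = 10.0      # assumed example of a limit that is too short
RELAXED_TIMEOUT_S = 600.0   # the roughly 10 minutes recalled in the thread


@dataclass
class TaskState:
    """Hypothetical, simplified view of a task as the client might see it."""
    finish_file_seen_at: float  # time the finish file was first noticed (seconds)
    process_exited: bool        # whether the science app has actually exited


def finish_file_present_too_long(task: TaskState, now: float, timeout_s: float) -> bool:
    """Return True if the task would be failed under this simplified rule.

    The task is only failed when the finish file has been present for longer
    than timeout_s while the application process is still running.
    """
    if task.process_exited:
        return False
    return (now - task.finish_file_seen_at) > timeout_s


# A busy host: the app is still running 45 seconds after writing the finish file,
# for example because it keeps being suspended and resumed.
task = TaskState(finish_file_seen_at=1000.0, process_exited=False)
now = 1045.0

print(finish_file_present_too_long(task, now, SHORT_TIMEOUT_S))    # failed under the short limit
print(finish_file_present_too_long(task, now, RELAXED_TIMEOUT_S))  # tolerated under the relaxed limit
```

The point of the sketch is only that the same slow shutdown is an error under one limit and harmless under the other, which matches the observation that the hosts constantly suspending and resuming are the ones affected.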
©2024 cpdn.org