HadAM3P HadRM3P PNW Visual Fortran failures
Author | Message |
---|---|
Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
I picked up a bunch of these this morning. 15 of them failed after 4 seconds of CPU time with Visual Fortran errors. This is across 3 separate machines (all the same config, dedicated BOINC crunchers). Looking at them, the wingman has also failed after 4 seconds, so I don't think it's just me. I left some of them going for 10 hours (elapsed time) and they show up in BOINCtasks with zero CPU time, no checkpoint and 48-52 MB of memory in use. The ones that work have non-zero CPU time, do checkpoint and use around 148-162 MB of memory. I decided I needed to access the machines after they didn't appear to progress, and sure enough the Visual Fortran popups were there. I have 5 more that seem to be running across the 3 machines. Links to some of them: No 1 No 2 No 3 Edit: Looking through the Visual Fortran thread, which was for different models, it would seem that Windows and Intel iGPUs are a common denominator. These machines have (but weren't using) Intel HD Graphics 4000. I don't use the BOINC screensaver or look at the model's graphics. BOINC blog |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
My understanding is that a service installation of BOINC will not display the message boxes. The errors will still occur but silently. The message boxes sometimes occur because of a local problem on the machine, in which case the model will probably continue if the problem is transient. However, they also happen because a model is in the process of crashing and will crash on all similar machines, in which case the model will also crash in service mode. The service installation therefore reduces the amount of manual intervention and allows the machine (and you) to get on with something useful. |
Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
Thanks Iain. Unfortunately, if I do a service mode install I would lose the ability to use the iGPU for crunching, even though it wasn't doing any at that time. From what I gather in the other Visual Fortran thread, it's to do with the graphics app not working with the Intel iGPU under Windows. Is this correct? BOINC blog |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
It may well be that a form of that message arises from the graphics but models that never use the graphics also get the message. I've lost track of which applications do or don't have graphics or which graphics actually work on which platform, so I never start the graphics but still get that error from time to time. My normal practice is to do the service install but a BOINC version problem could be worked around by switching out of service mode. Distributed computing isn't supposed to be this difficult ... |
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Some PNW retreads downloaded to my machines and most failed. Some showed the popup but not all -- some simply showed 'Running' in Status but did nothing and accumulated no time. They were summarily aborted. All the faulty tasks were in the "w" series. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
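
The symptoms described in the first post (zero CPU time and no checkpoint after hours of elapsed time, with only about 48-52 MB of memory in use, against 148-162 MB for healthy tasks) could be turned into a simple check. What follows is only a rough sketch in Python: the `TaskSnapshot` record, the `looks_stalled` function and both thresholds (one hour of elapsed time, a 100 MB memory floor) are assumptions made for illustration, not anything provided by BOINC or BOINCtasks, and the figures would have to be copied in by hand or from whatever monitoring tool is in use.

```python
from dataclasses import dataclass


@dataclass
class TaskSnapshot:
    """Hypothetical record of one task, filled in from a monitoring tool."""
    name: str
    elapsed_hours: float
    cpu_seconds: float
    has_checkpoint: bool
    memory_mb: float


def looks_stalled(task, min_elapsed_hours=1.0, memory_floor_mb=100.0):
    """Return True if the task matches the stalled pattern from the thread.

    Both thresholds are assumed values: the memory floor simply sits
    between the reported 48-52 MB (stalled) and 148-162 MB (healthy).
    """
    if task.elapsed_hours < min_elapsed_hours:
        # Too early to judge; a task that has only just started is not flagged.
        return False
    return (
        task.cpu_seconds == 0
        and not task.has_checkpoint
        and task.memory_mb < memory_floor_mb
    )


if __name__ == "__main__":
    # Made-up example values that mirror the two patterns in the first post.
    tasks = [
        TaskSnapshot("example_task_a", 10.0, 0.0, False, 50.0),
        TaskSnapshot("example_task_b", 10.0, 30000.0, True, 155.0),
    ]
    for task in tasks:
        status = "possibly stalled" if looks_stalled(task) else "looks healthy"
        print(f"{task.name}: {status}")
```

A task flagged this way would still need a look at the machine itself, since the thread reports that some faulty tasks showed the Visual Fortran popup and others did not.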
©2024 cpdn.org