Message boards : Number crunching : WaH batches 996 & 1001 have been closed
Author | Message |
---|---|
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
The project scientists have enough results from batches 996 & 1001 to compare against results from the new version of the app running the identical batches 1006 & 1007. Comparison shows that the newer v8.29, recently recompiled, produced slightly warmer temperatures in the winter months, compared to the old version 8.24. The differences are not statistically significant (and not unexpected). WaH v8.29 is much more stable with very few hard fails and correctly restarts on a host power cycle. There will be no more new batches using WaH v8.24. --- CPDN Visiting Scientist |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,016,442 RAC: 21,024 |
"WaH v8.29 is much more stable with very few hard fails and correctly restarts on a host power cycle. There will be no more new batches using WaH v8.24."
Good news indeed! I will abort my 1001 resends. Well done for getting this sorted. |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
"Comparison shows that the newer v8.29, recently recompiled, produced slightly warmer temperatures in the winter months, compared to the old version 8.24. The differences are not statistically significant (and not unexpected)."
Good news! Is there a reason model result changes were "not unexpected"? Fixing correctness issues shouldn't alter results... I'd think... but the WaH stuff seems a bit special case as far as code goes.
Even better news! I won't miss Windows Update or a power outage trashing a CPU-month or two of work. Thank you so much for your work on improving this code! |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
"Comparison shows that the newer v8.29, recently recompiled, produced slightly warmer temperatures in the winter months, compared to the old version 8.24. The differences are not statistically significant (and not unexpected)."
"Good news! Is there a reason model result changes were 'not unexpected'? Fixing correctness issues shouldn't alter results... I'd think... but the WaH stuff seems a bit special case as far as code goes."
Atmospheric models are non-linear in nature. New compilers can cause code optimization differences, as can new library versions, etc. Differences in model results can come from the cloud/convection code, as one example. It tests whether water vapour saturation in cloud-free air is above a particular value, so a single-bit difference can decide whether a cloud forms or not. A cloud represents a change of state: water vapour condenses out, which is a non-linear process and changes the properties of the air parcel and its environment. --- CPDN Visiting Scientist |
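To make the single-bit point concrete, here is a minimal Python sketch. It is not the WaH cloud scheme: `SATURATION_THRESHOLD` and `cloud_forms` are hypothetical names invented for illustration, the threshold value is arbitrary, and `math.nextafter` needs Python 3.9 or later. It only shows that two inputs differing in the last bit can land on opposite sides of a threshold test.

```python
import math

# Hypothetical threshold, chosen for illustration only (not a value from the model).
SATURATION_THRESHOLD = 0.6


def cloud_forms(saturation: float) -> bool:
    """Toy stand-in for a cloud scheme's threshold test."""
    return saturation > SATURATION_THRESHOLD


# Two saturation values that differ only in the last bit of the mantissa,
# as might happen between two builds of the same code.
old_build = 0.6
new_build = math.nextafter(old_build, math.inf)  # next representable double above 0.6

print(new_build - old_build)    # a difference of one unit in the last place
print(cloud_forms(old_build))   # threshold not exceeded: no cloud
print(cloud_forms(new_build))   # threshold exceeded: cloud forms
```

Once the two runs disagree about a cloud, they are no longer simulating the same air parcel, so the tiny input difference has turned into a physical one.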
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
*rant* I just don't understand how a "Windows Update" can be allowed to stop and restart the system anytime, idiotic.. |
Send message Joined: 5 Aug 04 Posts: 127 Credit: 24,449,214 RAC: 23,620 |
"WaH batches 996 & 1001 have been closed"
What does this mean in practice? Does it mean that continuing to crunch will just be a waste of electricity, since anything returned is just dumped? Or is it still useful to continue crunching these until they either finish or crap out on the next reboot? |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
Yeah, I understand they're exceedingly sensitive to disturbances and such, with non-linear follow on effects. I just work in a space where if the code generated different results based on the compiler, we'd be running down the bugs. But I also like to think x86 floating point, even vector, is well enough defined that you shouldn't get differences between chips, and I'm aware that's a falsehood - I just don't work in floating point spaces. Just interesting. I suppose if it's reordering some of the rounding operations and such you can get subtly different output from a series of operations. I like my computers deterministic, darn it! :p
"*rant* I just don't understand how a 'Windows Update' can be allowed to stop and restart the system anytime, idiotic.."
There's probably some way to disable it. I don't really "do Windows" anymore, so I'm not sure how to do it. I have Linux compute rigs, and when there's non-Linux work (Windows, 32-bit Intel Mac, etc), I spin up VMs for the duration of the work, and then destroy them when done, because I don't have enough disk space to store all of them on the compute rigs, and "copying VMs around between hosts" causes some very interesting failures when two systems are identical enough that they get the same computer ID and start smashing each other's work allocation.
What's extra double special is that unless you change some other notification settings, it's likely to install updates, reboot, and then sit at the "But would you pretty please make an Online Microsoft Account????" nag screen (which doesn't allow any compute to start). No, you blasted OS, I created an offline account, through your increasingly troublesome process (now you have to actually not have a network connection at all to even see the option), because I wanted an offline account! |
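The guess about reordering is on the right track: floating-point addition is not associative, so if a new compiler or optimization level regroups a sum, the last bits of the result can change even though nothing is "wrong". A minimal Python sketch with arbitrary illustrative values, nothing taken from the model code:

```python
# The same three numbers, added in two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

print(left_to_right == regrouped)       # False: the two groupings disagree
print(abs(left_to_right - regrouped))   # tiny, but not zero
```

Each grouping is a correctly rounded sequence of additions, so neither result is a bug; they simply round at different points.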
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
You might as well abandon them as any results they return will be discarded. This will save you a bit of electricity, and reduce the workload on the project servers a little bit. |
Send message Joined: 6 Aug 04 Posts: 195 Credit: 28,342,480 RAC: 10,485 |
I've received a Batch 995 WAH 8.24 retread from July 2023. wah2_nz25_20aa_209105_25_995_012220768. https://www.cpdn.org/workunit.php?wuid=12220768 Should this WU be crunched or abandoned, please? Thank you. |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
It's not 996 or 1001, so crunch it, far as I know. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
"I've received a Batch 995 WAH 8.24 retread from July 2023. wah2_nz25_20aa_209105_25_995_012220768."
A closed batch doesn't send out retries. --- CPDN Visiting Scientist |
©2024 cpdn.org