Naughty Intel patch - has anybody tried it?
Author | Message |
---|---|
Send message Joined: 29 Aug 04 Posts: 17 Credit: 376,399 RAC: 0 |
One of the work units on my dual processor machine is in danger of being completed after its deadline. I was thinking of trying Swallowtail's naughty Intel patch on 5.08 to try and speed things up. http://www.swallowtail.org/naughty-intel.html Are the climateprediction executables compiled with the Intel compiler? I remember something about the application being written in Fortran. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, it is compiled with an Intel Fortran compiler. No, there isn't a project deadline, just a BOINC deadline, which the BOINC software requires but which cpdn ignores. |
Send message Joined: 29 Aug 04 Posts: 17 Credit: 376,399 RAC: 0 |
"Yes, it is compiled with an Intel Fortran compiler." Hmm, sounds like it just might work then. Let me give it a whack and I'll let you know how well it did. |
Send message Joined: 29 Aug 04 Posts: 17 Credit: 376,399 RAC: 0 |
Okay, I've just patched the executables and restarted BOINC manager. The executables seem to be running and getting CPU time right now. Patch results: hadcm3trans_5.08_windows_intelx86.exe was patched in 3 places; hadcm3transse_5.08_windows_intelx86.exe was patched in 3 places; hadcm3transum_5.08_windows_intelx86.exe was patched in 6 places. I'm not really sure how to measure performance between before and after the patches. I guess I'll take a look at my RAC graph in about a week or so, and let you know how it goes. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
The coupled model is compiled with Intel Fortran 9, and there are no SSE optimisations anymore since they were not stable. To check the speed, multiply the time between checkpoints in minutes by 0.139; that gives the current s/TS (seconds per timestep). Intel is a little bit faster now, but not more than 3-5% according to Geophi. |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
There was a lot of discussion about this situation at the time. As far as I know, Carl already applied the patch. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
The executables were compiled without SSE/SSE2 optimizations. Since the patch only enables SSE/SSE2 on AMD if the executable was compiled with SSE/SSE2 switches, it should make no difference. Original tests of the coupled model were unstable when using those compiler optimizations because of the precision needed in coupling the ocean with the atmosphere. The original thread on the phpBB forum is here. Most of the thread was posted when the patch could actually help the AMD chips (slab and sulfur models), i.e. before the coupled model came out. |
Send message Joined: 29 Aug 04 Posts: 17 Credit: 376,399 RAC: 0 |
Each trickle seems to take longer and longer than the last, until at some point it reverts to a faster time. Perhaps if I can observe the rate of increase slowing a little, then we'll have something. But regardless, any change will take at least a few days to show up, so I'll have to look at my trickles (much) later. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
So, geophi, out of interest: Those "tricks" for AMD boxes only help with slab and sulphur models? Or is there anything that helps with coupled models as well? Got an AMD here myself, and although I'm really content with its CPDN performance I guess a little extra boost couldn't hurt ;-) |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
"So, geophi, out of interest: Those 'tricks' for AMD boxes only help with slab and sulphur models? Or is there anything that helps with coupled models as well? Got an AMD here myself, and although I'm really content with its CPDN performance I guess a little extra boost couldn't hurt ;-)" No tricks possible as far as I know. Even in the end for sulphur/slab, the latest version of the Intel compiler didn't hurt AMD much if at all. Looking at your sec/TS, I would say you should be doing closer to 1.90-1.95 for your system, IF your memory is running at 1T command rate. You might want to check if it is. Command rate can have various names in BIOS depending on the motherboard. Run Prime95 and/or memtest86+ for several hours to ensure it is stable at 1T if you change it. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
No experience with overclocking and stuff here, but I'll have a look and see if it is worth trying in my case. Thanks for your advice; I'm surprised that you think my system could be that much faster. Anything else that could be responsible for the slowdown? I don't use graphics a lot, just for a quick look, maybe ten minutes a day or less... so that can't be it... won't be "write to disk" either that slows it down because I've got a fast SATA-disk and benchmarked very high in the PC Mark "HDD score"... But I switched "leave task in memory" to "off"... couldn't help it really ;-) after 2 years with only a slow "Office" Laptop with tons of RAM issues (meaning I always had too little) I didn't want to risk stuffing my memory too full, but I've found out CPDN (when it's running) doesn't seem to hurt my performance at all even when I'm gaming or heavily multitasking, so if you think it makes a huge performance difference I would try switching that option. |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
My AMD PC started off as an Athlon 64 3000, which was getting 1.7s/ts after I overclocked it by 40% (1.8GHz to 2.45GHz); before overclocking it would have been something like 2.4 s/ts. I later replaced the single core chip with a dual core, and now get something like 2 x 1.8s/ts with the same clock speed etc. The most important thing with this project is stability rather than speed - it's better to have a stable model which runs slightly slower than a faster PC which keeps crashing, so the stability testing suggested is very important (run Prime95 for at least 24 hours). What I did was to get a stable overclock, and then drop it by 1% or so to make it more stable. I'm a volunteer and my views are my own. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
"No experience with overclocking and stuff here, but I'll have a look and see if it is worth trying in my case. Thanks for your advice; I'm surprised that you think my system could be that much faster." I'm not really suggesting overclocking. This is just changing one memory timing (if that is indeed set at 2T now). Some motherboards default to the most lax timings possible, even though the RAM and motherboard can handle better timings. It's worth a shot. You can download Everest 2.20 and check the memory timings by going to Motherboard -> Chipset -> Northbridge -> AMD Hammer and checking out the Command Rate setting. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Thanks, I'll check this out when I'm a bit more awake (it's 1:30 AM here). What about the project settings, do you think "leave in memory" is worth trying as well? I recently read from a few people that it's important to have this turned on, but is that due to stability concerns (absolutely no problems there in about a year of BOINC, all with this option turned off) or is it about performance? |
Send message Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
"Thanks, I'll check this out when I'm a bit more awake (it's 1:30 AM here). What about the project settings, do you think 'leave in memory' is worth trying as well? I recently read from a few people that it's important to have this turned on, but is that due to stability concerns (absolutely no problems there in about a year of BOINC, all with this option turned off) or is it about performance?" It's about how much you want to redo. CPDN checkpoints less often than smaller projects, and if you remove it from memory, it always has to restart from the last checkpoint; if you were just seconds from the next checkpoint, all that time is run twice. If you leave it in memory, it remembers where it is. This only makes a difference if you are doing multiple projects. This also does not matter if you are shutting down BOINC for whatever reason. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Well, I'm crunching two projects on this box atm, so I often get hourly switches between projects. I'll try changing my settings and see if it hurts my PC's performance a lot and how much of an advantage it is with CPDN... |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
It probably makes around 10% difference on average, but it'll depend very much on the speed of your PC (i.e., after an hour of running, how far is it into the current checkpoint interval? It may be that it has always just completed a checkpoint, or alternatively is always just about to complete one). I'm a volunteer and my views are my own. |
Send message Joined: 13 Oct 06 Posts: 60 Credit: 7,893 RAC: 0 |
Haven't checked that out (and anyway, as CPDN has a higher percentage, it also happens that it crunches for 2 or 3 hours in a row, which would change that value again, apart from my impression that the speed varies quite a bit)... but 10% sounds quite a lot to me, that would be about what geophi thinks my computer lacks in speed. |
Send message Joined: 3 Mar 06 Posts: 96 Credit: 353,185 RAC: 0 |
"Well, I'm crunching two projects on this box atm, so I often get hourly switches between projects." I think the recommended BOINC version waits until the science app checkpoints before switching projects, so project switches are not as big a concern as they used to be. Still, you might get a wee bit more efficiency (and it may be a very wee bit) by adjusting the "switch between projects time" setting up to something like 4 hours or more; even 6 is not unreasonable. Where will the efficiency gain occur? Well, when BOINC switches to a different project it removes the current project's data and program from RAM and writes it to virtual memory (the swap file) if you have the "keep in memory" option turned on. That save to virtual memory takes time, maybe very little time, maybe more, but it's time nevertheless. You may notice it as a slight pause during a game or you may not. Either way it's time that could be used for crunching, gaming or whatever. And for every time it writes the RAM to virtual memory it has to read it from virtual memory back into RAM when it switches back to that project, so we're talking 2 pauses, not just 1. Whether or not those 2 wee pauses are important is a personal choice. Since it doesn't take much effort to reduce the number of pauses, I've chosen to reduce them. You can adjust the "switch between projects time" setting either in your General settings on a BOINC project website or else in your global preferences override file if you use one. At the website the setting is called "Switch between applications every" and it's found in the "Processor Usage" section. In the global_prefs_override.xml file, if you use one, the setting is <cpu_scheduling_period_minutes>. The units are minutes in the override file as well as in the website settings. If you use an override file then there is no need to set it on the website. Love the override file. |
Send message Joined: 3 Mar 06 Posts: 96 Credit: 353,185 RAC: 0 |
Excuse this double post but I think my previous post begs the question "What happens if I don't have the 'keep in memory' option turned on?" In that case, when BOINC switches projects or stops processing the current WU because you have the "Do work while computer is in use" set to NO and you go from inactive to active by moving the mouse or hitting a key, BOINC kicks the project data and program out of RAM but does NOT write it to virtual memory. When the project data needs to be read back into RAM it must be read from the checkpoint file, which is slower than reading it from virtual memory. Likewise, the program (the science app) must be read in from the science app file which, again, is slower than reading it from virtual memory. So there you have it, a more complete answer for the more curious. |
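
For anyone wanting to try the checkpoint timing rule mentioned earlier in the thread (minutes between checkpoints times 0.139 gives s/TS), here is a minimal Python sketch. The 0.139 factor is taken from the thread as-is, since its derivation is not given there. The function names and the sample timings are hypothetical, and the figure is probably only meaningful if the model had a CPU core to itself for the whole interval between the two checkpoints.

```python
# Factor quoted in the thread; used unchanged because the thread does not
# explain how it was derived.
FACTOR_FROM_THREAD = 0.139


def sec_per_timestep(minutes_between_checkpoints: float) -> float:
    """Estimate s/TS from the wall-clock minutes between two checkpoints."""
    if minutes_between_checkpoints <= 0:
        raise ValueError("checkpoint interval must be positive")
    return minutes_between_checkpoints * FACTOR_FROM_THREAD


def relative_change(before_minutes: float, after_minutes: float) -> float:
    """Fractional change in s/TS between two measurements.

    A negative result means the second measurement is faster.
    """
    before = sec_per_timestep(before_minutes)
    after = sec_per_timestep(after_minutes)
    return (after - before) / before


if __name__ == "__main__":
    # Hypothetical timings: 14.0 minutes per checkpoint before a change,
    # 13.3 minutes afterwards.
    print(f"before: {sec_per_timestep(14.0):.3f} s/TS")
    print(f"after:  {sec_per_timestep(13.3):.3f} s/TS")
    print(f"change: {relative_change(14.0, 13.3):+.1%}")
```

Comparing two such measurements, rather than waiting for the RAC graph to move, should show a difference of a few percent within a handful of checkpoints.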
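
The "how far is it into the current checkpoint interval" point made earlier in the thread can be put into numbers with a small Python sketch. It rests on simplifying assumptions that are not from the thread: the task is removed from memory at every project switch, it restarts exactly at its last checkpoint, the checkpoint interval is constant, and reload time is ignored. The function names and sample intervals are made up for illustration.

```python
def lost_minutes_per_switch(run_minutes: int, checkpoint_minutes: int) -> int:
    """Minutes of work discarded when the task is swapped out.

    Each run period is assumed to start at a checkpoint, so the work done
    since the last completed checkpoint is the remainder of the division.
    """
    if run_minutes <= 0 or checkpoint_minutes <= 0:
        raise ValueError("both intervals must be positive")
    return run_minutes % checkpoint_minutes


def lost_fraction(run_minutes: int, checkpoint_minutes: int) -> float:
    """Share of each run period that has to be computed again."""
    return lost_minutes_per_switch(run_minutes, checkpoint_minutes) / run_minutes


if __name__ == "__main__":
    # Hypothetical checkpoint intervals for an hourly project switch.
    for checkpoint in (15, 14, 22):
        lost = lost_minutes_per_switch(60, checkpoint)
        share = lost_fraction(60, checkpoint)
        print(f"checkpoint every {checkpoint} min: {lost} min redone ({share:.0%})")
```

Under these assumptions a machine that checkpoints every 15 minutes loses nothing on an hourly switch, while one that checkpoints every 22 minutes redoes 16 of every 60 minutes, which is why the loss depends so much on the speed of the PC.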
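
For the override file described above, a minimal Python sketch that generates the XML might look like this. Only the file name global_prefs_override.xml and the <cpu_scheduling_period_minutes> tag come from the thread. The <global_preferences> root element and the <leave_apps_in_memory> tag are assumptions based on my understanding of the BOINC preferences format, and build_override is a hypothetical helper, so check the result against your client's documentation before using it.

```python
import xml.etree.ElementTree as ET


def build_override(switch_minutes: int, leave_in_memory: bool) -> str:
    """Return the text of a preferences override file as an XML string."""
    if switch_minutes <= 0:
        raise ValueError("switch_minutes must be positive")

    # Root element name is an assumption, not taken from the thread.
    root = ET.Element("global_preferences")

    # Tag named in the thread; the units are minutes.
    period = ET.SubElement(root, "cpu_scheduling_period_minutes")
    period.text = str(switch_minutes)

    # Assumed tag for the "leave in memory" option discussed in the thread.
    keep = ET.SubElement(root, "leave_apps_in_memory")
    keep.text = "1" if leave_in_memory else "0"

    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root, space="   ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 240 minutes corresponds to the 4 hour switch interval suggested above.
    print(build_override(240, True))
```

The output would be saved as global_prefs_override.xml in the BOINC data directory, after which the client has to re-read its preferences.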