Thread 'Where is my bottleneck?'

Author	Message
old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29727 - Posted: 26 Jul 2007, 21:08:26 UTC
I think that there is something mucking up the efficient running of my model on my machine. To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences. I have scoured the message boards, the tips and guidelines, and everything else, and I can't account for my variance.
1) I'm NOT running the screensaver except when I want to check on the progress of the model; then I close the screensaver.
2) The model has 50% of my machine's CPU resources, and I have a 2.8 GHz dual-core, so it generally gets 100% of one of the cores, 24/7 (occasionally BOINC decides to run SETI or World Grid for an hour on CPDN's core for some reason, but not often).
3) I have 2 GB of RAM and a 2 GB swapfile, and the swapfile is NOT fragmented.
4) I have a Seagate 160 GB "Barracuda" hard drive.
5) The "hadcm3trans_5.42.windows.intelx86.exe" and "hadcm3transum_5.42.windows.intelx86.exe" tasks are set to NORMAL priority.
6) I suspend my projects, exit BOINC, back up the BOINC directory, defrag the hard drive, and restart my machine once per day.
7) CPU temperatures are normal, maybe on the cool side: currently 116 F with a room temperature of 67 F.
8) The relevant BOINC settings:
a) Do work while computer is in use?: YES
b) Leave applications in memory while suspended?: YES
c) On multiprocessors, use at most: 2
d) Use at most: 100%
9) Most importantly, Windows Task Manager shows the "hadcm3transum_5.42.windows.intelx86.exe" task at a solid 50% nearly 100% of the time.
So, if the "2.5 CPU seconds per timestep" benchmark is reasonable, can anyone see where I may be losing efficiency to the tune of nearly 50%?
Thanks, Ed
ID: 29727
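As a rough check of the "nearly 50%" figure in the post above, the arithmetic can be sketched in Python. This is only an illustration: it takes the 2.5 sec/TS figure at face value as a reference rate, as the post does, and the function name `throughput_loss` is made up for this example.

```python
def throughput_loss(observed_sec_per_ts: float, reference_sec_per_ts: float) -> float:
    """Fraction of throughput lost relative to a reference rate.

    Throughput is timesteps per second, i.e. 1 / (seconds per timestep).
    """
    observed_rate = 1.0 / observed_sec_per_ts
    reference_rate = 1.0 / reference_sec_per_ts
    return 1.0 - observed_rate / reference_rate


# Figures quoted in the post: 4.6 sec/TS observed, 2.5 sec/TS reference.
loss = throughput_loss(4.6, 2.5)
print(f"Throughput loss: {loss:.1%}")
```

With those two numbers the loss works out to roughly 46%, which matches the "nearly 50%" estimate, provided the 2.5 figure is a valid point of comparison for this machine.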

astroWX Volunteer moderator Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0	Message 29728 - Posted: 26 Jul 2007, 21:18:43 UTC - in response to Message 29727. Last modified: 26 Jul 2007, 21:22:26 UTC
> I think that there is something mucking up the efficient running of my model on my machine. To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences.
Ed,
"Estimated WU Completion Time @ 2.5 Seconds/Timestep computational average"
That is an informational statement meant as relative guidance to help inexperienced participants make a selection among Model types. It has nothing to do with your machine or its performance. Believe your Trickle times (which are averaged for the entire period of the Run).
[Edited for typo.]
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 29728

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29729 - Posted: 26 Jul 2007, 21:39:17 UTC
Hi AstroWX,
Thanks. I have no reason to believe that my trickle times are inaccurate, but I am now wondering what other users show for their "Avg sec/TS". Maybe this will shed some light.
Ed
ID: 29729

Lockleys Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0	Message 29730 - Posted: 26 Jul 2007, 21:52:10 UTC
I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively.
ID: 29730

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29731 - Posted: 26 Jul 2007, 22:11:46 UTC - in response to Message 29730.
> I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively.
Hi Lockleys,
That is very good information for me. It looks like there is definitely an issue here. A few questions:
1) How far into your model are you, percentage-wise?
2) If you look at the list of your trickle results for this model, are your s/TS values staying roughly the same, or are they changing (increasing/decreasing) as your model progresses?
Thanks very much, Ed
ID: 29731

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 29732 - Posted: 26 Jul 2007, 23:10:34 UTC Last modified: 26 Jul 2007, 23:11:14 UTC
Ed,
Anyone can look at the 'public' info for most people by clicking on their name to the left of a post. Then click on "View" to the right of computers, and then on the number under "Results". After this, it's a matter of 'pick a model' under "Result ID", and look at the info. For some people, there are dozens of models, so it can take a while. The only time this can't be done is when people choose to hide their computers.
Not much point in looking at mine, as I've been running climate models on other sites for a couple of years, and won't be back here until I can get some more computers to spread around.
ID: 29732

Iain Inglis Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317	Message 29733 - Posted: 26 Jul 2007, 23:21:51 UTC
Ed,
It looks to me like you're doing everything right. I think a clue as to what's happening is in the task name (i.e. hadcm3transum_5.42.windows.intelx86.exe): version 5.42 contains some code changes that make it much slower than other versions. I suspect that you won't be able to speed it up; nor will comparisons with other participants running other versions help very much.
So far as I'm aware, the advice is to stick with the 5.42 models, even though they do take rather longer.
Iain
ID: 29733

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29734 - Posted: 26 Jul 2007, 23:31:45 UTC
Thanks, all. This is great information! Therefore, I'll relax and go with the conclusion that everything is OK.
Ed
ID: 29734

MikeMarsUK Volunteer moderator Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 29753 - Posted: 28 Jul 2007, 19:32:54 UTC
Looking at the current versions of stuff on the site, 5.44 is available, but as Iain says, it's probably best to concentrate on your existing model.
I'm a volunteer and my views are my own.
ID: 29753

Ant B Joined: 29 Mar 06 Posts: 8 Credit: 2,793,692 RAC: 0	Message 29754 - Posted: 28 Jul 2007, 22:04:12 UTC - in response to Message 29727.
> I think that there is something mucking up the efficient running of my model on my machine. To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences.
> So, if the "2.5 CPU seconds per timestep" benchmark is reasonable, can anyone see where I may be losing efficiency to the tune of nearly 50%? Thanks, Ed
Ed - Your PC seems just fine. Good specs, benchmarks are all OK, etc. I think it's simply that you are only allowing CPDN 50% of the CPU resources, so your sec/TS is about double what it would be otherwise. If the other 50% is doing nothing, you may as well let CPDN use it. The programme is so efficient at running on low priority that you won't notice it at all in the background.
Cheers,
Anthony
ID: 29754

geophi Volunteer moderator Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275	Message 29755 - Posted: 28 Jul 2007, 23:35:49 UTC Last modified: 28 Jul 2007, 23:40:02 UTC
Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one.
AntB - The reason his sec/TS is so high is that he is running a 5.42 model, which had serious speed problems. He has a Pentium D, which is dual core. That is why the task manager shows 50%, which is what it should show if he is only running one model.
ID: 29755
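The "about two 5.44 models in the same time" estimate above follows from the ratio of per-timestep times. A minimal Python sketch, assuming both versions run the same number of timesteps per model; the 2.3 sec/TS value used for 5.44 is a hypothetical figure chosen for illustration (half of the 4.6 reported for 5.42), not a measurement from this thread, and the function name is invented here.

```python
def models_in_same_time(slow_sec_per_ts: float, fast_sec_per_ts: float) -> float:
    """How many 'fast' models fit in the time one 'slow' model takes.

    Assumes both models have the same total number of timesteps,
    so total run time scales directly with seconds per timestep.
    """
    return slow_sec_per_ts / fast_sec_per_ts


# 4.6 sec/TS is the reported 5.42 rate; 2.3 sec/TS is an assumed 5.44 rate.
ratio = models_in_same_time(4.6, 2.3)
print(f"5.44 models completed per 5.42 model: {ratio:.1f}")
```

The real ratio depends on the sec/TS actually observed for a 5.44 model on the same machine.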

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29760 - Posted: 29 Jul 2007, 19:45:49 UTC - in response to Message 29755.
> Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one.
> AntB - The reason his sec/TS is so high is because he is running a 5.42 model, which had serious speed problems. He has a Pentium D which is dual core. That is why the task manager shows 50%, which is what it should show if he is only running one model.
AntB - You are correct - I have a dual core, and one core (50%) is allocated to the CPDN model, and the other core (50%) is allocated to SETI and World Grid.
1) If I were to abort this CPDN model, what would be the impact on the overall project (i.e., would it be a complete waste of effort and resources)?
2) Would I receive a faster 5.44 model in its place, or another slow 5.42 model?
Thanks, Ed
ID: 29760

MikeMarsUK Volunteer moderator Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 29762 - Posted: 29 Jul 2007, 21:00:35 UTC
You'd receive either a 5.44 coupled model or a 5.06 slab model. If you want to receive a specific model, set it up in your preferences first (i.e., Your Account, CPDN preferences, Edit, tick HadCM3 or HadSM3, save).
No problem from the project's viewpoint, since the model's progress is uploaded as it goes. However, it's best to abort after a decade or 40-year upload if you can (more information gets uploaded during those years).
I'm a volunteer and my views are my own.
ID: 29762

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29764 - Posted: 29 Jul 2007, 23:48:50 UTC - in response to Message 29762.
> You'd receive either a 5.44 coupled model, or a 5.06 slab model. If you want to receive a specific model, set it up in your preferences first (i.e., Your Account, CPDN preferences, Edit, tick HadCM3 or HadSM3, save). No problem from the project's viewpoint since the model's progress is uploaded as it goes. However it's best to abort after a decade or 40-year upload if you can (more information gets uploaded during those years).
One additional question: when you abort a model, does some automated process clean up the work files/directories, or do I need to do that manually? I've got about 204 MB of disk space being used by CPDN right now.
Thanks, Ed
ID: 29764

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 29765 - Posted: 30 Jul 2007, 0:21:43 UTC
It's manual. Automatic cleanup is only after fully completing a model.
ID: 29765

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29766 - Posted: 30 Jul 2007, 1:06:15 UTC - in response to Message 29765.
> It's manual. Automatic cleanup is only after fully completing a model.
Great information. Thanks, all.
Ed
ID: 29766

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29778 - Posted: 31 Jul 2007, 3:12:55 UTC
I waited until my model reached the decade boundary and waited until the trickle message and zip file completed uploading. I then backed up the BOINC directory and aborted my very slow 5.44 model.
To clean up the mess left behind, it occurred to me that detaching from the project might do the job. Detaching DID indeed delete all the CPDN directories and work files. Note that you would use this approach only if you were running ONE model, not multiples; it would kill your other model(s).
I reattached and received not one, but TWO 5.44 models. It appears that both of them are running at least 2x faster than the aborted 5.42 model. Rather than abort the extra one, I'll make room and run both, since I've effectively doubled my throughput.
It would be nice if there was a straightforward way to control how many models you receive. I think what happened is that my other two projects were suspended at the time of attachment to CPDN, and CPDN saw that as a go-ahead to send me two models.
Ed
ID: 29778

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29779 - Posted: 31 Jul 2007, 3:14:19 UTC
Correction: I meant to say, "...aborted my very slow 5.42 model."
ID: 29779

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 29780 - Posted: 31 Jul 2007, 5:06:57 UTC
> ... and CPDN saw that as a go-ahead to send me two models.
No. BOINC (the housekeeper) saw that both of your processors were idle, and asked for some work for both.
To get only one model, you need to change the number of processors in General Preferences on your account page. Except that it's now too late to do that, as BOINC will then only use one of your processors, and alternate the 2 models on the one processor.
The first thing to do now is to set the project to "No new tasks" in the Projects tab, to stop any more climate models being downloaded if one fails while you're not watching.
ID: 29780

old_user452941 Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0	Message 29781 - Posted: 31 Jul 2007, 5:19:30 UTC - in response to Message 29780.
> ... and CPDN saw that as a go-ahead to send me two models.
> No. BOINC, (the housekeeper), saw that both of your processors were idle, and asked for some work for both. To only get one model, you need to change the number of processors in General Preferences on your account page. Except that it's now too late to do that, as BOINC will then only use one of your processors, and alternate the 2 models on the one processor. First thing to do now, is to set the project to "No new tasks" in the Projects tab, to stop any more climate models being downloaded if one fails while you're not watching.
Les,
Thanks for clarifying that it was BOINC that did it. I normally keep CPDN set to "No new tasks", since I accidentally got a new model a while back when I momentarily suspended SETI.
Ed
ID: 29781