Message boards : Number crunching : Where is my bottleneck?
Author | Message |
---|---|
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
I think that there is something mucking up the efficient running of my model on my machine. To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences. I have scoured the message boards and tips and guidelines and everything else and I can't account for my variance. 1) I'm NOT running the screensaver except when I want to check on the progress of the model, then I close the screensaver. 2) The model has 50% of my machine's CPU resources, and I have a 2.8 GHz dual-core, so it generally gets 100% of one of the cores, 24/7 (occasionally BOINC decides to run SETI or World Grid for an hour on CPDN's core for some reason, but it is not often). 3) I have 2 GB of RAM and a 2 GB swapfile, and the swapfile is NOT fragmented. 4) I have a Seagate 160 GB "Barracuda" hard drive. 5) The "hadcm3trans_5.42.windows.intelx86.exe" and "hadcm3transum_5.42.windows.intelx86.exe" tasks are set to NORMAL priority. 6) I suspend my projects, exit BOINC, back up the BOINC directory, defrag the hard drive, and restart my machine once per day. 7) CPU temperatures are normal, maybe on the cool side: 116 F currently with a room temp of 67 F. 8) The relevant BOINC settings: a) Do work while computer is in use?: YES b) Leave applications in memory while suspended?: YES c) On multiprocessors, use at most: 2 d) Use at most: 100% 9) Most importantly, Windows Task Manager shows the "hadcm3transum_5.42.windows.intelx86.exe" task at a solid 50% nearly 100% of the time. So, if the "2.5 CPU seconds per timestep" benchmark is reasonable, can anyone see where I may be losing efficiency to the tune of nearly 50%? Thanks, Ed |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
I think that there is something mucking up the efficient running of my model on my machine. Ed, Estimated WU Completion Time @ 2.5 Seconds/Timestep computational average That is an informational statement meant as relative guidance to help inexperienced participants make a selection among Model types. It has nothing to do with your machine or its performance. Believe your Trickle times (which are averaged for the entire period of the Run). [Edited for typo.] "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Hi AstroWX, Thanks. I have no reason to believe that my trickle times are inaccurate, but am now wondering what other users show for their "Avg sec/TS". Maybe this will shed some light. Ed |
Send message Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively. |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively. Hi Lockleys, That is very good information for me - it looks like there is definitely an issue here. A few questions: 1) How far into your model are you percentage-wise, and 2) If you look at the list of your trickle results for this model, do you see that your s/TS values are staying roughly the same or are they changing (increasing/decreasing) as your model progresses? Thanks very much, Ed |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Ed, Anyone can look at the 'public' info for most people by clicking on their name to the left of a post. Then click on "View" to the right of computers, and then on the number under "Results". After this, it's a matter of 'pick a model' under "Result ID", and look at the info. For some people, there are dozens of models, so it can take a while. The only time this can't be done is when people choose to hide their computers. Not much point in looking at mine, as I've been running climate models on other sites for a couple of years, and won't be back here until I can get some more computers to spread around. |
Send message Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
Ed, It looks to me like you're doing everything right. I think a clue as to what's happening is in the task name (i.e. hadcm3transum_5.42.windows.intelx86.exe): version 5.42 contains some code changes that make it much slower than other versions. I suspect that you won't be able to speed it up; nor will comparisons with other participants running other versions help very much. So far as I'm aware, the advice is to stick with the 5.42 models, even though they do take rather longer. Iain |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Thanks, all. This is great information! Therefore, I'll relax and go with the conclusion that everything is ok. Ed |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Looking at the current versions of stuff on the site, 5.44 is available, but as Iain says, it's probably best to concentrate on your existing model. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 29 Mar 06 Posts: 8 Credit: 2,793,692 RAC: 0 |
I think that there is something mucking up the efficient running of my model on my machine. Ed - Your PC seems just fine. Good specs, benchmarks are all OK etc. I think it's simply that you are only allowing CPDN 50% of the CPU resources, so your s/TS is about double what it would be otherwise. If the other 50% is doing nothing, you may as well let CPDN use it. The programme is so efficient at running on low priority you won't notice it at all in the background. Cheers Anthony |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one. AntB - The reason his sec/TS is so high is that he is running a 5.42 model, which had serious speed problems. He has a Pentium D, which is dual core. That is why the task manager shows 50%, which is what it should show if he is only running one model. |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one. AntB - You are correct - I have a dual core, and one core (50%) is allocated to the CPDN model, and the other core (50%) is allocated to SETI and World Grid. 1) If I were to abort this CPDN model, what would be the impact on the overall project (i.e., would it be a complete waste of effort and resources), and 2) would I receive a faster 5.44 model in its place or another slow 5.42 model? Thanks, Ed |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
You'd receive either a 5.44 coupled model, or a 5.06 slab model. If you want to receive a specific model, set it up in your preferences first (i.e., Your Account, CPDN preferences, Edit, tick HadCM3 or HadSM3, save). No problem from the project's viewpoint since the model's progress is uploaded as it goes. However, it's best to abort after a decade or 40-year upload if you can (more information gets uploaded during those years). I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
One additional question - when you abort a model, does some automated process clean up the work files/directories or do I need to do that manually? I've got about 204 MB of disk space being used by CPDN right now. Thanks, Ed |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
|
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Great information. Thanks, all. Ed |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
I waited until my model reached the decade boundary and waited until the trickle message and zip file completed uploading. I then backed up the BOINC directory and aborted my very slow 5.44 model. To clean up the mess left behind, it occurred to me that detaching from the project might do the job. Detaching DID indeed delete all the CPDN directories and work files. Note that you would use this approach only if you were running ONE model, not multiples - it would kill your other model(s). I reattached and received not one, but TWO 5.44 models. It appears that both of them are running at least 2x faster than the aborted 5.42 model. Rather than abort the extra one, I'll make room and run both, since I've effectively doubled my throughput. It would be nice if there was a straightforward way to control how many models you receive. I think what happened is that my other two projects were suspended at the time of attachment to CPDN, and CPDN saw that as a go-ahead to send me two models. Ed |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Correction, I meant to say, "...aborted my very slow 5.42 model." |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
... and CPDN saw that as a go-ahead to send me two models. No. BOINC (the housekeeper) saw that both of your processors were idle, and asked for some work for both. To only get one model, you need to change the number of processors in General Preferences on your account page. Except that it's now too late to do that, as BOINC will then only use one of your processors, and alternate the 2 models on the one processor. The first thing to do now is to set the project to "No new tasks" in the Projects tab, to stop any more climate models being downloaded if one fails while you're not watching. Backups: Here |
Send message Joined: 22 May 07 Posts: 35 Credit: 1,065,741 RAC: 0 |
Les, Thanks for clarifying that it was BOINC that did it. I normally keep CPDN set to "No new tasks" since I accidentally got a new model a while back when I momentarily suspended SETI. Ed |
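
The sec/TS figures quoted in this thread can be compared with a little arithmetic. What follows is a minimal Python sketch, not something posted by any participant: the function names are hypothetical, and the only inputs are numbers that appear in the posts above (4.6 s/TS for Ed's 5.42 model, the 2.5 s/TS guideline from the preferences page, and 1.94 and 1.88 s/TS for Lockleys' two models). The runs differ in model version and hardware, so the ratios are only a rough guide.

```python
def relative_throughput(reference_sec_per_ts: float, observed_sec_per_ts: float) -> float:
    """Return observed speed as a fraction of the reference speed.

    Lower seconds per timestep means faster, so the ratio is
    reference / observed. A value of 1.0 means equal speed.
    """
    if reference_sec_per_ts <= 0 or observed_sec_per_ts <= 0:
        raise ValueError("seconds per timestep must be positive")
    return reference_sec_per_ts / observed_sec_per_ts


def shortfall_percent(reference_sec_per_ts: float, observed_sec_per_ts: float) -> float:
    """Return how far the observed speed falls below the reference, in percent."""
    return (1.0 - relative_throughput(reference_sec_per_ts, observed_sec_per_ts)) * 100.0


def slowdown_factor(slow_sec_per_ts: float, fast_sec_per_ts: float) -> float:
    """Return how many times longer the slow run takes per timestep."""
    if slow_sec_per_ts <= 0 or fast_sec_per_ts <= 0:
        raise ValueError("seconds per timestep must be positive")
    return slow_sec_per_ts / fast_sec_per_ts


if __name__ == "__main__":
    ed_542 = 4.6               # Ed's trickle average on the 5.42 model
    guideline = 2.5            # informational figure from the preferences page
    lockleys = (1.94, 1.88)    # Lockleys' two models (version not stated in the thread)

    print(f"Shortfall against the guideline: {shortfall_percent(guideline, ed_542):.1f}%")
    for value in lockleys:
        print(f"Ed's run vs {value} s/TS: {slowdown_factor(ed_542, value):.2f}x slower")
```

Against the 2.5 s/TS guideline the shortfall comes out at roughly 46%, which matches Ed's "nearly 50%" estimate, and the gap to Lockleys' figures is a little over 2x, in line with the later reports that the 5.44 models ran at least twice as fast as the 5.42 one.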