climateprediction.net (CPDN) home page
Thread 'Where is my bottleneck?'


Message boards : Number crunching : Where is my bottleneck?
old_user452941

Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29727 - Posted: 26 Jul 2007, 21:08:26 UTC

I think that there is something mucking up the efficient running of my model on my machine.

To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences.

I have scoured the message boards, the tips and guidelines, and everything else, and I can't account for the discrepancy.

1) I'm NOT running the screensaver except when I want to check on the progress of the model; then I close the screensaver.
2) The model has 50% of my machine's CPU resources, and I have a 2.8 GHz dual-core, so it generally gets 100% of one of the cores, 24/7 (occasionally BOINC decides to run SETI or World Grid for an hour on CPDN's core for some reason, but not often).
3) I have 2 GB of RAM and a 2 GB swapfile, and the swapfile is NOT fragmented.
4) I have a Seagate 160 GB "Barracuda" hard drive.
5) The "hadcm3trans_5.42.windows.intelx86.exe" and "hadcm3transum_5.42.windows.intelx86.exe" tasks are set to NORMAL priority.
6) I suspend my projects, exit BOINC, back up the BOINC directory, defrag the hard drive, and restart my machine once per day.
7) CPU temperatures are normal, maybe on the cool side: currently 116 °F with a room temperature of 67 °F.
8) The relevant BOINC Settings:
a) Do work while computer is in use?: YES
b) Leave applications in memory while suspended? : YES
c) On multiprocessors, use at most: 2
d) Use at most: 100%
9) Most importantly, Windows Task Manager shows the "hadcm3transum_5.42.windows.intelx86.exe" task at a solid 50% nearly 100% of the time.

So, if the "2.5 CPU seconds per timestep" benchmark is reasonable, can anyone see where I may be losing efficiency to the tune of nearly 50%?

Thanks,
Ed
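For reference, the size of the gap described above can be worked out directly. This is a minimal Python sketch that uses only the two figures quoted in the post (4.6 and 2.5 sec/TS); the function name is made up for illustration, and whether 2.5 sec/TS is a fair yardstick for any particular machine is a separate question.

```python
def slowdown(observed_sec_per_ts, reference_sec_per_ts):
    """Return (slowdown factor, fraction of the reference rate that is lost)."""
    factor = observed_sec_per_ts / reference_sec_per_ts
    lost = 1.0 - reference_sec_per_ts / observed_sec_per_ts
    return factor, lost


factor, lost = slowdown(4.6, 2.5)
print(f"{factor:.2f}x slower, {lost:.0%} below the reference rate")
```

With these inputs the shortfall comes out at roughly 46%, which matches the "nearly 50%" in the question.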
ID: 29727

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 29728 - Posted: 26 Jul 2007, 21:18:43 UTC - in response to Message 29727.  
Last modified: 26 Jul 2007, 21:22:26 UTC

I think that there is something mucking up the efficient running of my model on my machine.

To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences.

Ed,

Estimated WU Completion Time @ 2.5 Seconds/Timestep computational average

That is an informational statement meant as relative guidance to help inexperienced participants make a selection among Model types. It has nothing to do with your machine or its performance. Believe your Trickle times (which are averaged for the entire period of the Run).

[Edited for typo.]
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 29728

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29729 - Posted: 26 Jul 2007, 21:39:17 UTC

Hi AstroWX,

Thanks. I have no reason to believe that my trickle times are inaccurate, but I am now wondering what other users show for their "Avg sec/TS". Maybe this will shed some light.

Ed
ID: 29729

Lockleys
Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 29730 - Posted: 26 Jul 2007, 21:52:10 UTC

I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively.
ID: 29730

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29731 - Posted: 26 Jul 2007, 22:11:46 UTC - in response to Message 29730.  

I have a 2 gig dual processor system running at 2.13 GHz with a pair of CPDN models running at 1.94 and 1.88 s/TS respectively.


Hi Lockleys,

That is very good information for me - it looks like there is definitely an issue here. A few questions:

1) How far into your model are you percentage-wise, and
2) If you look at the list of your trickle results for this model, do you see that your s/TS values are staying roughly the same or are they changing (increasing/decreasing) as your model progresses?

Thanks very much,
Ed
ID: 29731

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 29732 - Posted: 26 Jul 2007, 23:10:34 UTC
Last modified: 26 Jul 2007, 23:11:14 UTC

Ed

Anyone can look at the 'public' info for most people by clicking on their name to the left of a post.
Then click on "View" to the right of computers, and then on the number under "Results". After this, it's a matter of 'pick a model' under "Result ID", and look at the info. For some people, there are dozens of models, so it can take a while.
The only time this can't be done is when people choose to hide their computers.

Not much point in looking at mine, as I've been running climate models on other sites for a couple of years, and won't be back here until I can get some more computers to spread around.

ID: 29732

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 29733 - Posted: 26 Jul 2007, 23:21:51 UTC

Ed,

It looks to me like you're doing everything right. I think a clue as to what's happening is in the task name (i.e. hadcm3transum_5.42.windows.intelx86.exe): version 5.42 contains some code changes that make it much slower than other versions. I suspect that you won't be able to speed it up; nor will comparisons with other participants running other versions help very much.

So far as I'm aware, the advice is to stick with the 5.42 models, even though they do take rather longer.

Iain
ID: 29733

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29734 - Posted: 26 Jul 2007, 23:31:45 UTC

Thanks, all.

This is great information! Therefore, I'll relax and go with the conclusion that everything is OK.

Ed
ID: 29734

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 29753 - Posted: 28 Jul 2007, 19:32:54 UTC


Looking at the current versions of stuff on the site, 5.44 is available, but as Iain says, it's probably best to concentrate on your existing model.

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 29753

Ant B
Joined: 29 Mar 06
Posts: 8
Credit: 2,793,692
RAC: 0
Message 29754 - Posted: 28 Jul 2007, 22:04:12 UTC - in response to Message 29727.  

I think that there is something mucking up the efficient running of my model on my machine.

To start with, my trickle results show a consistent "Avg sec/TS" of 4.6. However, I see that "2.5 Seconds/Timestep computational average" is shown as a benchmark in the "Application Preferences" section of the CPDN preferences.

So, if the "2.5 CPU seconds per timestep" benchmark is reasonable, can anyone see where I may be losing efficiency to the tune of nearly 50%?

Thanks,
Ed


Ed - Your PC seems just fine. Good specs, benchmarks are all OK etc. I think it's simply that you are only allowing CPDN 50% of the CPU resources, so your sec/TS is about double what it would be otherwise. If the other 50% is doing nothing, you may as well let CPDN use it. The programme is so efficient at running on low priority you won't notice it at all in the background.

Cheers

Anthony
ID: 29754

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2186
Credit: 64,822,615
RAC: 5,275
Message 29755 - Posted: 28 Jul 2007, 23:35:49 UTC
Last modified: 28 Jul 2007, 23:40:02 UTC

Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one.

AntB - The reason his sec/TS is so high is because he is running a 5.42 model, which had serious speed problems. He has a Pentium D which is dual core. That is why the task manager shows 50%, which is what it should show if he is only running one model.
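The Task Manager point can be put as a one-line conversion. This is a small Python sketch, assuming Task Manager reports each process as a percentage of the whole machine (all cores together); the function name is illustrative only.

```python
def cores_in_use(task_manager_percent, n_cores):
    """Convert a whole-machine CPU percentage into an equivalent number of busy cores."""
    return task_manager_percent / 100.0 * n_cores


# A single-threaded model pinned at 50% on a dual-core machine
print(cores_in_use(50, 2))
```

On a dual-core machine, 50% therefore corresponds to one core running flat out, which is the most a single model can use.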
ID: 29755

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29760 - Posted: 29 Jul 2007, 19:45:49 UTC - in response to Message 29755.  

Well, I'll be the odd man out. Being only 8 years/trickles in on your model, I would abort it. You'll be able to finish about two 5.44 models in the same time it would take you to finish your current one.

AntB - The reason his sec/TS is so high is because he is running a 5.42 model, which had serious speed problems. He has a Pentium D which is dual core. That is why the task manager shows 50%, which is what it should show if he is only running one model.


AntB - You are correct - I have a dual core, and one core (50%) is allocated to the CPDN model, and the other core (50%) is allocated to SETI and World Grid.

1) If I were to abort this CPDN model, what would be the impact on the overall project (i.e., would it be a complete waste of effort and resources), and 2) would I receive a faster 5.44 model in its place or another slow 5.42 model?

Thanks,
Ed
ID: 29760

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 29762 - Posted: 29 Jul 2007, 21:00:35 UTC


You'd receive either a 5.44 coupled model, or a 5.06 slab model. If you want to receive a specific model, set it up in your preferences first (i.e., Your Account, CPDN preferences, Edit, tick HadCM3 or HadSM3, save).

No problem from the project's viewpoint, since the model's progress is uploaded as it goes. However, it's best to abort after a decade or 40-year upload if you can (more information gets uploaded during those years).
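To illustrate the decade-boundary advice, here is a tiny Python sketch. It assumes one trickle per model year (as the "8 years/trickles" wording earlier in the thread suggests); the helper name is hypothetical.

```python
def years_until_next_decade(model_years_done):
    """Model years left before the next 10-year boundary (0 if already on one)."""
    return (-model_years_done) % 10


# A model that is 8 years in has 2 more model years to run before a decade upload
print(years_until_next_decade(8))
```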
I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 29762

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29764 - Posted: 29 Jul 2007, 23:48:50 UTC - in response to Message 29762.  


You'd receive either a 5.44 coupled model, or a 5.06 slab model. If you want to receive a specific model, set it up in your preferences first (i.e., Your Account, CPDN preferences, Edit, tick HadCM3 or HadSM3, save).

No problem from the project's viewpoint, since the model's progress is uploaded as it goes. However, it's best to abort after a decade or 40-year upload if you can (more information gets uploaded during those years).


One additional question - when you abort a model, does some automated process clean up the work files/directories, or do I need to do that manually? I've got about 204 MB of disk space being used by CPDN right now.

Thanks,
Ed
ID: 29764

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 29765 - Posted: 30 Jul 2007, 0:21:43 UTC


It's manual.
Automatic cleanup is only after fully completing a model.


Backups: Here
ID: 29765

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29766 - Posted: 30 Jul 2007, 1:06:15 UTC - in response to Message 29765.  


It's manual.
Automatic cleanup is only after fully completing a model.



Great information. Thanks, all.

Ed
ID: 29766

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29778 - Posted: 31 Jul 2007, 3:12:55 UTC

I waited until my model reached the decade boundary and waited until the trickle message and zip file completed uploading. I then backed up the BOINC directory and aborted my very slow 5.44 model.

To clean up the mess left behind, it occurred to me that detaching from the project might do the job. Detaching DID indeed delete all the CPDN directories and work files. Note that you would use this approach only if you were running ONE model, not multiples - it would kill your other model(s).

I reattached and received not one, but TWO 5.44 models. It appears that both of them are running at least 2x faster than the aborted 5.42 model. Rather than abort the extra one, I'll make room and run both, since I've effectively doubled my throughput.
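A rough way to see the throughput change, sketched in Python. The 4.6 sec/TS figure is from earlier in the thread; 2.3 sec/TS is only a stand-in for "at least 2x faster", not a measured value, and the sketch assumes each model has a core to itself around the clock.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def timesteps_per_day(sec_per_ts, n_models=1):
    """Timesteps completed per day across n_models, each on its own core."""
    return n_models * SECONDS_PER_DAY / sec_per_ts


old = timesteps_per_day(4.6)                   # one 5.42 model
one_new = timesteps_per_day(2.3)               # one 5.44 model (assumed rate)
two_new = timesteps_per_day(2.3, n_models=2)   # two 5.44 models, one per core

print(f"one 5.44 model:  {one_new / old:.1f}x the old rate")
print(f"two 5.44 models: {two_new / old:.1f}x the old rate")
```

Under these assumptions, running the second model doubles the timestep rate again on top of the per-model speedup, at the cost of the core that previously ran the other projects.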

It would be nice if there were a straightforward way to control how many models you receive. I think what happened is that my other two projects were suspended at the time of attachment to CPDN, and CPDN saw that as a go-ahead to send me two models.

Ed
ID: 29778

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29779 - Posted: 31 Jul 2007, 3:14:19 UTC

Correction, I meant to say,

"...aborted my very slow 5.42 model."
ID: 29779

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 29780 - Posted: 31 Jul 2007, 5:06:57 UTC


... and CPDN saw that as a go-ahead to send me two models.

No. BOINC (the housekeeper) saw that both of your processors were idle, and asked for some work for both.

To only get one model, you need to change the number of processors in General Preferences on your account page. Except that it's now too late to do that, as BOINC will then only use one of your processors, and alternate the 2 models on the one processor.

The first thing to do now is to set the project to "No new tasks" in the Projects tab, to stop any more climate models being downloaded if one fails while you're not watching.
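For anyone who prefers the command line to the Projects tab, the same setting can usually be applied with BOINC's command-line tool. This is a hedged Python sketch that only builds the command and does not run it. It assumes the tool is installed as `boinccmd` (the name and options can differ between client versions, so check its `--help` output), and the project URL shown is a placeholder: use the one listed for the project in your own BOINC Manager.

```python
import shlex


def no_new_tasks_command(project_url, boinccmd="boinccmd"):
    """Build (but do not run) the command asking the client to stop fetching work for one project."""
    return [boinccmd, "--project", project_url, "nomorework"]


# Placeholder URL (an assumption), not taken from the thread
cmd = no_new_tasks_command("http://climateprediction.net/")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run` if you do want to execute it; models already downloaded keep running either way.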


Backups: Here
ID: 29780

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 29781 - Posted: 31 Jul 2007, 5:19:30 UTC - in response to Message 29780.  


... and CPDN saw that as a go-ahead to send me two models.

No. BOINC (the housekeeper) saw that both of your processors were idle, and asked for some work for both.

To only get one model, you need to change the number of processors in General Preferences on your account page. Except that it's now too late to do that, as BOINC will then only use one of your processors, and alternate the 2 models on the one processor.

The first thing to do now is to set the project to "No new tasks" in the Projects tab, to stop any more climate models being downloaded if one fails while you're not watching.



Les,

Thanks for clarifying that it was BOINC that did it.

I normally keep CPDN set to "No new tasks" since I accidentally got a new model a while back when I momentarily suspended SETI.

Ed
ID: 29781

©2024 cpdn.org