Thread 'Credit_per_cpu_second efficiency measure'

Author	Message
old_user677346 Joined: 16 Apr 12 Posts: 6 Credit: 19,102 RAC: 0	Message 45297 - Posted: 2 Dec 2012, 10:48:55 UTC
I'd like to calculate a rough but broad-based estimate of CPU performance and efficiency using CPDN data on the tens of thousands of active and recently active hosts. This was my plan:
1) download the stats/host.xml file
2) extract total_credit, p_vendor, p_model, os_name, os_version, n_cpus, credit_per_cpu_second, m_nbytes
3) add processor-specific data such as cache size, bus speed, TDP, etc. by matching p_model/p_vendor to some database (like Wikipedia?)
4) calculate the mean credit_per_cpu_second by processor model and speed (multiplied by n_cpus), weighted by total_credit
5) optionally control for cache size, bus speed, OS, RAM
6) use the TDP numbers to calculate a crude measure of computation-per-watt efficiency
A couple of hangups, though:
a) A lot of hosts in the host.xml file list credit_per_cpu_second as 0.000000000 even though the host does have credit registered to it. I guess I'll have to throw these out. I traced a couple and found that the tasks for the hosts have disappeared, so there is no measure of CPU seconds. Is there any reason for this disappearance that might affect the statistical validity of these calculations?
b) Is credit comparable across models? If not, I'll need a way to discern which model a host's credit is attributable to, and then I'll have to split the calculations by model.
Any thoughts are greatly appreciated.
Philip
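Steps 2, 4 and 6 of the plan, plus the zero-value filter from hangup (a), could be sketched roughly as follows. This is only an illustration under stated assumptions: the element names are the field names given in the post (the tags in the real host.xml export may be spelled differently), each host is assumed to be a `<host>` element, and the TDP lookup table is hypothetical data that would have to come from step 3.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Field names as written in the post; the real tags may differ (assumption).
RATE_TAG = "credit_per_cpu_second"
WEIGHT_TAG = "total_credit"


def weighted_rate_by_model(xml_source, rate_tag=RATE_TAG, weight_tag=WEIGHT_TAG):
    """Return {(p_vendor, p_model): credit-weighted mean of rate * n_cpus}.

    Hosts with a zero or unparsable rate, or with zero credit, are skipped.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # key -> [sum(w * value), sum(w)]
    # iterparse streams the file, so a large export need not fit in memory.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "host":
            continue
        try:
            rate = float(elem.findtext(rate_tag, default="0"))
            weight = float(elem.findtext(weight_tag, default="0"))
            n_cpus = float(elem.findtext("n_cpus", default="1"))
        except ValueError:
            rate = weight = n_cpus = 0.0
        if rate > 0 and weight > 0 and n_cpus > 0:
            key = (elem.findtext("p_vendor", default="").strip(),
                   elem.findtext("p_model", default="").strip())
            sums[key][0] += weight * rate * n_cpus
            sums[key][1] += weight
        elem.clear()  # release the processed <host> element
    return {key: total / w for key, (total, w) in sums.items()}


def per_watt(rates, tdp_watts):
    """Divide each model's rate by its TDP; models without a TDP are dropped."""
    return {k: v / tdp_watts[k] for k, v in rates.items() if tdp_watts.get(k)}
```

Called as `weighted_rate_by_model("host.xml")`, this gives one number per (vendor, model) pair. Controlling for OS, RAM or cache size (step 5) would need a finer grouping key or a regression, which this sketch does not attempt.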

Dave Jackson Volunteer moderator Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944	Message 45298 - Posted: 2 Dec 2012, 16:49:45 UTC - in response to Message 45297.
Another variable is the operating system. I seem to remember reading that tasks run more efficiently on Windows machines. I don't know how significant this is, or whether you could simply accept it and assume that the distribution of OS types is the same across all processor types.

Bonsai911 Joined: 9 Sep 04 Posts: 228 Credit: 30,756,611 RAC: 3,303	Message 45299 - Posted: 2 Dec 2012, 17:48:22 UTC
I guess the OS-dependent calculation speed also varies by model type: some models run better on Linux, others on Windows.
Greetings from Hamburg
bonsai911

astroWX Volunteer moderator Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0	Message 45300 - Posted: 2 Dec 2012, 18:10:50 UTC
... and there is the compiler issue: Intel CPUs fare better with the Intel compiler. Carl, former lead developer for CPDN, ran such a CPU comparison several years ago. Does anyone have a copy -- or recall any conclusions? (My copy was in Linux -- and I've since sent Linux to the bit-bucket.)
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 45301 - Posted: 2 Dec 2012, 21:15:20 UTC Last modified: 2 Dec 2012, 21:16:51 UTC
"... found that the tasks for the hosts have disappeared ..."
This project is different to most others, in that credit is based on the return of "trickle" data rather than given as a lump sum on completion of a model. However, soon after the conversion from the pre-BOINC system to BOINC, it was found that BOINC had difficulty with this and would on occasion also allocate credit to some work on completion. So crediting was changed to a script that ran through all work at short intervals, recalculating credit as it went. But as the number of returned results rapidly increased, this started to take up too much time, so the script was run only twice a day, and then once a day. Even this was taking up hours of server time, so it was decided to set a cutoff point, calculate the credit up to then, store these values, and archive the results elsewhere. Only the remaining results are then rescanned each day, and the stored credit values are added to the credits from the daily scans to produce a total credit. This is the reason for the missing results that you mention, and if you look at your Account page, you'll see 2 lines not far from the top that say Archived.
"Is credit comparable across models?"
Yes and no. Credit per trickle is different for each type of model, and depends on the amount of time that model type takes to complete a given interval. The Coupled Ocean models, for instance, are more FPU-intensive than the Regional models. But an attempt is made to make the "credits per amount of work" comparable across all models.
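The bookkeeping described in this post can be pictured with a small sketch. It is not the project's actual script: the function name, the model-type labels and the credit-per-trickle values are all hypothetical, chosen only to show how a stored archived value and a daily rescan of the remaining results combine into one total.

```python
def total_credit(archived_credit, live_trickles, credit_per_trickle):
    """Combine the stored archived credit with credit from the daily rescan.

    archived_credit    -- credit frozen at the cutoff point (results archived)
    live_trickles      -- {model_type: trickle count} for results still on file
    credit_per_trickle -- {model_type: credit per trickle}; hypothetical values
    """
    rescanned = sum(credit_per_trickle[model] * count
                    for model, count in live_trickles.items())
    return archived_credit + rescanned


# Illustrative numbers only: a host whose results were all archived has
# credit but no remaining tasks, hence no CPU-seconds to divide by.
print(total_credit(1000.0, {}, {"coupled": 50.0, "regional": 20.0}))
```

A host in that situation keeps its total_credit while its task list is empty, which matches the zero credit_per_cpu_second entries noted in the first post.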

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 45302 - Posted: 2 Dec 2012, 21:33:10 UTC
And another thing ... The work in this project isn't intended to run to completion at all times. Each model will only run for as long as its many variables produce a stable climate system. If the starting values are such that a model becomes unstable, the work will be terminated. This is the reason for trickles: small amounts of data get returned via them, and the researchers can tell roughly where the model crashed by where the data stops being sent back. So the credit listed for a task may correspond to a different number of trickles, i.e. a different distance through the model.