[eigen] Re: Benchmark for blocking sizes - take 2

2015-02-20 17:34 GMT-05:00 Benoit Jacob <jacob.benoit.1@xxxxxxxxxx>:
What's next:
The new data format doesn't contain analysis, just raw timings. Once we have data from several different machines, we can start to generate tables of blocking parameters that are "least bad" across multiple machines. Of course, users who want absolutely optimal perf on their machine will have to generate a table for it, but Eigen should default to parameters that maximize the efficiency across all machines. So the next step is to generate such a table of parameters by aggregating data from multiple logs.

An analysis program is now in bench/analyze-blocking-sizes.cpp. It takes a list of logs and tries to find good partitionings.

Example on current log-blocking-sizes/v2 data:

Partition into 4 subsets for 91.2% efficiency
  Subset 0, efficiency 91.2%:
    tmp2/sandy-bridge-xeon-e5-1650.txt
    tmp2/sandy-bridge-xeon-e5-1650-warm-caches.txt
    tmp2/mac-haswell-fma-i7-4770hq-warm-caches.txt
  Subset 1, efficiency 100%:
    tmp2/amd-opteron-6376-trevor-irons.txt
  Subset 2, efficiency 100%:
    tmp2/mac-haswell-fma-i7-4770hq.txt
  Subset 3, efficiency 100%:
    tmp2/nexus5.txt

(Obviously, when a subset contains only one log, its efficiency is 100% by definition.)
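
To make the "least bad" idea concrete, here is a minimal sketch of one way to score a subset of logs. It is not the code in bench/analyze-blocking-sizes.cpp: the `Timings` layout and the function names are hypothetical, it looks at a single product size only, and it assumes that the efficiency of a candidate on one machine is the best time on that machine divided by the candidate's time. Under those assumptions, the efficiency of a subset is that of the one candidate whose worst case over the subset is highest, and the efficiency of a partition is that of its worst subset, which matches the way the 91.2% figure is reported above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: timings[c] is the measured time (lower is better) of
// candidate blocking-size choice c on one machine, for one fixed product size.
// All times are assumed to be strictly positive.
using Timings = std::vector<double>;

// Efficiency of each candidate on one machine: best time / candidate time,
// so the fastest candidate scores exactly 1.0.
std::vector<double> efficiencies(const Timings& t) {
  assert(!t.empty());
  const double best = *std::min_element(t.begin(), t.end());
  std::vector<double> e;
  e.reserve(t.size());
  for (double x : t) e.push_back(best / x);
  return e;
}

// Efficiency of a subset of machines (assumed definition): choose the single
// candidate whose worst-case efficiency over the subset is highest.
double subset_efficiency(const std::vector<Timings>& machines) {
  assert(!machines.empty());
  const std::size_t n = machines.front().size();
  std::vector<double> worst(n, 1.0);
  for (const Timings& m : machines) {
    assert(m.size() == n);  // every machine must time the same candidates
    const std::vector<double> e = efficiencies(m);
    for (std::size_t c = 0; c < n; ++c) worst[c] = std::min(worst[c], e[c]);
  }
  return *std::max_element(worst.begin(), worst.end());
}

// Efficiency of a partition (assumed definition): that of its worst subset.
double partition_efficiency(const std::vector<std::vector<Timings>>& partition) {
  assert(!partition.empty());
  double result = 1.0;
  for (const std::vector<Timings>& subset : partition)
    result = std::min(result, subset_efficiency(subset));
  return result;
}
```

With a single machine in the subset, the fastest candidate on that machine is always chosen, so `subset_efficiency` returns 1.0, which is the "100% by definition" case. With two machines whose timings are {1, 2, 4} and {4, 2, 1}, the middle candidate is the least bad choice and the subset efficiency is 0.5.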
Benoit


