[eigen] Re: Benchmark for blocking sizes - take 2


- *To*: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- *Subject*: [eigen] Re: Benchmark for blocking sizes - take 2
- *From*: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- *Date*: Mon, 23 Feb 2015 14:08:04 -0500

2015-02-20 17:34 GMT-05:00 Benoit Jacob <jacob.benoit.1@xxxxxxxxxx>:

> What's next:
>
> The new data format doesn't contain analysis, just raw timings. Once we have data from several different machines, we can start to generate tables of blocking parameters that are "least bad" across multiple machines. Of course, users who want absolutely optimal performance on their own machine will have to generate a table for it, but Eigen should default to parameters that minimize the loss of efficiency across all machines. So the next step is to generate such a table of parameters by aggregating data from multiple logs.
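To make the "least bad" idea concrete, here is a minimal sketch of one way to pick a shared default. It is not taken from the Eigen sources: the `Timings` layout, the function names, and the definition of efficiency (fastest time measured on a machine divided by the time obtained with a given candidate) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical layout: timings[m][c] is the measured time (in seconds) on
// machine m when using candidate blocking parameters number c.
using Timings = std::vector<std::vector<double>>;

// Assumed definition: efficiency of candidate c on one machine is the fastest
// time seen on that machine divided by the time obtained with c, so it lies
// in (0, 1] and equals 1 for that machine's own best candidate.
double efficiency(const std::vector<double>& machine, std::size_t c) {
  const double fastest = *std::min_element(machine.begin(), machine.end());
  return fastest / machine[c];
}

// Returns the index of the candidate whose worst efficiency over all machines
// is highest. Precondition: timings is non-empty and every row has the same
// non-zero number of candidates.
std::size_t least_bad_candidate(const Timings& timings) {
  std::size_t best_candidate = 0;
  double best_worst = -1.0;
  const std::size_t num_candidates = timings.front().size();
  for (std::size_t c = 0; c < num_candidates; ++c) {
    double worst = 1.0;
    for (const auto& machine : timings) {
      worst = std::min(worst, efficiency(machine, c));
    }
    if (worst > best_worst) {
      best_worst = worst;
      best_candidate = c;
    }
  }
  return best_candidate;
}
```

With two machines that each prefer a different candidate, a third candidate that is moderately good on both can win under this max-min criterion, which is the sense of "least bad" assumed here.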

An analysis program is now in bench/analyze-blocking-sizes.cpp. It takes a list of logs and tries to find good partitionings.

Example on current log-blocking-sizes/v2 data:

    Partition into 4 subsets for 91.2% efficiency
      Subset 0, efficiency 91.2%:
        tmp2/sandy-bridge-xeon-e5-1650.txt
        tmp2/sandy-bridge-xeon-e5-1650-warm-caches.txt
        tmp2/mac-haswell-fma-i7-4770hq-warm-caches.txt
      Subset 1, efficiency 100%:
        tmp2/amd-opteron-6376-trevor-irons.txt
      Subset 2, efficiency 100%:
        tmp2/mac-haswell-fma-i7-4770hq.txt
      Subset 3, efficiency 100%:
        tmp2/nexus5.txt

(Obviously, when a subset contains only 1 log, efficiency is 100% by definition).
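The following sketch illustrates why a single-log subset scores 100%, and what a brute-force search over partitionings could look like. It is an assumption-laden illustration, not the logic of bench/analyze-blocking-sizes.cpp: the `Timings` layout, both function names, and the efficiency definition (fastest time in a log divided by the time with the shared candidate) are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical layout: timings[log][c] is the time recorded in one log for
// candidate blocking parameters number c.
using Timings = std::vector<std::vector<double>>;

// Assumed definition: the efficiency of a subset is the best worst-case
// efficiency achievable when every log in the subset must share a single
// candidate. For a one-log subset the log's own fastest candidate gives 1.0.
double subset_efficiency(const Timings& timings,
                         const std::vector<std::size_t>& subset) {
  double best = 0.0;
  const std::size_t num_candidates = timings.front().size();
  for (std::size_t c = 0; c < num_candidates; ++c) {
    double worst = 1.0;
    for (std::size_t log : subset) {
      const auto& row = timings[log];
      const double fastest = *std::min_element(row.begin(), row.end());
      worst = std::min(worst, fastest / row[c]);
    }
    best = std::max(best, worst);
  }
  return best;
}

// Tries every assignment of logs to at most k subsets and returns the highest
// achievable minimum subset efficiency. The cost grows as k^n, so this is
// only usable for a handful of logs. Precondition: k >= 1 and timings is
// non-empty with rows of equal, non-zero length.
double best_partition_efficiency(const Timings& timings, std::size_t k) {
  const std::size_t n = timings.size();
  std::vector<std::size_t> label(n, 0);  // label[i] = subset index of log i
  double best = 0.0;
  while (true) {
    double worst = 1.0;
    for (std::size_t s = 0; s < k; ++s) {
      std::vector<std::size_t> subset;
      for (std::size_t i = 0; i < n; ++i) {
        if (label[i] == s) subset.push_back(i);
      }
      if (!subset.empty()) {
        worst = std::min(worst, subset_efficiency(timings, subset));
      }
    }
    best = std::max(best, worst);
    // Advance the labels like a base-k counter; stop once it wraps around.
    std::size_t i = 0;
    while (i < n && ++label[i] == k) {
      label[i] = 0;
      ++i;
    }
    if (i == n) break;
  }
  return best;
}
```

Under these assumptions, allowing as many subsets as there are logs always yields an efficiency of 1.0, so the interesting question is how few subsets are needed to stay above a chosen threshold.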

**References**: **[eigen] Benchmark for blocking sizes - take 2** (*From:* Benoit Jacob)
