|Re: [eigen] Comparing notes on work by Igleberger et al. 2012|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Comparing notes on work by Igleberger et al. 2012
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Tue, 28 Aug 2012 16:47:02 +0200
Another funny fact for a so-called "smart expression template
library": they have 24 copy-pasted implementations of dense
matrix-vector products:
  3 for =, +=, and -=
* 2 for row/column major
* 2 for a scaled or unscaled product (e.g., 2*A*v vs A*v)
* 2 for vectorized and non-vectorized versions
As I said, none of the vectorized versions is able to handle matrices
that are not perfectly aligned and padded. Eigen has only 2 variants,
one for row major and one for column major.
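To illustrate how the =, +=, -= and scaled/unscaled variants counted above can share one kernel instead of being copy-pasted, here is a minimal sketch. It is not Eigen's or Blaze's actual code: the policy structs `Assign`, `AddAssign`, `SubAssign` and the function `gemv_row_major` are hypothetical names, and the matrix is assumed to be packed row-major storage in a `std::vector<double>`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical assignment policies: how a computed value is merged
// into the destination entry.
struct Assign    { static void apply(double& dst, double v) { dst  = v; } };
struct AddAssign { static void apply(double& dst, double v) { dst += v; } };
struct SubAssign { static void apply(double& dst, double v) { dst -= v; } };

// Computes y (op)= alpha * A * x for a packed row-major rows x cols matrix.
// The unscaled product is simply alpha == 1.0.
template <typename Op>
void gemv_row_major(std::vector<double>& y, double alpha,
                    const std::vector<double>& A,
                    std::size_t rows, std::size_t cols,
                    const std::vector<double>& x) {
    assert(A.size() == rows * cols && x.size() == cols && y.size() == rows);
    for (std::size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < cols; ++j) {
            acc += A[i * cols + j] * x[j];
        }
        Op::apply(y[i], alpha * acc);
    }
}
```

With this shape, `gemv_row_major<AddAssign>(y, 2.0, A, rows, cols, x)` covers the "y += 2*A*x" case, so only the storage order would still need a separate kernel.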
On Tue, Aug 28, 2012 at 4:39 PM, Gael Guennebaud
> On Tue, Aug 28, 2012 at 4:24 PM, Rhys Ulerich <rhys.ulerich@xxxxxxxxx> wrote:
>> Hi all,
>> I noticed Blaze on the NA Digest list [1], read
> I noticed it too.
>> the associated paper [2], and wondered if I could compare my takeaway
>> knowledge with anyone else familiar with both Blaze and Eigen:
>> 1) At the time of writing, Blaze shows faster dgemm than Eigen3
>> because they simply defer to the MKL. This is moot as Eigen 3.1
>> allows use of the MKL as well.
> Partly. Blaze is also able to exploit AVX instructions (so a
> theoretical x2 compared to SSE), however Blaze suffers from a HUGE
> shortcoming: data are assumed to be perfectly aligned, including
> inside a matrix. For instance, a row-major matrix is padded with empty
> space such that each row is aligned. The extra space *must* be filled
> with zeros and nothing else because they are exploited during some
> computations... As a consequence, Blaze does not have any notion of
> sub-matrices or the like, cannot map external data, etc. One cannot
> even write an LU decomposition on top of Blaze. In other words, it is
> not usable at all. This is also unfair regarding the benchmarks
> because they are comparing Blaze with perfectly aligned matrices to
> Eigen or MKL with packed and unaligned matrices.
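The layout issue described above can be sketched as follows. This is an illustration only, not code from Blaze or Eigen: `padded_stride` and `RowMajorView` are hypothetical names, and the SIMD width is assumed to be 4 doubles (one 256-bit AVX register). A padded layout rounds every row up to a multiple of the SIMD width, whereas a strided view can describe a packed external buffer or a sub-block, whose rows generally do not start on an aligned address.

```cpp
#include <cassert>
#include <cstddef>

// Row length (in elements) after padding each row up to a multiple of
// simd_width, as a padded row-major layout would require.
constexpr std::size_t padded_stride(std::size_t cols, std::size_t simd_width) {
    return (cols + simd_width - 1) / simd_width * simd_width;
}

// Hypothetical non-owning view over row-major data with an explicit stride.
// It makes no assumption about alignment or padding.
struct RowMajorView {
    const double* data;
    std::size_t rows, cols, stride;

    double operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i * stride + j];
    }

    // Sub-block: same buffer and stride, shifted origin. The origin is in
    // general not aligned, which a padded-and-aligned-only library cannot
    // represent.
    RowMajorView block(std::size_t i0, std::size_t j0,
                       std::size_t r, std::size_t c) const {
        assert(i0 + r <= rows && j0 + c <= cols);
        return RowMajorView{data + i0 * stride + j0, r, c, stride};
    }
};
```

For example, a 3 x 5 matrix of doubles would be stored with a stride of `padded_stride(5, 4)`, i.e. 8 elements per row, while a packed external 3 x 4 buffer is viewed with `RowMajorView{buf, 3, 4, 4}` and no copy.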
>> 2) Blaze shows faster performance on A*B*v for A and B matrices
>> because they don't honor order of operations and their expression
>> templates treat it as A*(B*v). This is moot as I can simply write
>> A*(B*v) in Eigen.
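To make the cost difference in point 2 concrete: for n x n matrices, (A*B)*v needs n^3 + n^2 scalar multiplications, while A*(B*v) needs only 2n^2. The sketch below counts them with naive loops; `Mat`, `matvec` and `matmul` are hypothetical helpers written for this illustration, not Eigen or Blaze calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Square n x n matrix, packed row-major.
struct Mat {
    std::size_t n;
    Vec a;
};

// Returns A*v and adds the number of scalar multiplications to `mults`.
Vec matvec(const Mat& A, const Vec& v, std::size_t& mults) {
    assert(A.a.size() == A.n * A.n && v.size() == A.n);
    Vec y(A.n, 0.0);
    for (std::size_t i = 0; i < A.n; ++i) {
        for (std::size_t j = 0; j < A.n; ++j) {
            y[i] += A.a[i * A.n + j] * v[j];
            ++mults;
        }
    }
    return y;
}

// Returns A*B and adds the number of scalar multiplications to `mults`.
Mat matmul(const Mat& A, const Mat& B, std::size_t& mults) {
    assert(A.n == B.n);
    const std::size_t n = A.n;
    Mat C{n, Vec(n * n, 0.0)};
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            for (std::size_t j = 0; j < n; ++j) {
                C.a[i * n + j] += A.a[i * n + k] * B.a[k * n + j];
                ++mults;
            }
        }
    }
    return C;
}
```

Evaluating `matvec(matmul(A, B, c), v, c)` and `matvec(A, matvec(B, v, c), c)` gives the same vector in exact arithmetic (floating-point rounding may differ), but the second form avoids the matrix-matrix product entirely.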
> Exactly, though we plan to do the same (and more) once the evaluators
>> 3) Blaze does show some convincingly better results for mixed
>> dense/sparse operations.
> Indeed, compressed sparse representations offer very little room for
> optimizing basic operations.
>> [1] http://www.netlib.org/na-digest-html/12/v12n35.html#1
>> [2] http://dx.doi.org/10.1137/110830125