Re: [eigen] Blas performance on mapped matrices

On Mon, 9 Jan 2012, Sameer Agarwal wrote:

We are in the process of a significant code migration from eigen2 to
eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
matrices and operates on them. The primary operation is of the form

A.block(r, c, size1, size2) -= B * C;
[...]
Moving from eigen2 to eigen3 has resulted in a 30% performance
regression. Has something changed significantly in the way Eigen3
handles mapped matrices, or about the structure of matrix-matrix
multiplication in Eigen3 that would cause this?

Gael did a lot of work on matrix-matrix multiplication in between Eigen2 and Eigen3, so I guess it may be related to that.

You say in a later email that you know your instruction can be rewritten as

   A.block<size1, size2>(r, c) -= B * C;

if the size of the block is known at compile time, which seems to be the case in the example you're concentrating on (because the sizes of B and C are known). I assume you tested that this is not the cause of the performance regression.

You should also try writing it as

   A.block<size1, size2>(r, c).noalias() -= B * C;

(assuming that is valid).

But I guess it's something more complicated. Another knob you can play with is the EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD constant, which is used to decide which implementation to use for matrix multiplication. Try changing it to 16 (anything larger than 9 will do for your example).


Jitse


