Re: [eigen] Blas performance on mapped matrices
On Mon, 9 Jan 2012, Sameer Agarwal wrote:
> We are in the process of a significant code migration from eigen2 to
> eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
> matrices and operates on them. The primary operation is of the form
> A.block(r, c, size1, size2) -= B * C;
[...]
> Moving from eigen2 to eigen3 has resulted in a 30% performance
> regression. Has something changed significantly in the way Eigen3
> handles mapped matrices, or about the structure of matrix-matrix
> multiplication in Eigen3 that would cause this?

Gael did a lot of work on matrix-matrix multiplication in between Eigen2
and Eigen3, so I guess it may be related to that.
You say in a later email that you know your instruction can be rewritten
as
A.block<size1, size2>(r, c) -= B * C;
if the size of the block is known at compile time, which seems to be the
case in the example you're concentrating on (because the sizes of B and C
are known). I assume you tested that this is not the cause of the
performance regression.
You should also try writing it as
A.block<size1, size2>(r, c).noalias() -= B * C;
(assuming that is valid).
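
To illustrate why noalias() can matter: by default Eigen assumes the
destination of a matrix product may alias its operands, so it evaluates
B * C into a temporary first and only then updates the destination.
noalias() tells it to skip that temporary. The following is only a rough
sketch of the two strategies in plain C++. It does not use Eigen, the Mat
type and both function names are made up for illustration, and Eigen's real
kernels are far more elaborate.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix, for illustration only (not an Eigen type).
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c, double v = 0.0)
        : rows(r), cols(c), data(r * c, v) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Alias-safe strategy: evaluate B * C into a temporary T, then subtract T
// from the block of A whose top-left corner is (r, c).
void sub_product_with_temp(Mat& A, std::size_t r, std::size_t c,
                           const Mat& B, const Mat& C) {
    Mat T(B.rows, C.cols);  // extra allocation and an extra pass over the result
    for (std::size_t i = 0; i < B.rows; ++i)
        for (std::size_t k = 0; k < B.cols; ++k)
            for (std::size_t j = 0; j < C.cols; ++j)
                T(i, j) += B(i, k) * C(k, j);
    for (std::size_t i = 0; i < T.rows; ++i)
        for (std::size_t j = 0; j < T.cols; ++j)
            A(r + i, c + j) -= T(i, j);
}

// "No alias" strategy: accumulate straight into the block of A.
// Only valid if A shares no storage with B or C.
void sub_product_noalias(Mat& A, std::size_t r, std::size_t c,
                         const Mat& B, const Mat& C) {
    for (std::size_t i = 0; i < B.rows; ++i)
        for (std::size_t k = 0; k < B.cols; ++k)
            for (std::size_t j = 0; j < C.cols; ++j)
                A(r + i, c + j) -= B(i, k) * C(k, j);
}
```

Both functions give the same result when A does not overlap B or C; the
second simply avoids the temporary. Whether that accounts for any of the
30% would have to be measured.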
But I guess it's something more complicated. Another knob you can play
with is the EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD constant, which is used
to decide which implementation to use for matrix multiplication. Try
changing it to 16 (anything larger than 9 will do for your example).
Jitse