Re: [eigen] Blas performance on mapped matrices

On Mon, 9 Jan 2012, Sameer Agarwal wrote:

We are in the process of a significant code migration from eigen2 to
eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
matrices and operates on them. The primary operation is of the form
A.block(r, c, size1, size2) -= B * C;
[...]
Moving from eigen2 to eigen3 has resulted in a 30% performance
regression. Has something changed significantly in the way Eigen3
handles mapped matrices, or about the structure of matrix-matrix
multiplication in Eigen3 that would cause this?

Gael did a lot of work on matrix-matrix multiplication in between Eigen2
and Eigen3, so I guess it may be related to that.

You say in a later email that you know your instruction can be rewritten
as
A.block<size1, size2>(r, c) -= B * C;

if the size of the block is known at compile time, which seems to be the
case in the example you're concentrating on (because the sizes of B and C
are known). I assume you tested that this is not the cause of the
performance regression.
You should also try writing it as
A.block<size1, size2>(r, c).noalias() -= B * C;
(assuming that is valid, i.e. the block of A does not overlap B or C in
memory).

But I guess it's something more complicated. Another knob you can play
with is the EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD constant, which is used
to decide which implementation to use for matrix multiplication. Try
changing it to 16 (anything larger than 9 will do for your example).
Jitse