Re: [eigen] Slow matrix-matrix multiply



On 02.04.2013 03:21, Sameer Agarwal wrote:
We replaced one of the more frequently called eigen expressions with a
simple three loop GEMM implementation (with some template sizing tricks)
and it instantly gives us >10% speedups. Doing the same to two other GEMM
expressions gives us an overall 30% speedup. The sizes of the matrices
involved are fairly small; in our benchmark, our matrices are of sizes 6x3,
3x3, 3x6, and are sized at compile time.

Yes, small matrices leave a lot of room for optimization; see this bug:

http://eigen.tuxfamily.org/bz/show_bug.cgi?id=404
For small fixed sizes it should be possible to solve this with template specializations (i.e. fall back to a textbook GEMM if vectorization/blocking gives no benefit).
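
To make "textbook GEMM" concrete, here is a minimal sketch of a three-loop product whose sizes are template parameters, so the compiler can fully unroll it for something like 6x3 times 3x3. It deliberately does not use Eigen: naive_gemm and its flat column-major std::array storage are hypothetical names and assumptions for illustration only, not Eigen's internals and not the code from the original benchmark.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of Eigen): returns C = A * B, where
// A is M x K, B is K x N and C is M x N, all with compile-time sizes.
// Assumption: every matrix is stored column-major in a flat array,
// so element (i, j) of an R-row matrix lives at index i + j * R.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
std::array<T, M * N> naive_gemm(const std::array<T, M * K>& a,
                                const std::array<T, K * N>& b) {
    std::array<T, M * N> c{};  // value-initialized, i.e. all zeros
    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t k = 0; k < K; ++k) {
            const T bkj = b[k + j * K];  // B(k, j), reused down the column
            for (std::size_t i = 0; i < M; ++i) {
                c[i + j * M] += a[i + k * M] * bkj;  // C(i,j) += A(i,k) * B(k,j)
            }
        }
    }
    return c;
}
```

The sizes cannot be deduced from the array arguments, so a call has to spell them out, e.g. naive_gemm<double, 6, 3, 3>(a, b) for a 6x3 times 3x3 product. Whether this actually beats a vectorized kernel for a given size is something one would have to measure.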


Another thing that bugs me is that products involving dynamic matrices (even if only one dimension is dynamic and the other is fixed and small) always fall back to the generic matrix multiplication, which is optimized mainly for very large products.

Maybe it would be possible to fall back to a very simple "three loop GEMM" if the sizes are small. This could be checked at runtime or indicated by the user somehow (maybe configurable by a compile flag). If a program only uses small matrix products, this might also reduce the binary size noticeably.
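
A rough sketch of what such a runtime check could look like, again without Eigen so that it stands alone. Everything here is hypothetical: the Matrix struct, the SMALL_GEMM_THRESHOLD macro (standing in for the "compile flag") and its default of 16 are made up for illustration, and blocked_gemm is only a simple tiled stand-in for a real optimized kernel.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical compile-time knob, e.g. -DSMALL_GEMM_THRESHOLD=8.
// This is NOT an existing Eigen macro; the default is an arbitrary guess.
#ifndef SMALL_GEMM_THRESHOLD
#define SMALL_GEMM_THRESHOLD 16
#endif

// Minimal dynamic matrix, column-major, zero-initialized.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i + j * rows]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i + j * rows]; }
};

enum class GemmPath { Simple, Blocked };

// Plain three-loop product, accumulating into a zeroed c.
void simple_gemm(const Matrix& a, const Matrix& b, Matrix& c) {
    for (std::size_t j = 0; j < b.cols; ++j)
        for (std::size_t k = 0; k < a.cols; ++k)
            for (std::size_t i = 0; i < a.rows; ++i)
                c(i, j) += a(i, k) * b(k, j);
}

// Stand-in for the generic kernel: same result, loops tiled over j and k.
void blocked_gemm(const Matrix& a, const Matrix& b, Matrix& c) {
    constexpr std::size_t kBlock = 64;
    for (std::size_t j0 = 0; j0 < b.cols; j0 += kBlock)
        for (std::size_t k0 = 0; k0 < a.cols; k0 += kBlock) {
            const std::size_t j1 = std::min(j0 + kBlock, b.cols);
            const std::size_t k1 = std::min(k0 + kBlock, a.cols);
            for (std::size_t j = j0; j < j1; ++j)
                for (std::size_t k = k0; k < k1; ++k)
                    for (std::size_t i = 0; i < a.rows; ++i)
                        c(i, j) += a(i, k) * b(k, j);
        }
}

// Picks the simple path when all three dimensions are at or below the
// threshold. Assumes a.cols == b.rows; optionally reports the path taken.
Matrix multiply(const Matrix& a, const Matrix& b, GemmPath* used = nullptr) {
    constexpr std::size_t t = SMALL_GEMM_THRESHOLD;
    const bool small = a.rows <= t && a.cols <= t && b.cols <= t;
    Matrix c(a.rows, b.cols);
    if (small) {
        simple_gemm(a, b, c);
    } else {
        blocked_gemm(a, b, c);
    }
    if (used) *used = small ? GemmPath::Simple : GemmPath::Blocked;
    return c;
}
```

Note that a runtime check like this still links both kernels, so the binary-size saving would only show up if the choice were made at compile time and the generic path were never instantiated.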


Christoph

--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
----------------------------------------------


