Re: [eigen] Slow matrix-matrix multiply
On 02.04.2013 03:21, Sameer Agarwal wrote:
We replaced one of the more frequently called eigen expressions with a
simple three loop GEMM implementation (with some template sizing tricks)
and it instantly gives us >10% speedups. Doing the same to two other GEMM
expressions gives us an overall 30% speedup. The sizes of the matrices
involved are fairly small; in our benchmark, our matrices are of sizes 6x3,
3x3, 3x6, and are sized at compile time.
Yes, small matrices leave a lot of room for optimization; see this bug:
http://eigen.tuxfamily.org/bz/show_bug.cgi?id=404
For small fixed sizes it should be possible to solve this with template
specializations (i.e. fall back to textbook GEMM if
vectorization/blocking gives no benefit).
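To illustrate what a "textbook GEMM" for compile-time sizes could look like, here is a minimal sketch. It does not use Eigen and is not Eigen's implementation: SmallMatrix and naive_gemm are hypothetical names made up for this example, and row-major storage is an assumption. Because all three dimensions are template parameters, the compiler is free to fully unroll the loops for sizes like 6x3 or 3x3.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a fixed-size matrix (row-major storage assumed).
template <typename T, std::size_t Rows, std::size_t Cols>
struct SmallMatrix {
    std::array<T, Rows * Cols> data{};  // value-initialized to zero

    T& operator()(std::size_t r, std::size_t c) { return data[r * Cols + c]; }
    const T& operator()(std::size_t r, std::size_t c) const {
        return data[r * Cols + c];
    }
};

// Textbook three-loop GEMM: returns A * B for an MxK and a KxN matrix.
// M, K and N are deduced from the argument types, so every loop bound
// is a compile-time constant.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
SmallMatrix<T, M, N> naive_gemm(const SmallMatrix<T, M, K>& a,
                                const SmallMatrix<T, K, N>& b) {
    SmallMatrix<T, M, N> c;
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            T sum = T();
            for (std::size_t k = 0; k < K; ++k) {
                sum += a(i, k) * b(k, j);
            }
            c(i, j) = sum;
        }
    }
    return c;
}
```

A call such as naive_gemm(a, b) with a 6x3 and a 3x3 operand yields a 6x3 result. Whether this actually beats a vectorized kernel for a given size and compiler would have to be measured.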
Another thing that bugs me is that dynamic matrices (even if only one
dimension is dynamic and the other is fixed and small) always fall back to
the generic matrix multiplication, which is mostly optimized for very
large products.
Maybe it would be possible to fall back to a very simple "three loop
GEMM" if the sizes are small. This could be checked at runtime or
indicated by the user somehow (maybe configurable by a compile flag). If
a program only uses small matrix products this might also reduce the
binary size noticeably.
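The runtime check could be sketched roughly as follows. This is only an illustration of the idea, not Eigen code: multiply, naive_gemm, blocked_gemm, GemmPath and the cutoff kSmallProductThreshold are all hypothetical, the cutoff value is a guess that would need benchmarking, and the tiled loop merely stands in for a real blocked/vectorized kernel. Matrices are assumed to be row-major.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cutoff; a real value would have to come from benchmarks.
constexpr std::size_t kSmallProductThreshold = 16;

enum class GemmPath { Naive, Blocked };

// Simple three-loop GEMM: c = a * b, with a (m x k), b (k x n), c (m x n).
void naive_gemm(const double* a, const double* b, double* c,
                std::size_t m, std::size_t k, std::size_t n) {
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t p = 0; p < k; ++p)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
}

// Stand-in for the generic kernel tuned for large products: a plain
// tiled loop that accumulates into c, so c must be zeroed beforehand.
void blocked_gemm(const double* a, const double* b, double* c,
                  std::size_t m, std::size_t k, std::size_t n) {
    constexpr std::size_t kTile = 32;
    for (std::size_t i0 = 0; i0 < m; i0 += kTile)
        for (std::size_t p0 = 0; p0 < k; p0 += kTile)
            for (std::size_t j0 = 0; j0 < n; j0 += kTile)
                for (std::size_t i = i0; i < std::min(i0 + kTile, m); ++i)
                    for (std::size_t p = p0; p < std::min(p0 + kTile, k); ++p) {
                        const double aip = a[i * k + p];
                        for (std::size_t j = j0; j < std::min(j0 + kTile, n); ++j)
                            c[i * n + j] += aip * b[p * n + j];
                    }
}

// Picks the kernel at runtime from the dynamic sizes and reports the choice.
GemmPath multiply(const std::vector<double>& a, const std::vector<double>& b,
                  std::vector<double>& c,
                  std::size_t m, std::size_t k, std::size_t n) {
    c.assign(m * n, 0.0);
    if (m <= kSmallProductThreshold && k <= kSmallProductThreshold &&
        n <= kSmallProductThreshold) {
        naive_gemm(a.data(), b.data(), c.data(), m, k, n);
        return GemmPath::Naive;
    }
    blocked_gemm(a.data(), b.data(), c.data(), m, k, n);
    return GemmPath::Blocked;
}
```

The branch costs one comparison per product. Replacing the runtime condition with a preprocessor or template switch would let a program that only ever multiplies small matrices drop the large kernel entirely, which is where the binary-size saving would come from.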
Christoph
--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen
Tel: +49 (421) 218-64252
----------------------------------------------