Re: [eigen] Slow matrix-matrix multiply


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Slow matrix-matrix multiply
From: Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 02 Apr 2013 12:14:33 +0200

On 02.04.2013 11:26, Gael Guennebaud wrote:

For small dynamic-size matrices, I agree there is room for
optimization. However, for small fixed-size matrices, Eigen should
already be at least as fast as a naive implementation.

I can also reproduce the performance drop with linux/gcc-4.7. However,
the generated assembly is extremely similar in both cases (see the
attached files), and Eigen even has an advantage, with only 18
additions compared to 27 for custom_gemm. Frankly, I cannot explain
the perf difference.

Did you (or anybody else) check how well instruction latencies are compensated? An interesting tool for that seems to be this (never tried it myself, though):

http://software.intel.com/en-us/articles/intel-architecture-code-analyzer-download/

The bad thing about going down to that level of optimization is that latencies are quite CPU dependent. My favorite resource for that:

http://www.agner.org/optimize/instruction_tables.pdf
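
To illustrate what "compensating latencies" means here, a minimal sketch follows. It is not taken from Eigen or from custom_gemm; the function names dot_single_chain and dot_four_chains are hypothetical. Both functions perform the same number of multiplications and nearly the same number of additions, but the first forms one long dependency chain, while the second keeps four independent chains that the CPU can overlap. Whether this explains the difference measured in this thread is an open question.

```cpp
#include <cstddef>

// One accumulator: each addition has to wait for the previous one,
// so the loop tends to be limited by the latency of the FP add.
double dot_single_chain(const double* a, const double* b, std::size_t n) {
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Four accumulators: the four additions in one iteration do not depend
// on each other, so they can be in flight at the same time.
double dot_four_chains(const double* a, const double* b, std::size_t n) {
    double acc0 = 0.0, acc1 = 0.0, acc2 = 0.0, acc3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }
    // Remainder loop for sizes that are not a multiple of four.
    for (; i < n; ++i) {
        acc0 += a[i] * b[i];
    }
    // Combine the partial sums only once, at the end.
    return (acc0 + acc1) + (acc2 + acc3);
}
```

Note that the two variants sum in a different order, so the results can differ in the last bits for general floating-point input. That is also why a compiler will usually not do this transformation on its own unless something like -ffast-math allows it to reassociate. How much the second form gains depends on the add latency and throughput of the specific CPU.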

Christoph


--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
----------------------------------------------

