Re: [eigen] Slow matrix-matrix multiply



On Keir's suggestion, I have updated this CL so that the Eigen-based routines can optionally be compiled in or out.

Passing -DCUSTOM_BLAS=ON/OFF to cmake switches between the custom loops and Eigen inside blas.h.
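
As a rough illustration of the kind of compile-time switch described above, here is a minimal sketch, not the code from the CL. The macro name SKETCH_USE_EIGEN, the function names, and the row-major layout are all assumptions made for this example. How the cmake option maps to a preprocessor definition in the real blas.h is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro: when defined, the Eigen path is compiled in.
// When it is not defined, the hand-written loop is used instead.
#ifdef SKETCH_USE_EIGEN
#include <Eigen/Dense>
#endif

// Hypothetical helper: y += A * x, where A is a row-major m x n block.
inline void MatrixVectorMultiplyAccumulate(const double* a,
                                           std::size_t m,
                                           std::size_t n,
                                           const double* x,
                                           double* y) {
#ifdef SKETCH_USE_EIGEN
  typedef Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic,
                        Eigen::RowMajor> RowMajorMatrix;
  Eigen::Map<const RowMajorMatrix> A(a, static_cast<std::ptrdiff_t>(m),
                                     static_cast<std::ptrdiff_t>(n));
  Eigen::Map<const Eigen::VectorXd> X(x, static_cast<std::ptrdiff_t>(n));
  Eigen::Map<Eigen::VectorXd> Y(y, static_cast<std::ptrdiff_t>(m));
  Y.noalias() += A * X;
#else
  // Plain loop: one dot product per row of A.
  for (std::size_t i = 0; i < m; ++i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
      sum += a[i * n + j] * x[j];
    }
    y[i] += sum;
  }
#endif
}

// Convenience wrapper that checks the sizes before dispatching.
inline void Gemv(const std::vector<double>& a,
                 const std::vector<double>& x,
                 std::vector<double>* y) {
  assert(a.size() == x.size() * y->size());
  MatrixVectorMultiplyAccumulate(a.data(), y->size(), x.size(),
                                 x.data(), y->data());
}
```

With this layout, callers see a single function either way, so the two backends can be timed against each other by rebuilding with or without the definition.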

Sameer



On Tue, Apr 2, 2013 at 11:42 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx> wrote:
Here is the gerrit CL that is used for generating these numbers

https://ceres-solver-review.googlesource.com/#/c/2870/

Sameer



On Tue, Apr 2, 2013 at 11:34 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx> wrote:
Gael and Christoph,

Thank you for looking into this. 

Yes, adding -mllvm -inline-threshold=600 makes the timing of Eigen comparable to CUSTOM_GEMM.

However, I went ahead and replaced all uses of small block operations in the eliminator with simple gemm and gemv implementations, and the time has dropped even further, which would not be the case if inlining were the only thing at work here.

With the increased inlining:  1.02s
With custom blas:             0.634s

I get roughly similar numbers with g++ 4.2 on macOS. I also tested this on Linux with g++ 4.6.3, where the linear solver time goes from 0.8 to 0.5 seconds.
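
For readers who have not seen the CL, a "simple gemm" in the sense used above might look roughly like the following. This is only a sketch under stated assumptions: the names NaiveGemmAccumulate and NaiveGemm, the row-major contiguous storage, and the loop order are illustrative choices, not the implementation that produced the timings in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: C += A * B for small dense row-major blocks.
// A is m x k, B is k x n, C is m x n, all stored contiguously.
inline void NaiveGemmAccumulate(const double* a, const double* b, double* c,
                                std::size_t m, std::size_t k,
                                std::size_t n) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t p = 0; p < k; ++p) {
      // Hoist A(i, p) so the inner loop walks rows of B and C linearly.
      const double a_ip = a[i * k + p];
      for (std::size_t j = 0; j < n; ++j) {
        c[i * n + j] += a_ip * b[p * n + j];
      }
    }
  }
}

// Convenience wrapper: returns A * B as a new zero-initialized buffer.
inline std::vector<double> NaiveGemm(const std::vector<double>& a,
                                     const std::vector<double>& b,
                                     std::size_t m, std::size_t k,
                                     std::size_t n) {
  assert(a.size() == m * k);
  assert(b.size() == k * n);
  std::vector<double> c(m * n, 0.0);
  NaiveGemmAccumulate(a.data(), b.data(), c.data(), m, k, n);
  return c;
}
```

For example, calling NaiveGemm on the 2 x 2 blocks {1, 2, 3, 4} and {5, 6, 7, 8} yields {19, 22, 43, 50}.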

Sameer





On Tue, Apr 2, 2013 at 5:23 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
On Tue, Apr 2, 2013 at 1:58 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> After adding a few always_inline attributes

An alternative is to add the following compiler option:

-mllvm -inline-threshold=600

gael






