Re: [eigen] Slow matrix-matrix multiply
On Keir's suggestion, I have updated this CL to optionally compile Eigen-based routines in and out. Passing -DCUSTOM_BLAS=ON/OFF to cmake switches between custom loops and Eigen inside blas.h.

Sameer

On Tue, Apr 2, 2013 at 11:42 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx> wrote:
Here is the Gerrit CL that is used for generating these numbers.

Sameer

On Tue, Apr 2, 2013 at 11:34 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx> wrote:
Gael and Christoph,

Thank you for looking into this.

Yes, adding -mllvm -inline-threshold=600 makes the timing of Eigen comparable to CUSTOM_GEMM.

However, I went ahead and replaced all use of small block operations in the eliminator with simple gemm and gemv implementations, and the time has dropped even further. That would not be the case if inlining were the only thing at work here.

With the increased inlining: 1.02s
With custom blas: 0.634s

I get roughly similar numbers with g++ 4.2 on Mac OS. I also tested this on Linux with g++ 4.6.3, where the linear solver time goes from 0.8 to 0.5 seconds.

Sameer

On Tue, Apr 2, 2013 at 5:23 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
On Tue, Apr 2, 2013 at 1:58 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
> After adding a few always_inline attributes

An alternative is to add the following compiler option:

-mllvm -inline-threshold=600
gael
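
For readers following the thread, here is a rough sketch of what a "simple gemm and gemv" pair for small dense blocks, with forced inlining, might look like. It is only an illustration: the names SKETCH_ALWAYS_INLINE, SmallGemmAdd and SmallGemvAdd are hypothetical, the row-major contiguous layout is an assumption, and none of this is taken from the CL or from the actual blas.h discussed above.

```cpp
#include <cstddef>

// Hypothetical macro: force inlining on GCC/Clang, fall back to plain
// inline elsewhere. Not the macro used by Eigen or by the CL.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// C (m x n) += A (m x k) * B (k x n).
// Assumes row-major, contiguous storage for all three matrices.
SKETCH_ALWAYS_INLINE void SmallGemmAdd(const double* A, const double* B,
                                       double* C, std::size_t m,
                                       std::size_t k, std::size_t n) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t p = 0; p < k; ++p) {
      const double a = A[i * k + p];  // reused across the whole row of B
      for (std::size_t j = 0; j < n; ++j) {
        C[i * n + j] += a * B[p * n + j];
      }
    }
  }
}

// y (m) += A (m x n) * x (n), with A row-major and contiguous.
SKETCH_ALWAYS_INLINE void SmallGemvAdd(const double* A, const double* x,
                                       double* y, std::size_t m,
                                       std::size_t n) {
  for (std::size_t i = 0; i < m; ++i) {
    double sum = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
      sum += A[i * n + j] * x[j];
    }
    y[i] += sum;
  }
}
```

The attribute corresponds to the per-function approach quoted above; the -mllvm -inline-threshold=600 option instead raises the inlining threshold for the whole translation unit when compiling with clang.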