Re: [eigen] Slow matrix-matrix multiply



I still cannot reproduce this with gcc:

-- default - gcc47 - Core2 Q9400 @2.66GHz --

Time (in seconds):
Preprocessor                            0.093

  Residual Evaluations                  0.117
  Jacobian Evaluations                  1.067
  Linear Solver                         0.809
Minimizer                               2.237

Postprocessor                           0.005
Total                                   2.371

-- CERES_NO_CUSTOM_BLAS - gcc47 - Core2 Q9400 @2.66GHz --

Time (in seconds):
Preprocessor                            0.089

  Residual Evaluations                  0.108
  Jacobian Evaluations                  1.054
  Linear Solver                         0.803
Minimizer                               2.206

Postprocessor                           0.005
Total                                   2.335


-- default - gcc47 - Xeon X5570 @2.93GHz --

Time (in seconds):
Preprocessor                            0.067

  Residual Evaluations                  0.085
  Jacobian Evaluations                  0.720
  Linear Solver                         0.600
Minimizer                               1.557

Postprocessor                           0.001
Total                                   1.645

-- CERES_NO_CUSTOM_BLAS - gcc47 - Xeon X5570 @2.93GHz --

Time (in seconds):
Preprocessor                            0.067

  Residual Evaluations                  0.085
  Jacobian Evaluations                  0.734
  Linear Solver                         0.599
Minimizer                               1.570

Postprocessor                           0.001
Total                                   1.658

gael

On Wed, Apr 3, 2013 at 5:58 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx> wrote:
> In case there is still interest, the change has been merged into the master
> branch.
> Sameer
>
>
>
>
> On Tue, Apr 2, 2013 at 12:25 PM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx>
> wrote:
>>
>> On Keir's suggestion, I have updated this CL so that the Eigen-based
>> routines can optionally be compiled in or out.
>>
>> Passing -DCUSTOM_BLAS=ON/OFF to cmake switches between the custom loops and
>> Eigen inside blas.h.
>>
>> Sameer
>>
>>
>>
>> On Tue, Apr 2, 2013 at 11:42 AM, Sameer Agarwal <sameeragarwal@xxxxxxxxxx>
>> wrote:
>>>
>>> Here is the Gerrit CL that was used to generate these numbers:
>>>
>>> https://ceres-solver-review.googlesource.com/#/c/2870/
>>>
>>> Sameer
>>>
>>>
>>>
>>> On Tue, Apr 2, 2013 at 11:34 AM, Sameer Agarwal
>>> <sameeragarwal@xxxxxxxxxx> wrote:
>>>>
>>>> Gael and Christoph,
>>>>
>>>> Thank you for looking into this.
>>>>
>>>> Yes, adding -mllvm -inline-threshold=600 makes the timing of Eigen
>>>> comparable to CUSTOM_GEMM.
>>>>
>>>> However, I went ahead and replaced all uses of small block operations in
>>>> the eliminator with simple gemm and gemv implementations, and the time has
>>>> dropped even further. That would not be the case if inlining were the only
>>>> thing at work here.
>>>>
>>>> With the increased inlining 1.02s
>>>> With custom blas            0.634s
>>>>
>>>> I get roughly similar numbers with g++ 4.2 on Mac OS. I also tested this on
>>>> Linux with g++ 4.6.3, where the linear solver time goes from 0.8 to 0.5
>>>> seconds.
>>>>
>>>> Sameer
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Apr 2, 2013 at 5:23 AM, Gael Guennebaud
>>>> <gael.guennebaud@xxxxxxxxx> wrote:
>>>>>
>>>>> On Tue, Apr 2, 2013 at 1:58 PM, Gael Guennebaud
>>>>> <gael.guennebaud@xxxxxxxxx> wrote:
>>>>> > After adding a few always_inline attributes
>>>>>
>>>>> An alternative is to add the following compiler option:
>>>>>
>>>>> -mllvm -inline-threshold=600
>>>>>
>>>>> gael
>>>>>
>>>>>
>>>>
>>>
>>
>
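
For readers following the thread: the "simple gemm and gemv implementations" mentioned above are not quoted anywhere in these messages, so the following is only a minimal sketch of what hand-written small-block loops of that kind might look like. The names SKETCH_ALWAYS_INLINE, SmallGemmAccumulate and SmallGemvAccumulate are hypothetical and are not taken from the Ceres blas.h. The sketch assumes row-major storage and block sizes known at compile time. The attribute is the GCC/Clang always_inline attribute discussed earlier in the thread.

```cpp
#include <cassert>

// Hypothetical helper macro (not from Ceres or Eigen): request forced
// inlining on GCC/Clang, fall back to plain inline on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// C += A * B, where A is M x K, B is K x N and C is M x N.
// All three blocks are dense, row-major arrays of doubles.
template <int M, int K, int N>
SKETCH_ALWAYS_INLINE void SmallGemmAccumulate(const double* A,
                                              const double* B,
                                              double* C) {
  assert(A != nullptr && B != nullptr && C != nullptr);
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      double sum = 0.0;
      for (int k = 0; k < K; ++k) {
        sum += A[i * K + k] * B[k * N + j];
      }
      C[i * N + j] += sum;
    }
  }
}

// y += A * x, where A is M x K (row-major), x has K entries, y has M entries.
template <int M, int K>
SKETCH_ALWAYS_INLINE void SmallGemvAccumulate(const double* A,
                                              const double* x,
                                              double* y) {
  assert(A != nullptr && x != nullptr && y != nullptr);
  for (int i = 0; i < M; ++i) {
    double sum = 0.0;
    for (int k = 0; k < K; ++k) {
      sum += A[i * K + k] * x[k];
    }
    y[i] += sum;
  }
}
```

A call such as SmallGemmAccumulate<2, 3, 2>(A, B, C) accumulates a 2x3 by 3x2 product into a 2x2 block. Because the sizes are template parameters, the compiler is free to unroll the loops fully. Whether this beats Eigen's fixed-size products depends on the compiler and its inlining decisions, which is the point under discussion in this thread.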


