Re: [eigen] Significant perf regression probably due to bug 363 patches



Pushed: c29d777b278c on default branch and e5f87fa9e5c2 on 3.0 branch.

Benoit
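
For readers of the archive, here is a minimal sketch of the kind of fix described in the quoted message below: a small overflow check that the compiler is told to inline even when the plain inline keyword is not enough. It is not the pushed patch. The macro name SKETCH_ALWAYS_INLINE, both helper names, and the choice of std::ptrdiff_t as the index type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical macro (not Eigen's): force inlining where the compiler supports it.
#if defined(__GNUC__)
#  define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define SKETCH_ALWAYS_INLINE __forceinline
#else
#  define SKETCH_ALWAYS_INLINE inline
#endif

typedef std::ptrdiff_t Index;  // assumed index type for this sketch

// Returns true if rows * cols does not fit in Index.
// Precondition: rows and cols are non-negative.
SKETCH_ALWAYS_INLINE bool rows_cols_would_overflow(Index rows, Index cols)
{
    assert(rows >= 0 && cols >= 0);
    const Index max_index = std::numeric_limits<Index>::max();
    // Divide instead of multiplying so the check itself cannot overflow.
    return rows != 0 && cols != 0 && rows > max_index / cols;
}

// Throws std::bad_alloc instead of letting the size computation wrap around.
SKETCH_ALWAYS_INLINE void check_rows_cols_for_overflow_sketch(Index rows, Index cols)
{
    if (rows_cols_would_overflow(rows, cols))
        throw std::bad_alloc();
}
```

A check like this sits on the path of every resize or construction, so for tiny fixed-size matrices the cost of an out-of-line call can be large relative to the real work, which would be consistent with the 2D cases showing the biggest relative slowdown.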

2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> I've uploaded a patch on Bug 363, here's the review link:
>
> http://eigen.tuxfamily.org/bz/page.cgi?id=splinter.html&bug=363&attachment=225
>
> Benoit
>
> 2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> OK, now I can reproduce the big 50% perf regression that you observed,
>> it's just that before I was looking only at subtests instead of the
>> final results.
>>
>> The problem is just that, despite the inline keyword, the
>> check_rows_cols_for_overflow function is not getting inlined. Adding
>> __attribute__((always_inline)) on it fixes it. Patch coming up.
>>
>> Benoit
>>
>> 2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> Thanks, I can reproduce a 10% performance decrease on Linux x86_64,
>>> g++ 4.6.1, Core i7.
>>>
>>> Investigating.
>>>
>>> Benoit
>>>
>>> 2011/11/6 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>>>> Whoops, that's what you get for copy-and-paste coding; the ClassDiagram
>>>> function was never intended to be called for scenarios other than 2D (as
>>>> the name might suggest...).  The perf degradation I previously mentioned
>>>> was in the 2D case (and that was a different benchmark anyhow), so this
>>>> error doesn't invalidate those results (the code works in 2D).  I've
>>>> updated the function to just ignore excess dimensions.  The
>>>> Eigen/EigenValues vs. Eigen/Eigenvalues issue is due to the fact that I'm
>>>> running these tests on Windows, where paths are case-insensitive.
>>>>
>>>>
>>>> While I was changing the benchmark anyhow, I've added an option to declare
>>>> matrix sizes as Dynamic and size matrices at run-time (the code wasn't
>>>> changed to avoid temporaries, so that'll be much slower due to memory
>>>> management).  I also changed the code to scale the number of timings rather
>>>> than the workload per timing as dimensionality rises so that timings should
>>>> be comparable between different dimensions (of course, the task is harder at
>>>> higher dimensionality).
>>>>
>>>>
>>>> Sample compile command:
>>>> g++ GgmLvqBench.cpp -std=c++0x  -DNDEBUG  -DLVQFLOAT=double -DLVQDIM=2 -O3
>>>> -march=native
>>>> and perhaps add -DLVQDYNAMIC or -DEIGEN_DONT_VECTORIZE
>>>>
>>>> At changeset 4284 (25822e1ace8d) update the decomposition catalogue
>>>> EigenBenchNV on GCC with 2(2) dimensions of doubles: 0.168s 189KB
>>>> EigenBenchV on GCC with 2(2) dimensions of doubles: 0.148s 268KB
>>>> EigenBenchNV on GCC with 2(Dynamic) dimensions of doubles: 0.718s 198KB
>>>> EigenBenchV on GCC with 2(Dynamic) dimensions of doubles: 0.78s 275KB
>>>> EigenBenchNV on GCC with 8(8) dimensions of floats: 1.01s 225KB
>>>> EigenBenchV on GCC with 8(8) dimensions of floats: 0.661s 275KB
>>>> EigenBenchNV on GCC with 8(Dynamic) dimensions of floats: 1.44s 202KB
>>>> EigenBenchV on GCC with 8(Dynamic) dimensions of floats: 1.17s 264KB
>>>>
>>>> At tip changeset 4321 (47b90dc56ada) Add test for Matrix(x, y) ctor static
>>>> assert added in previous changeset:
>>>> EigenBenchNV on GCC with 2(2) dimensions of doubles: 0.258s 201KB
>>>> EigenBenchV on GCC with 2(2) dimensions of doubles: 0.23s 272KB
>>>> EigenBenchNV on GCC with 2(Dynamic) dimensions of doubles: 0.844s 205KB
>>>> EigenBenchV on GCC with 2(Dynamic) dimensions of doubles: 0.909s 290KB
>>>> EigenBenchNV on GCC with 8(8) dimensions of floats: 1.13s 227KB
>>>> EigenBenchV on GCC with 8(8) dimensions of floats: 0.802s 291KB
>>>> EigenBenchNV on GCC with 8(Dynamic) dimensions of floats: 1.58s 210KB
>>>> EigenBenchV on GCC with 8(Dynamic) dimensions of floats: 1.31s 283KB
>>>>
>>>>
>>>> MSC isn't affected, and performs roughly comparably regardless:
>>>> EigenBenchNV on MSC with 2(2) dimensions of doubles: 0.3s 149KB
>>>> EigenBenchV on MSC with 2(2) dimensions of doubles: 0.252s 169KB
>>>> EigenBenchNV on MSC with 2(Dynamic) dimensions of doubles: 1s 160KB
>>>> EigenBenchV on MSC with 2(Dynamic) dimensions of doubles: 0.938s 185KB
>>>> EigenBenchNV on MSC with 8(8) dimensions of floats: 1.27s 168KB
>>>> EigenBenchV on MSC with 8(8) dimensions of floats: 1.01s 185KB
>>>> EigenBenchNV on MSC with 8(Dynamic) dimensions of floats: 1.93s 159KB
>>>> EigenBenchV on MSC with 8(Dynamic) dimensions of floats: 1.61s 184KB
>>>>
>>>>
>>>> So for GCC there's a slowdown in all scenarios although it's relatively more
>>>> significant in the cheapest (2d) cases.
>>>>
>>>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>>>
>>>>
>>>> On Sun, Nov 6, 2011 at 05:29, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>>>
>>>>> 2011/11/5 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>>>> > So this is really a bug in your code; did this ever work? Meanwhile,
>>>>> > we should try to generate an error at compile time in this case. This
>>>>> > should be easy since x and y are of templated type so we get to know
>>>>> > (in frame 3) that they're floats which doesn't make sense for (rows,
>>>>> > cols) dimensions.
>>>>>
>>>>> Done in changeset c6f51dc87530 , so your code now gives this compile
>>>>> error:
>>>>>
>>>>> In file included from eigen/Eigen/Core:289:0,
>>>>>                 from eigen/bench/BenchTimer.h:46,
>>>>>                 from Downloads/GgmLvqBench.cpp:25:
>>>>> eigen/Eigen/src/Core/PlainObjectBase.h: In member function ‘void
>>>>> Eigen::PlainObjectBase<Derived>::_init2(Eigen::PlainObjectBase<Derived>::Index,
>>>>> Eigen::PlainObjectBase<Derived>::Index, typename
>>>>> Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
>>>>> SizeAtCompileTime != 2), T0>::type*) [with T0 = float, T1 = float,
>>>>> Derived = Eigen::Matrix<float, 4, 1>,
>>>>> Eigen::PlainObjectBase<Derived>::Index = long int, typename
>>>>> Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
>>>>> SizeAtCompileTime != 2), T0>::type = float]’:
>>>>> eigen/Eigen/src/Core/Matrix.h:252:7:   instantiated from
>>>>> ‘Eigen::Matrix<_Scalar, _Rows, _Cols, _Options, _MaxRows,
>>>>> _MaxCols>::Matrix(const T0&, const T1&) [with T0 = float, T1 = float,
>>>>> _Scalar = float, int _Rows = 4, int _Cols = 1, int _Options = 0, int
>>>>> _MaxRows = 4, int _MaxCols = 1]’
>>>>> Downloads/GgmLvqBench.cpp:386:76:   instantiated from here
>>>>> eigen/Eigen/src/Core/PlainObjectBase.h:599:7: error: static assertion
>>>>> failed: "FLOATING_POINT_ARGUMENT_PASSED__INTEGER_WAS_EXPECTED"
>>>>>
>>>>> Benoit
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
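
The compile-time check discussed in the oldest quoted message above can be sketched as follows. This is a standalone illustration under stated assumptions, not Eigen's implementation: ToyMatrix is a hypothetical class, and std::is_integral with a C++11 static_assert stands in for whatever trait and assertion macro the real changeset uses. Only the assertion message is taken from the error output quoted above.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical toy class (not Eigen::Matrix) with a templated (rows, cols) constructor.
class ToyMatrix {
public:
    template <typename T0, typename T1>
    ToyMatrix(const T0& rows, const T1& cols)
        : rows_(static_cast<std::ptrdiff_t>(rows)),
          cols_(static_cast<std::ptrdiff_t>(cols))
    {
        // The argument types are template parameters, so a float passed as a
        // dimension is visible here and can be rejected at compile time.
        static_assert(std::is_integral<T0>::value && std::is_integral<T1>::value,
                      "FLOATING_POINT_ARGUMENT_PASSED__INTEGER_WAS_EXPECTED");
    }

    std::ptrdiff_t rows() const { return rows_; }
    std::ptrdiff_t cols() const { return cols_; }

private:
    std::ptrdiff_t rows_;
    std::ptrdiff_t cols_;
};
```

With this sketch, ToyMatrix m(4, 1) compiles, while ToyMatrix bad(4.0f, 1.0f) fails to compile with the assertion message instead of silently converting the floats to dimensions.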


