Re: [eigen] Significant perf regression probably due to bug 363 patches
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Significant perf regression probably due to bug 363 patches
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Sun, 6 Nov 2011 15:36:08 -0500
Pushed: c29d777b278c on default branch and e5f87fa9e5c2 on 3.0 branch.
Benoit
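
The inlining fix described in the quoted messages below can be sketched roughly as follows. This is an illustration only, not the pushed patch: the SKETCH_ALWAYS_INLINE macro, the sketch_* function names and the Index typedef are made up for the example, and the overflow test itself is an assumption about what such a check might look like.

```cpp
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical macro (not Eigen's own): force inlining where the compiler
// offers a way to do so, and fall back to plain inline otherwise.
#if defined(__GNUC__)
#  define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#elif defined(_MSC_VER)
#  define SKETCH_ALWAYS_INLINE __forceinline
#else
#  define SKETCH_ALWAYS_INLINE inline
#endif

// Assumption: a signed index type, as in the error message quoted in the thread.
typedef std::ptrdiff_t Index;

// Throws std::bad_alloc if rows * cols cannot be represented in Index.
// Without the attribute, GCC may leave a small check like this as an
// out-of-line call, which is costly when it runs for every tiny matrix.
SKETCH_ALWAYS_INLINE void sketch_check_rows_cols_for_overflow(Index rows, Index cols)
{
    const Index max_index = std::numeric_limits<Index>::max();
    const bool error = (rows < 0 || cols < 0)   ? true
                     : (rows == 0 || cols == 0) ? false
                                                : (rows > max_index / cols);
    if (error)
        throw std::bad_alloc();
}

// Convenience wrapper so callers can test a size without writing try/catch.
inline bool sketch_size_is_valid(Index rows, Index cols)
{
    try {
        sketch_check_rows_cols_for_overflow(rows, cols);
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Whether the attribute actually helps depends on the compiler and flags, so it is worth confirming with the benchmark rather than assuming.
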
2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> I've uploaded a patch on Bug 363, here's the review link:
>
> http://eigen.tuxfamily.org/bz/page.cgi?id=splinter.html&bug=363&attachment=225
>
> Benoit
>
> 2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> OK, now I can reproduce the big 50% perf regression that you observed,
>> it's just that before I was looking only at subtests instead of the
>> final results.
>>
>> The problem is just that, despite the inline keyword, the
>> check_rows_cols_for_overflow function is not getting inlined. Adding
>> __attribute__((always_inline)) on it fixes it. Patch coming up.
>>
>> Benoit
>>
>> 2011/11/6 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> Thanks, I can reproduce a 10% performance decrease on Linux x86_64,
>>> g++ 4.6.1, Core i7.
>>>
>>> Investigating.
>>>
>>> Benoit
>>>
>>> 2011/11/6 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>>>> Whoops, that's what you get for copy-and-paste coding; the ClassDiagram function
>>>> was never intended to be called for scenarios other than 2D (as the name
>>>> might suggest...). The perf degradation I previously mentioned was in the
>>>> 2D case (and that was a different benchmark anyhow), so this error doesn't
>>>> invalidate those results (the code works in 2D). I've updated the function
>>>> to just ignore excess dimensions. The Eigen/EigenValues vs.
>>>> Eigen/Eigenvalues issue is due to the fact that I'm running these tests on
>>>> Windows, where paths are case-insensitive.
>>>>
>>>>
>>>> While I was changing the benchmark anyhow, I've added an option to declare
>>>> matrix sizes as Dynamic and size matrices at run-time (the code wasn't
>>>> changed to avoid temporaries, so that'll be much slower due to memory
>>>> management). I also changed the code to scale the number of timings rather
>>>> than the workload per timing as dimensionality rises so that timings should
>>>> be comparable between different dimensions (of course, the task is harder at
>>>> higher dimensionality).
>>>>
>>>>
>>>> sample compile then:
>>>> g++ GgmLvqBench.cpp -std=c++0x -DNDEBUG -DLVQFLOAT=double -DLVQDIM=2 -O3
>>>> -march=native
>>>> and perhaps add -DLVQDYNAMIC or -DEIGEN_DONT_VECTORIZE
>>>>
>>>> At changeset 4284 (25822e1ace8d), update the decomposition catalogue:
>>>> EigenBenchNV on GCC with 2(2) dimensions of doubles: 0.168s 189KB
>>>> EigenBenchV on GCC with 2(2) dimensions of doubles: 0.148s 268KB
>>>> EigenBenchNV on GCC with 2(Dynamic) dimensions of doubles: 0.718s 198KB
>>>> EigenBenchV on GCC with 2(Dynamic) dimensions of doubles: 0.78s 275KB
>>>> EigenBenchNV on GCC with 8(8) dimensions of floats: 1.01s 225KB
>>>> EigenBenchV on GCC with 8(8) dimensions of floats: 0.661s 275KB
>>>> EigenBenchNV on GCC with 8(Dynamic) dimensions of floats: 1.44s 202KB
>>>> EigenBenchV on GCC with 8(Dynamic) dimensions of floats: 1.17s 264KB
>>>>
>>>> At tip changeset 4321 (47b90dc56ada) Add test for Matrix(x, y) ctor static
>>>> assert added in previous changeset:
>>>> EigenBenchNV on GCC with 2(2) dimensions of doubles: 0.258s 201KB
>>>> EigenBenchV on GCC with 2(2) dimensions of doubles: 0.23s 272KB
>>>> EigenBenchNV on GCC with 2(Dynamic) dimensions of doubles: 0.844s 205KB
>>>> EigenBenchV on GCC with 2(Dynamic) dimensions of doubles: 0.909s 290KB
>>>> EigenBenchNV on GCC with 8(8) dimensions of floats: 1.13s 227KB
>>>> EigenBenchV on GCC with 8(8) dimensions of floats: 0.802s 291KB
>>>> EigenBenchNV on GCC with 8(Dynamic) dimensions of floats: 1.58s 210KB
>>>> EigenBenchV on GCC with 8(Dynamic) dimensions of floats: 1.31s 283KB
>>>>
>>>>
>>>> MSC isn't affected, and performs roughly comparably regardless:
>>>> EigenBenchNV on MSC with 2(2) dimensions of doubles: 0.3s 149KB
>>>> EigenBenchV on MSC with 2(2) dimensions of doubles: 0.252s 169KB
>>>> EigenBenchNV on MSC with 2(Dynamic) dimensions of doubles: 1s 160KB
>>>> EigenBenchV on MSC with 2(Dynamic) dimensions of doubles: 0.938s 185KB
>>>> EigenBenchNV on MSC with 8(8) dimensions of floats: 1.27s 168KB
>>>> EigenBenchV on MSC with 8(8) dimensions of floats: 1.01s 185KB
>>>> EigenBenchNV on MSC with 8(Dynamic) dimensions of floats: 1.93s 159KB
>>>> EigenBenchV on MSC with 8(Dynamic) dimensions of floats: 1.61s 184KB
>>>>
>>>>
>>>> So for GCC there's a slowdown in all scenarios although it's relatively more
>>>> significant in the cheapest (2d) cases.
>>>>
>>>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>>>
>>>>
>>>> On Sun, Nov 6, 2011 at 05:29, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>>>
>>>>> 2011/11/5 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>>>> > So this is really a bug in your code; did this ever work? Meanwhile,
>>>>> > we should try to generate an error at compile time in this case. This
>>>>> > should be easy since x and y are of templated type so we get to know
>>>>> > (in frame 3) that they're floats which doesn't make sense for (rows,
>>>>> > cols) dimensions.
>>>>>
>>>>> Done in changeset c6f51dc87530, so your code now gives this compile
>>>>> error:
>>>>>
>>>>> In file included from eigen/Eigen/Core:289:0,
>>>>> from eigen/bench/BenchTimer.h:46,
>>>>> from Downloads/GgmLvqBench.cpp:25:
>>>>> eigen/Eigen/src/Core/PlainObjectBase.h: In member function ‘void
>>>>>
>>>>> Eigen::PlainObjectBase<Derived>::_init2(Eigen::PlainObjectBase<Derived>::Index,
>>>>> Eigen::PlainObjectBase<Derived>::Index, typename
>>>>> Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
>>>>> SizeAtCompileTime != 2), T0>::type*) [with T0 = float, T1 = float,
>>>>> Derived = Eigen::Matrix<float, 4, 1>,
>>>>> Eigen::PlainObjectBase<Derived>::Index = long int, typename
>>>>> Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
>>>>> SizeAtCompileTime != 2), T0>::type = float]’:
>>>>> eigen/Eigen/src/Core/Matrix.h:252:7: instantiated from
>>>>> ‘Eigen::Matrix<_Scalar, _Rows, _Cols, _Options, _MaxRows,
>>>>> _MaxCols>::Matrix(const T0&, const T1&) [with T0 = float, T1 = float,
>>>>> _Scalar = float, int _Rows = 4, int _Cols = 1, int _Options = 0, int
>>>>> _MaxRows = 4, int _MaxCols = 1]’
>>>>> Downloads/GgmLvqBench.cpp:386:76: instantiated from here
>>>>> eigen/Eigen/src/Core/PlainObjectBase.h:599:7: error: static assertion
>>>>> failed: "FLOATING_POINT_ARGUMENT_PASSED__INTEGER_WAS_EXPECTED"
>>>>>
>>>>> Benoit
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
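
The compile-time check discussed in the oldest quoted message (rejecting floating-point arguments where (rows, cols) sizes are expected) can be illustrated with a small sketch. It assumes a compiler with C++0x/C++11 support, as in the sample compile line above. SketchVector4f and sketch_are_integer_sizes are hypothetical names for this example, not Eigen types, and the body is not the code from changeset c6f51dc87530; only the assertion message is taken from the error output quoted in the thread.

```cpp
#include <type_traits>

// Hypothetical trait: true only when both size arguments are integer types.
template <typename T0, typename T1>
struct sketch_are_integer_sizes
    : std::integral_constant<bool,
                             std::is_integral<T0>::value &&
                             std::is_integral<T1>::value>
{};

// Hypothetical stand-in for a fixed-size 4-vector of floats.
struct SketchVector4f
{
    float data[4];

    // Two-argument constructor meant for (rows, cols) sizes. Because the
    // argument types are template parameters, passing floats can be
    // rejected at compile time instead of being silently converted.
    template <typename T0, typename T1>
    SketchVector4f(const T0& rows, const T1& cols) : data()
    {
        static_assert(sketch_are_integer_sizes<T0, T1>::value,
                      "FLOATING_POINT_ARGUMENT_PASSED__INTEGER_WAS_EXPECTED");
        (void)rows;  // sizes are fixed at compile time in this sketch
        (void)cols;
    }
};
```

With this in place, `SketchVector4f v(4, 1);` compiles, while `SketchVector4f v(1.0f, 2.0f);` stops the build with the assertion message.
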