Re: [eigen] Eigen3 ->Eigen2 performance regression: patch.



2010/2/27 Eamon Nerbonne <emn13@xxxxxxxxxxxx>:
> I played with this a little.  Indeed, for ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> I see the same slowdown.  The other
> ei_assign_impl's all look rather haphazard: why would ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> be strong-inlined, but ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> be merely inlined?
>
> Is there some kind of benchmark or systematic guideline here?  I could
> imagine that for large matrices, strong-inline is a little overkill; but
> otherwise, why not just strong-inline the bunch?

Exactly, strong-inline the bunch :)

It doesn't actually mean that the whole assignment code gets inlined.
This is just a tiny trivial function, only one small element of the
process.

Since you're already able to measure the performance impact of that,
can you send the patch?

> Obviously, small
> test-cases favor more inlining; but chances are that in whatever inner loop
> a larger program has, performance is going to be benefitted by rather more
> inlining than less...

Of course, but we also want to only put EIGEN_STRONG_INLINE where it's
been justified by a performance measurement with MSVC, firstly because
it looks ugly, secondly because in principle it's best to let the
compiler decide naturally what to inline. That's why, not having MSVC,
I'll wait for MSVC-using people to patch that :)
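
For readers who have not met the macro, here is a minimal sketch of the
idea being discussed. It is not Eigen's actual code: the macro name
SKETCH_STRONG_INLINE and the function assign_difference are hypothetical,
and the sketch only assumes that the strong-inline macro maps to MSVC's
__forceinline and falls back to plain "inline" on other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a strong-inline macro: force inlining on MSVC,
// and leave the decision to the compiler everywhere else.
#if defined(_MSC_VER)
#define SKETCH_STRONG_INLINE __forceinline
#else
#define SKETCH_STRONG_INLINE inline
#endif

// A tiny assignment kernel (dst = lhs - rhs). The work per call is so cheap
// that a real function call would be a noticeable part of the total cost.
SKETCH_STRONG_INLINE void assign_difference(std::vector<double>& dst,
                                            const std::vector<double>& lhs,
                                            const std::vector<double>& rhs)
{
    assert(lhs.size() == rhs.size());
    dst.resize(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); ++i)
        dst[i] = lhs[i] - rhs[i];
}
```

Whether the forced variant actually helps is something only a measurement
on the compiler in question can tell, which is the point made above.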

>
> Another funny thing is that .noalias() changes things (for the statement
> a=b-c; vs. a.noalias()=b-c;).  The performance of those statements can vary
> quite a bit, again probably due to inconsistent inlining in MSC; the effect
> is less significant in gcc builds, but it's visible there too.

Interesting,

Benoit

>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
> On Fri, Feb 26, 2010 at 20:43, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
> wrote:
>>
>> Should we then do the same in other places? I mean, this applies to
>> LinearVectorizedTraversal, but how about the other traversals? They
>> all have similar code.
>>
>> I'll let the MSVC guys investigate it if they feel like it ;)
>>
>> Benoit
>>
>> 2010/2/26 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>> > Applied and thank you a lot for your help - it's highly appreciated!
>> >
>> > - Hauke
>> >
>> > On Fri, Feb 26, 2010 at 6:27 PM, Eamon Nerbonne <emn13@xxxxxxxxxxxx>
>> > wrote:
>> >> As mentioned in the forums
>> >>
>> >> (http://forum.kde.org/viewtopic.php?f=74&t=85488&sid=56568a50ee5f70d17993d387dedd9c63&start=30#p149344)
>> >> on Microsoft's compiler there's a performance regression in eigen3
>> >> concerning subtraction of VectorXd's (and probably other matrices with
>> >> other cheap operations too).  The appropriate ei_assign_impl isn't always
>> >> inlined, and since the operation is otherwise cheap, the function call
>> >> overhead is quite significant.  Gcc seems to inline the function; but MSC
>> >> does not when vectorization is on (EIGEN_DONT_VECTORIZE not defined).
>> >> Replacing the "inline" keyword with the EIGEN_STRONG_INLINE macro resolves
>> >> the problem.
>> >> Attached: patch.
>> >> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>> >
>> >
>> >
>>
>>
>
>


