Re: [eigen] Eigen3 ->Eigen2 performance regression: patch.


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Eigen3 ->Eigen2 performance regression: patch.
From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
Date: Tue, 2 Mar 2010 07:13:58 -0500

2010/2/27 Eamon Nerbonne <emn13@xxxxxxxxxxxx>:
> I played with this a little.  Indeed, for ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> I see the same slowdown.  The other
> ei_assign_impl's all look rather haphazard: why would ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> be strong-inlined, but ei_assign_impl< , ,
> LinearTraversal, NoUnrolling> be merely inlined?
>
> Is there some kind of benchmark or systematic guideline here?  I could
> imagine that for large matrices, strong-inline is a little overkill; but
> otherwise, why not just strong-inline the bunch?

Exactly, strong-inline the bunch :)

It doesn't actually mean that the whole assignment code gets inlined.
This is just a tiny trivial function, only one small element of the
process.

Since you're already able to measure the performance impact of that,
can you send the patch?
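
For anyone who wants to reproduce that kind of measurement without pulling in the whole library, here is a minimal timing sketch. It is only an assumption about what such a measurement might look like, not the benchmark that was actually used: the names `TimingResult` and `time_vector_subtraction` are made up for illustration, and it uses plain `std::vector` rather than Eigen types.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper types and names, for illustration only.
struct TimingResult {
    double seconds;   // wall-clock time spent in the repeated subtraction
    double checksum;  // sum of sampled results, so the work is observable
};

// Times `repetitions` passes of a = b - c on vectors of length `size`.
// Assumes size > 0.
inline TimingResult time_vector_subtraction(std::size_t size, std::size_t repetitions)
{
    std::vector<double> a(size, 0.0), b(size, 3.0), c(size, 1.0);
    double checksum = 0.0;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < repetitions; ++r) {
        for (std::size_t i = 0; i < size; ++i)
            a[i] = b[i] - c[i];
        checksum += a[r % size];  // read one element per pass
    }
    const auto stop = std::chrono::steady_clock::now();

    return { std::chrono::duration<double>(stop - start).count(), checksum };
}
```

An optimizer may still hoist or merge the passes, so numbers from a harness like this are only meaningful when compared between builds of the same code, e.g. before and after a patch.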

> Obviously, small
> test-cases favor more inlining; but chances are that in whatever inner loop
> a larger program has, performance is going to be benefitted by rather more
> inlining than less...

Of course, but we also want to only put EIGEN_STRONG_INLINE where it's
been justified by a performance measurement with MSVC, firstly because
it looks ugly, secondly because in principle it's best to let the
compiler decide naturally what to inline. That's why, not having MSVC,
I'll wait for MSVC-using people to patch that :)
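
To make the discussion concrete, here is a rough sketch of the pattern, under stated assumptions. `SKETCH_STRONG_INLINE` is a made-up stand-in for a macro in the spirit of EIGEN_STRONG_INLINE (forced inlining on MSVC, plain `inline` elsewhere), and the two functions are illustrative, not Eigen's actual ei_assign_impl code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical macro, named here for illustration only: force inlining on
// MSVC, and leave the decision to the compiler everywhere else.
#if defined(_MSC_VER)
#  define SKETCH_STRONG_INLINE __forceinline
#else
#  define SKETCH_STRONG_INLINE inline
#endif

// Tiny per-coefficient kernel. It is so cheap that an actual function call
// would cost more than the subtraction itself.
SKETCH_STRONG_INLINE void assign_difference_coeff(double* dst, const double* lhs,
                                                  const double* rhs, std::size_t i)
{
    dst[i] = lhs[i] - rhs[i];
}

// Plain linear traversal without unrolling: dst = lhs - rhs.
// Assumes all three vectors have the same length.
inline void assign_difference(std::vector<double>& dst,
                              const std::vector<double>& lhs,
                              const std::vector<double>& rhs)
{
    const std::size_t n = dst.size();
    for (std::size_t i = 0; i < n; ++i)
        assign_difference_coeff(dst.data(), lhs.data(), rhs.data(), i);
}
```

Only the small kernel carries the macro; the loop around it is left as a normal `inline` function, which matches the point that strong-inlining one tiny function does not force the whole assignment to be inlined.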

>
> Another funny thing is that .noalias changes things (for the statement
> a=b-c; vs. a.noalias()=b-c;).  The performance of those statements can vary
> quite a bit again probably due to inconsistent inlining in msc, though not
> as significantly so it's also visible in gcc builds.

Interesting,

Benoit

>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
> On Fri, Feb 26, 2010 at 20:43, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
> wrote:
>>
>> Should we then do the same in other places? I mean, this applies to
>> LinearVectorizedTraversal, but how about the other traversals? They
>> all have similar code.
>>
>> I'll let the MSVC guys investigate it if they feel like it ;)
>>
>> Benoit
>>
>> 2010/2/26 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>> > Applied, and thank you a lot for your help - it's highly appreciated!
>> >
>> > - Hauke
>> >
>> > On Fri, Feb 26, 2010 at 6:27 PM, Eamon Nerbonne <emn13@xxxxxxxxxxxx>
>> > wrote:
>> >> As mentioned in the forums
>> >>
>> >> (http://forum.kde.org/viewtopic.php?f=74&t=85488&sid=56568a50ee5f70d17993d387dedd9c63&start=30#p149344)
>> >> on microsoft's compiler there's a performance regression in eigen3
>> >> concerning subtraction of VectorXd's (and probably other matrices with
>> >> other
>> >> cheap operations too).  The appropriate ei_assign_impl isn't always
>> >> inlined,
>> >> and since the operation is otherwise cheap; the function call overhead
>> >> is
>> >> quite significant.  Gcc seems to inline the function; but MSC does not
>> >> when
>> >> vectorization is on (EIGEN_DONT_VECTORIZE not defined).  Replacing the
>> >> "inline" keyword with the EIGEN_STRONG_INLINE macro resolves the
>> >> problem.
>> >> Attached: patch.
>> >> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>> >
>> >
>> >
>>
>>
>
>
