Re: [eigen] Eigen3 ->Eigen2 performance regression: patch.

To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Eigen3 ->Eigen2 performance regression: patch.
From: Eamon Nerbonne <emn13@xxxxxxxxxxxx>
Date: Sun, 28 Feb 2010 00:46:26 +0100

I played with this a little. Indeed, for ei_assign_impl< , , LinearTraversal, NoUnrolling> I see the same slowdown. The other ei_assign_impl's all look rather haphazard: why would ei_assign_impl< , , LinearTraversal, NoUnrolling> be strong-inlined, but ei_assign_impl< , , LinearTraversal, NoUnrolling> be merely inlined?

Is there some kind of benchmark or systematic guideline here? I could imagine that for large matrices, strong-inline is a little overkill; but otherwise, why not just strong-inline the bunch? Obviously, small test cases favor more inlining; but chances are that whatever inner loop a larger program has will benefit from more inlining rather than less...

Another funny thing is that .noalias() changes things (for the statement a=b-c; vs. a.noalias()=b-c;). The performance of those statements can vary quite a bit, again probably due to inconsistent inlining in MSC; the effect is less pronounced in gcc builds, but it is visible there too.
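
To make the "inline" vs. strong-inline distinction concrete, here is a minimal sketch of what such a macro can look like and how it applies to a cheap element-wise kernel. This is not Eigen's code: SKETCH_STRONG_INLINE, assign_difference and difference are hypothetical names made up for illustration, the loop is only a stand-in for a linear-traversal assignment, and the always_inline branch for GCC-compatible compilers is my own assumption rather than what EIGEN_STRONG_INLINE necessarily expands to.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical macro (not Eigen's definition): request forced inlining where
// the compiler has a keyword or attribute for it, else fall back to the plain
// `inline` hint, which the compiler is free to ignore.
#if defined(_MSC_VER)
  #define SKETCH_STRONG_INLINE __forceinline
#elif defined(__GNUC__)
  #define SKETCH_STRONG_INLINE inline __attribute__((always_inline))
#else
  #define SKETCH_STRONG_INLINE inline
#endif

// Stand-in for a linear-traversal assignment kernel: dst[i] = lhs[i] - rhs[i].
// The per-coefficient work is so cheap that a real function call per
// assignment would be a noticeable fraction of the total cost.
SKETCH_STRONG_INLINE void assign_difference(double* dst, const double* lhs,
                                            const double* rhs, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = lhs[i] - rhs[i];
}

// Convenience wrapper; assumes b and c have the same length.
std::vector<double> difference(const std::vector<double>& b,
                               const std::vector<double>& c)
{
    std::vector<double> a(b.size());
    assign_difference(a.data(), b.data(), c.data(), a.size());
    return a;
}
```

Swapping SKETCH_STRONG_INLINE for plain inline on assign_difference and comparing timings in an optimized build is one way to check whether a given compiler actually inlines the kernel; results will depend on compiler version and flags.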

--eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163

On Fri, Feb 26, 2010 at 20:43, Benoit Jacob <jacob..benoit.1@xxxxxxxxx> wrote:

Should we then do the same in other places? I mean, this applies to
LinearVectorizedTraversal, but how about the other traversals? They
all have similar code.

I'll let the MSVC guys investigate it if they feel like it ;)

Benoit

2010/2/26 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:

> Applied and thank you a lot for your help - it's highly appreciated!
>
> - Hauke
>
> On Fri, Feb 26, 2010 at 6:27 PM, Eamon Nerbonne <emn13@xxxxxxxxxxxx> wrote:
>> As mentioned in the forums
>> (http://forum.kde.org/viewtopic.php?f=74&t=85488&sid=56568a50ee5f70d17993d387dedd9c63&start=30#p149344)
>> on Microsoft's compiler there's a performance regression in Eigen3
>> concerning subtraction of VectorXd's (and probably other matrices with other
>> cheap operations too). The appropriate ei_assign_impl isn't always inlined,
>> and since the operation is otherwise cheap, the function call overhead is
>> quite significant. Gcc seems to inline the function, but MSC does not when
>> vectorization is on (EIGEN_DONT_VECTORIZE not defined). Replacing the
>> "inline" keyword with the EIGEN_STRONG_INLINE macro resolves the problem.
>> Attached: patch.
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
>
