Re: [eigen] array functionality...
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] array functionality...
- From: Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
- Date: Tue, 9 Mar 2010 21:01:52 +0100
Just wanted to let you know that GCC performs as expected, ignoring
what Benoit just confirmed is probably a small bug.
method man.: 0.172443
method a...: 0.148587
method b...: 0.149701
method c...: 0.584348
Expected in the sense that Eigen built with GCC beats the manual path.
On Tue, Mar 9, 2010 at 8:42 PM, Hauke Heibel wrote:
> On Tue, Mar 9, 2010 at 3:26 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>> Just did that and the Eigen-fied version
>>> norms = (x.replicate(1,y.cols()) - y).matrix().squaredNorm()
>>> is way slower...
>> How about using a colwise() here?
> Which is what I actually did - it was just a typo. I also know now
> why this is so much slower. The issue is that the final reduction
> does not see that this is vectorizable, so an unvectorized path is
> chosen.
>> I don't remember for sure if squaredNorm is available in partial
>> reductions, but if it's not then it's easy to add, or you can replace
>> it with this:
>> norms = (x-y).abs2().colwise().sum()
> That was quite a good hint, since now I am getting vectorization.
> I attached an example of computing the column-wise squared norm of a
> matrix. I tried out four possibilities.
> 1) manual (0.163722 secs)
> 2) semi-manual, loop+abs2().sum() (0.360112 secs)
> 3) semi-manual, loop+matrix().squaredNorm() (0.358127 secs)
> 4) full-automatic (1.1833 secs)
> On MSVC, 1) is the clear winner. Probably, and hopefully, with GCC
> 1), 2) and 3) will be on par.
> 2) and 3) perform nearly identically.
> 4) is losing since a non-vectorized path is chosen.
> I don't want to cause more work than you already have right now - so
> letting this topic rest is fine with me.
> There is only one thing I would like to bring up for the future. Eigen
> offers many ways to solve one and the same problem. In general, it is
> clear that not all of them offer, or even can offer, the same
> performance -- nonetheless I think we might consider making people
> more aware of this by adding some information to the docs.
> I will put a marker on this post and try to find some time in the future.
> - Hauke
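
For reference, the manual path in a benchmark like this might look roughly like the sketch below. The attachment itself is not reproduced here, so the function name colwiseSquaredDistances, the column-major std::vector storage and the double scalar type are all assumptions made for illustration, not the original code. For each column of y it computes the squared norm of that column's difference from x, which is the quantity the colwise() expressions quoted above are after.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the thread's attachment.
// x holds `rows` values; y holds a rows x cols matrix in column-major order.
// Returns norms[j] = sum over i of (x[i] - y(i, j))^2.
std::vector<double> colwiseSquaredDistances(const std::vector<double>& x,
                                            const std::vector<double>& y,
                                            std::size_t rows,
                                            std::size_t cols) {
    assert(x.size() == rows);
    assert(y.size() == rows * cols);

    std::vector<double> norms(cols, 0.0);
    for (std::size_t j = 0; j < cols; ++j) {
        // Column j is contiguous in column-major storage.
        const double* col = y.data() + j * rows;
        double acc = 0.0;
        for (std::size_t i = 0; i < rows; ++i) {
            const double d = x[i] - col[i];
            acc += d * d;
        }
        norms[j] = acc;
    }
    return norms;
}
```

Keeping the inner loop on one contiguous column gives the compiler a chance to vectorize the reduction by itself; whether it actually does so depends on the compiler and its flags.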