|Re: [eigen] Nesting by reference of by value ?|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Nesting by reference of by value ?
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Wed, 18 Nov 2009 19:35:11 +0100
On Wed, Nov 18, 2009 at 6:49 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
OK, gcc 4.2 has the same issue here.
On Wed, Nov 18, 2009 at 6:45 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2009/11/18 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
That wouldn't be the first time that g++ 4.3 is stupid, right?
> I've just played a bit with Hauke's nesting refactoring fork.
> Let me recall that expressions are currently nested by reference, which
> forces the use of NestByValue when a function has to return a nested
> expression. See for instance adjoint(), which returns
> Transpose<NestByValue<CwiseUnaryOp<ei_scalar_conjugate<Scalar>, Derived> > >.
> As you can see, this is pretty annoying. In Hauke's fork, lightweight
> expressions (i.e., all but Matrix) are automatically nested by value, so
> there is no need for the NestByValue workaround.
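To make the by-reference versus by-value distinction concrete, here is a minimal sketch using toy expression templates. It is not Eigen's actual code: the names toy::Vec4, toy::nested, toy::Sum, toy::sum and toy::sum3 are all hypothetical and exist only for this illustration. The heavy vector type is nested by const reference, while lightweight expression objects are nested by value, so a function can return a nested expression without any wrapper.

```cpp
#include <cstddef>
#include <type_traits>

namespace toy {  // hypothetical names for illustration, not Eigen's classes

// Stand-in for a "heavy" object that owns its coefficients.
struct Vec4 {
    float data[4];
    float operator[](std::size_t i) const { return data[i]; }
};

// Nesting rule: lightweight expressions are stored by value...
template <typename T> struct nested { typedef T type; };
// ...while the heavy Vec4 is stored by const reference to avoid copies.
template <> struct nested<Vec4> { typedef const Vec4& type; };

static_assert(std::is_same<nested<Vec4>::type, const Vec4&>::value,
              "Vec4 operands are nested by reference");

// Lazy sum expression: nothing is computed until operator[] is called.
template <typename Lhs, typename Rhs>
struct Sum {
    typename nested<Lhs>::type lhs;
    typename nested<Rhs>::type rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

template <typename Lhs, typename Rhs>
Sum<Lhs, Rhs> sum(const Lhs& l, const Rhs& r) {
    Sum<Lhs, Rhs> e = {l, r};
    return e;
}

// Returns a nested expression. The inner Sum<Vec4, Vec4> is a temporary;
// because the outer Sum keeps its own copy of it, the result stays valid
// after the return. If Sum operands were also held by reference, the
// result would refer to a destroyed temporary.
inline Sum<Sum<Vec4, Vec4>, Vec4> sum3(const Vec4& a, const Vec4& b,
                                       const Vec4& c) {
    return sum(sum(a, b), c);
}

}  // namespace toy
```

In this sketch, only the Vec4 operands are referenced, so the caller just has to keep a, b and c alive while the expression returned by toy::sum3 is evaluated.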
> So now the question is: what about performance? Well, I tried a very
> simple example:
> Vector4f a, b, c, d;
> c = a+b+c+d;
> and here is the respective assembly code generated by g++ 4.3.3
It would be interesting to see g++ 4.4.
gcc 4.4 generates the same good code in both cases:
movaps 112(%rsp), %xmm0
addps 96(%rsp), %xmm0
addps 80(%rsp), %xmm0
addps 64(%rsp), %xmm0
movaps %xmm0, 80(%rsp)
OK, actually I forgot rule #1 when benchmarking gcc: never put your critical code in the main function; put it in a separate, non-inlined function instead. So now, for the same computation, gcc 4.3 and 4.4 generate good code in both cases. gcc 4.2 still generates the same poor code as above.
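As an illustration of that benchmarking rule, here is a minimal sketch that isolates the kernel in its own non-inlined function. It does not use Eigen: Vec4f is a hypothetical plain struct standing in for Vector4f, and BENCH_NOINLINE and sum4 are names made up for this example. The noinline attribute is the GCC/Clang spelling, with the MSVC equivalent as a fallback.

```cpp
#include <cstddef>

// Portable "do not inline" marker (hypothetical macro name).
#if defined(__GNUC__)
#  define BENCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define BENCH_NOINLINE __declspec(noinline)
#else
#  define BENCH_NOINLINE
#endif

// Plain stand-in for a fixed-size 4-float vector (not Eigen's Vector4f).
struct Vec4f { float v[4]; };

// The computation under test, c = a + b + c + d, lives in its own function.
// Keeping it out of main() and preventing inlining means the operands are
// opaque to the optimizer and the generated assembly for this function can
// be read in isolation (for example from the output of g++ -O2 -S).
BENCH_NOINLINE void sum4(const Vec4f& a, const Vec4f& b, Vec4f& c,
                         const Vec4f& d) {
    for (std::size_t i = 0; i < 4; ++i)
        c.v[i] = a.v[i] + b.v[i] + c.v[i] + d.v[i];
}
```

Calling sum4 from main() with values the compiler cannot see through, and then looking only at the body of sum4 in the assembly listing, avoids measuring whatever the compiler decided to do with main() as a whole.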
Then I tried the same computation but with VectorXf instead of Vector4f. In that case both gcc 4.2 and 4.3 generate better code for the inner vectorized loop when nesting by value, and I observed a significant speedup. However, gcc 4.4 generates the same code in both cases.
Then I added some scalar multiples and sub-matrix operations, and there is no real winner, especially with gcc 4.4, which consistently generates similar code. So finally, for me, this change is safe with regard to performance.
Now it would be interesting to benchmark MSVC as well, since this compiler seems to have more difficulty handling Eigen's code, but this is something I cannot do.