On Sun, Feb 7, 2010 at 12:38 PM, Hauke Heibel
<hauke.heibel@xxxxxxxxxxxxxx> wrote:
Do I remember correctly that the need/reason for evaluating products
into temporaries arises from the fact that the resulting code
(assuming no copies) is more efficient? Isn't this effect now shadowed
by the mallocs?
If that is the case, how about this approach:
+ instantiate temporaries as shared ptrs only when the return type of
a product has dynamic size
+ for fixed size objects don't even create a temporary but simply
evaluate the expression as if all sub-expressions were lazy
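To make the two bullet points above concrete, here is a minimal standalone sketch of the idea. It does not use Eigen; kDynamic, DynamicTemporary, LazyProduct, ProductNesting and Wrapper are all hypothetical names made up for illustration, and the real expression templates would of course look different. The only point is that a compile-time switch on the size can pick "shared pointer to an evaluated temporary" for dynamic sizes and "unevaluated product, nested by value" for fixed sizes.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical marker for "size only known at run time" (not Eigen's constant).
constexpr int kDynamic = -1;

// Evaluated product result for run-time sizes: owns one heap buffer.
struct DynamicTemporary {
    std::vector<double> coeffs;
};

// Unevaluated N x N product for compile-time sizes: it only remembers
// its operands (row-major) and computes coefficients on demand.
template <int N>
struct LazyProduct {
    const double* lhs;
    const double* rhs;
    double coeff(int i, int j) const {
        assert(i >= 0 && i < N && j >= 0 && j < N);
        double s = 0.0;
        for (int k = 0; k < N; ++k) s += lhs[i * N + k] * rhs[k * N + j];
        return s;
    }
};

// Nesting policy: fixed size -> lazy product held by value (no temporary),
// dynamic size -> shared pointer to the evaluated temporary.
template <int N>
struct ProductNesting {
    using type = LazyProduct<N>;
};
template <>
struct ProductNesting<kDynamic> {
    using type = std::shared_ptr<const DynamicTemporary>;
};

// Stand-in for a wrapping expression such as a transpose or conjugate.
// Copying it copies either two pointers or one shared_ptr, never the buffer.
template <int N>
struct Wrapper {
    typename ProductNesting<N>::type nested;
};
```

With this layout, copying a Wrapper<kDynamic> from one expression to the next only bumps a reference count, and a Wrapper<2> never allocates at all. Whether the lazy evaluation of fixed-size products is actually faster than evaluating into a stack temporary is a separate question that this sketch does not answer.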
Regarding the problem in general, we should try really hard to avoid
the need for syntax like this
(a*b).conjugate().nestByValue().transpose()
or related code that has to call .eval() in order to prevent crashes.
I hope we agree on this?
Currently, I am also a little bit confused about the following code
(from the forum):
double p = (P.transpose() * P).diagonal().sum();
In the devel branch, Product::diagonal is overloaded to handle that case without the need for an explicit lazy().
It is clear that what we can do in 2.0
double p = (P.transpose() * P).lazy().diagonal().sum();
leads to way more efficient code than the default expression. This
would remain an issue (or better source for improvements), even if the
problem at hand with the temporaries were fixed. What I am trying to
say is that it might be a nice feature to be able to override the
default behavior and deactivate eval-before-nesting.
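To illustrate why the lazy variant is so much cheaper here, the following standalone sketch compares the two evaluation strategies by hand. It does not use Eigen; Mat, diagonal_sum_eager and diagonal_sum_lazy are hypothetical helpers written only for this comparison. For an m x n matrix P, evaluating the product first costs about m*n*n multiplications plus an n x n temporary, while computing only the diagonal coefficients costs m*n multiplications and no temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal row-major dense matrix, a stand-in for a dynamic-size matrix.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> a;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Eager: evaluate the full n x n product P^T * P into a temporary,
// then sum the diagonal of that temporary.
double diagonal_sum_eager(const Mat& P) {
    const std::size_t n = P.cols;
    Mat tmp(n, n);  // heap-allocated temporary
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < P.rows; ++k)
                tmp(i, j) += P(k, i) * P(k, j);
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += tmp(i, i);
    return s;
}

// Lazy: compute only the n diagonal coefficients of P^T * P,
// i.e. the squared norm of each column of P. No temporary is created.
double diagonal_sum_lazy(const Mat& P) {
    assert(P.a.size() == P.rows * P.cols);
    double s = 0.0;
    for (std::size_t j = 0; j < P.cols; ++j)
        for (std::size_t k = 0; k < P.rows; ++k)
            s += P(k, j) * P(k, j);
    return s;
}
```

Both functions return the same value (the sum of the squares of all entries of P); only the amount of work and the allocation differ.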
Finally, I think I need to update the 3.0 TODO list. Personally, I
consider this whole problem a blocking, high-priority issue for 3.0.
Cheers,
- Hauke
On Sun, Feb 7, 2010 at 10:26 AM, Gael Guennebaud
> also note that the nest everything by reference approach we have in Eigen
> 2.0 also leads to unnecessary copies:
>
> Indeed since (a*b).adjoint() is compiled as:
>
> (a*b).conjugate().nestByValue().transpose()
>
> the result of a*b gets copied once from the Conjugate to the Transpose
> expression.
>
> Finally, let me emphasize again that with either the current devel approach or
> the evaluator I described in the previous email, dynamic sized matrices can
> be easily worked-around using kind of shared pointers or a list of
> temporaries respectively. So the discussion might be focused on fixed size
> objects.
>
> gael