Re: [eigen] nesting


On Sun, Feb 7, 2010 at 4:09 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2010/2/7 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
> Do I remember correctly that the need/reason for evaluating products
> into temporaries arises from the fact that the resulting code
> (assuming no copies) is more efficient? Isn't this effect now shadowed
> by the mallocs?

There are other motivations, too. Evaluating large products into
temporaries lets us use the cache-friendly product functions instead of
evaluating them coefficient by coefficient inside a larger
expression, and that can be hugely important: the difference can be more
than 10x. (To make things worse, imagine that the bigger expression
can't be vectorized.)

Moreover, it's not like we have to choose between 2 optimizations: we
should be able to get both simultaneously.

> Regarding the problem in general, we should try really hard to prevent
> the need for using a syntax like this
>  (a*b).conjugate().nestByValue().transpose()
> or related code that requires calling .eval() in order to prevent crashes.
> I hope we agree on this?

Sure! That's also one more big reason why your nest-by-value changes
are so good!
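
To make the nesting idea concrete, here is a minimal sketch in plain C++ (not Eigen code). The names Vec, Nested, Negate, Scale, makeNegate and makeScale are hypothetical and only serve the illustration. It assumes the rule "plain storage is nested by reference, lightweight expressions are nested by value", so that a chained expression never keeps a reference to a temporary expression object.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Plain storage object: expensive to copy, so expressions refer to it.
struct Vec {
    std::vector<double> data;
    double operator[](std::size_t i) const { assert(i < data.size()); return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Hypothetical trait deciding how an operand is stored inside an expression.
// Default: lightweight expression objects are stored by value.
template <typename T>
struct Nested { using type = T; };

// Plain storage is stored by const reference instead.
template <>
struct Nested<Vec> { using type = const Vec&; };

// Expression: coefficient-wise negation of its operand.
template <typename E>
struct Negate {
    typename Nested<E>::type arg;
    double operator[](std::size_t i) const { return -arg[i]; }
    std::size_t size() const { return arg.size(); }
};

// Expression: operand multiplied by a scalar factor.
template <typename E>
struct Scale {
    typename Nested<E>::type arg;
    double factor;
    double operator[](std::size_t i) const { return factor * arg[i]; }
    std::size_t size() const { return arg.size(); }
};

template <typename E>
Negate<E> makeNegate(const E& e) { return Negate<E>{e}; }

template <typename E>
Scale<E> makeScale(const E& e, double f) { return Scale<E>{e, f}; }

// The storage type is nested by reference, the expression type by value.
static_assert(std::is_reference<Nested<Vec>::type>::value,
              "Vec is nested by reference");
static_assert(!std::is_reference<Nested<Negate<Vec> >::type>::value,
              "Negate<Vec> is nested by value");
```

With this rule, `auto e = makeScale(makeNegate(v), 2.0);` stays valid after the statement ends: the temporary Negate<Vec> is copied into the Scale object, and the only reference left points to v itself. No explicit nestByValue() or eval() call is needed at the call site.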

> Currently, I am also a little bit confused about the following code
> (from the forum):
>  double p = (P.transpose() * P).diagonal().sum();
> It is clear that what we can do in 2.0
>  double p = (P.transpose() * P).lazy().diagonal().sum();
> leads to way more efficient code than the default expression. This
> would remain an issue (or rather, a source of improvements), even if the
> problem at hand with the temporaries were fixed. What I am trying to
> say is that it might be a nice feature to be able to override the
> default behavior and deactivate eval-before-nesting.

So, reintroduce .lazy() as an advanced feature for users who know what
they're doing. Good idea, it doesn't hurt as long as we explain in the
documentation that users should first try noalias() as it is in most
cases what they want.

I agree too, though instead of reintroducing .lazy(), I would rather add an a.lazyProduct(b) function: it is more explicit and easier to use (no "where do I have to put the .lazy()?" question).
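
To illustrate why a lazy product pays off when only the diagonal is needed, here is a rough sketch in plain C++ (it deliberately does not use Eigen). Mat, traceEager and traceLazy are hypothetical names made up for this example. It simply counts scalar multiplications for the two strategies; it says nothing about cache effects or vectorization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal dense row-major matrix; a stand-in for illustration, not an Eigen type.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
    double operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
};

// Eager strategy: evaluate the whole product P^T * P into a temporary,
// then sum its diagonal. `mults` counts scalar multiplications.
double traceEager(const Mat& P, std::size_t& mults) {
    const std::size_t n = P.cols;
    Mat T(n, n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < P.rows; ++k) {
                s += P(k, i) * P(k, j);
                ++mults;
            }
            T(i, j) = s;
        }
    }
    double tr = 0.0;
    for (std::size_t i = 0; i < n; ++i) tr += T(i, i);
    return tr;
}

// Lazy strategy: compute only the coefficients that are actually requested,
// here the diagonal entries (P^T * P)(i, i).
double traceLazy(const Mat& P, std::size_t& mults) {
    double tr = 0.0;
    for (std::size_t i = 0; i < P.cols; ++i) {
        for (std::size_t k = 0; k < P.rows; ++k) {
            tr += P(k, i) * P(k, i);
            ++mults;
        }
    }
    return tr;
}
```

For a P with r rows and n columns, the eager version performs n*n*r multiplications and allocates an n-by-n temporary, while the lazy version performs n*r multiplications and allocates nothing. Both return the same value.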


But let's also say that

>  double p = (P.transpose() * P).lazy().diagonal().sum();

can equivalently (and even more efficiently) be done as:

   double p = P.array().abs2().sum();

And this is not an isolated special case. The "diagonal in a product"
and more generally "some coefficients in a product" expressions
inherently simplify as expressions in terms of the coefficients.
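
For this particular case the simplification can be written out as follows, assuming P has real coefficients (as the double result suggests):

```latex
\[
\operatorname{tr}(P^{T}P)
  = \sum_{i} (P^{T}P)_{ii}
  = \sum_{i}\sum_{k} (P^{T})_{ik}\,P_{ki}
  = \sum_{i}\sum_{k} P_{ki}^{2}
\]
```

The right-hand side is the sum of the squared coefficients of P, which is what P.array().abs2().sum() computes. For complex P the two sides would only agree with the adjoint in place of the transpose, since abs2 takes the squared modulus.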

> Finally, I think I need to update the 3.0 TODO list. Personally, I
> consider this whole problem a blocking, high-priority issue for 3.0.

I think we all agree on this. Good idea, and anyway I don't think we
can forget about it now :)


> Cheers,
> - Hauke
> On Sun, Feb 7, 2010 at 10:26 AM, Gael Guennebaud
> <gael.guennebaud@xxxxxxxxxx> wrote:
>> Also note that the nest-everything-by-reference approach we have in Eigen
>> 2.0 also leads to unnecessary copies.
>> Indeed, since (a*b).adjoint() is compiled as:
>> (a*b).conjugate().nestByValue().transpose()
>> the result of a*b gets copied once from the Conjugate to the Transpose
>> expression.
>> Finally, let me emphasize again that with either the current devel approach
>> or the evaluator I described in the previous email, dynamic-sized matrices
>> can easily be worked around using some kind of shared pointers or a list of
>> temporaries, respectively. So the discussion might be focused on fixed-size
>> objects.
>> gael
