Re: [eigen] nesting



2010/2/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> 2010/2/7 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>> Do I remember correctly that the need/reason for evaluating products
>> into temporaries arises from the fact that the resulting code
>> (assuming no copies) is more efficient? Isn't this effect now shadowed
>> by the mallocs?
>
> There are other motivations, too. Evaluating large products into
> temporaries, which allows using the cache-friendly product functions
> instead of evaluating them coefficient by coefficient inside a larger
> expression, can be hugely important. The difference can be more than
> 10x. (To make things worse, imagine that the bigger expression can't
> be vectorized.)

Oooh right, but then you were saying that in the case of fixed-size matrices.

I seem to remember that even in that case, we measured that evaluating
into temporaries was beneficial, but I no longer remember the reason.
In the special case where the big expression can't be vectorized, the
benefit is clearer, though.

Benoit
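
For readers following the `(P.transpose() * P).diagonal().sum()` exchange quoted below, here is a library-free sketch of the three ways that value can be computed. It is only an illustration under stated assumptions: `Mat`, `trace_eager`, `trace_lazy` and `sum_of_squares` are hypothetical names, not Eigen types or functions, and P is assumed to be real-valued.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix; a stand-in, not an Eigen type.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> a;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Eager: evaluate all of P^T * P into a temporary, then sum its diagonal.
// Costs about rows * cols^2 multiply-adds plus the temporary's storage.
double trace_eager(const Mat& P) {
    Mat tmp(P.cols, P.cols);
    for (std::size_t i = 0; i < P.cols; ++i)
        for (std::size_t j = 0; j < P.cols; ++j)
            for (std::size_t k = 0; k < P.rows; ++k)
                tmp(i, j) += P(k, i) * P(k, j);
    double s = 0.0;
    for (std::size_t i = 0; i < P.cols; ++i) s += tmp(i, i);
    return s;
}

// Lazy: compute only the diagonal coefficients of P^T * P, with no temporary.
// Costs about rows * cols multiply-adds.
double trace_lazy(const Mat& P) {
    double s = 0.0;
    for (std::size_t i = 0; i < P.cols; ++i)
        for (std::size_t k = 0; k < P.rows; ++k)
            s += P(k, i) * P(k, i);
    return s;
}

// Simplified: the same sum, read off as the sum of squared coefficients of P,
// traversed in storage order.
double sum_of_squares(const Mat& P) {
    double s = 0.0;
    for (double x : P.a) s += x * x;
    return s;
}
```

The three functions return the same value for a real matrix; only the amount of work and the memory traffic differ, which is the point being made about lazy evaluation and about rewriting the expression in terms of coefficients.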

>
> Moreover, it's not like we have to choose between 2 optimizations: we
> should be able to get both simultaneously.
>
>> Regarding the problem in general, we should try really hard to prevent
>> the need for using a syntax like this
>>
>>  (a*b).conjugate().nestByValue().transpose()
>>
>> or related code that requires calling .eval() to prevent crashes.
>> I hope we agree on this?
>
> Sure! That's also one more big reason why your nest-by-value changes
> are so good!
>
>>
>> Currently, I am also a little bit confused about the following code
>> (from the forum):
>>
>>  double p = (P.transpose() * P).diagonal().sum();
>>
>> It is clear that what we can do in 2.0
>>
>>  double p = (P.transpose() * P).lazy().diagonal().sum();
>>
>> leads to far more efficient code than the default expression. This
>> would remain an issue (or rather, a source of improvement), even if
>> the problem at hand with the temporaries were fixed. What I am trying to
>> say is that it might be a nice feature to be able to override the
>> default behavior and deactivate eval-before-nesting.
>
> So, reintroduce .lazy() as an advanced feature for users who know what
> they're doing. Good idea, it doesn't hurt as long as we explain in the
> documentation that users should first try noalias() as it is in most
> cases what they want.
>
> But let's also say that
>
>>  double p = (P.transpose() * P).lazy().diagonal().sum();
>
> can equivalently (and even more efficiently) be done as:
>
>    double p = P.array().abs2().sum();
>
> And this is not an isolated special case. The "diagonal in a product"
> and more generally "some coefficients in a product" expressions
> inherently simplify as expressions in terms of the coefficients.
>
>> Finally, I think I need to update the 3.0 TODO list. Personally, I
>> consider this whole problem a blocking, high-priority issue for 3.0.
>
> I think we all agree on this. Good idea, and anyway I don't think we
> can forget about it now :)
>
> Benoit
>
>>
>> Cheers,
>> - Hauke
>>
>> On Sun, Feb 7, 2010 at 10:26 AM, Gael Guennebaud
>> <gael.guennebaud@xxxxxxxxx> wrote:
>>> Also note that the nest-everything-by-reference approach we have in Eigen
>>> 2.0 also leads to unnecessary copies:
>>>
>>> Indeed since (a*b).adjoint() is compiled as:
>>>
>>> (a*b).conjugate().nestByValue().transpose()
>>>
>>> the result of a*b gets copied once from the Conjugate to the Transpose
>>> expression.
>>>
>>> Finally, let me emphasize again that with either the current devel approach
>>> or the evaluator I described in the previous email, dynamic-sized matrices
>>> can easily be worked around using a kind of shared pointer or a list of
>>> temporaries, respectively. So the discussion might be focused on fixed-size
>>> objects.
>>>
>>> gael
>>
>>
>>
>
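
The nesting problem discussed in the quoted messages above can be illustrated with a toy expression template. This is a sketch only, not Eigen's implementation: `Mat`, `Negate`, `Transpose`, `negate` and `transpose` are hypothetical names, and `Negate` stands in for a lightweight unary expression such as conjugate. It assumes the design where lightweight expressions are nested by value while the plain matrix is still held by reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix; a stand-in, not an Eigen type.
struct Mat {
    std::size_t nrows, ncols;
    std::vector<double> a;
    Mat(std::size_t r, std::size_t c) : nrows(r), ncols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * ncols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * ncols + j]; }
};

// Lightweight expression over a plain matrix. The matrix is held by
// reference, so it must outlive the expression.
struct Negate {
    const Mat& m;
    double operator()(std::size_t i, std::size_t j) const { return -m(i, j); }
};

// Expression over another expression. The nested expression is stored by
// value: copying a Negate copies one reference, so it is cheap, and the
// copy stays valid after the temporary it was made from is destroyed.
template <typename Xpr>
struct Transpose {
    Xpr xpr;
    double operator()(std::size_t i, std::size_t j) const { return xpr(j, i); }
};

inline Negate negate(const Mat& m) { return Negate{m}; }

template <typename Xpr>
Transpose<Xpr> transpose(const Xpr& x) { return Transpose<Xpr>{x}; }
```

With `auto e = transpose(negate(m));` the temporary `Negate` is destroyed at the end of the statement, but `e` keeps its own copy, so `e(0, 1)` is safe to evaluate later. Had `Transpose` stored `const Xpr&` instead, `e` would refer to a dead temporary, which is the kind of crash that calling .eval() or nestByValue() by hand is meant to avoid.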


