- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] nesting
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Thu, 4 Feb 2010 14:03:23 -0500
2010/2/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> arf indeed, anytime an expression is evaluated at nesting time, a matrix
> gets stored by value and potentially it can be copied multiple times... I
> think this is one more argument in favor of a top-down evaluation mechanism.
> In short, the idea behind this is:
> 1 - remove all evaluation at nesting time, and simply build the complete
> expression tree
> 2 - at evaluation time, you send your complete expression to a template
> evaluator which will recursively analyze your expression from the top,
> reorder some subexpressions, and evaluate some subexpressions when needed.
As you say below, the main concern with this approach is compilation
times. With this approach, when we see
A * B * C * D
we really construct a deeply nested product of product of...
By contrast, with the current approach, we evaluate right away before
continuing to construct the expression tree, so all we are ever
dealing with is products of matrices. This can greatly limit the
number of different template evaluations that we make. I regard this
as a quite nifty optimization and would like to retain it!
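To make the compile-time concern concrete, here is a minimal toy sketch (not Eigen code). Matrix, Product, Depth, lazy_mul and eager_mul are hypothetical names invented for this illustration. It only shows how the product type grows with the chain when nothing is evaluated at nesting time, and how it stays fixed when every operand is evaluated first.

```cpp
#include <cassert>

struct Matrix {};                        // toy stand-in for a plain dense matrix

template <typename Lhs, typename Rhs>
struct Product {};                       // toy product expression node

constexpr int max_int(int a, int b) { return a > b ? a : b; }

// Compile-time nesting depth of a product type.
template <typename T>
struct Depth { static constexpr int value = 0; };
template <typename L, typename R>
struct Depth<Product<L, R>> {
    static constexpr int value = 1 + max_int(Depth<L>::value, Depth<R>::value);
};

// Top-down style: operands are nested as they are, so the type records the whole chain.
template <typename Lhs, typename Rhs>
Product<Lhs, Rhs> lazy_mul(const Lhs&, const Rhs&) { return {}; }

// Current style (assumed here): operands are evaluated to a Matrix before nesting,
// so every product has the same type whatever the length of the chain.
template <typename Lhs, typename Rhs>
Product<Matrix, Matrix> eager_mul(const Lhs&, const Rhs&) { return {}; }

inline int lazy_chain_depth() {
    Matrix A, B, C, D;
    auto e = lazy_mul(lazy_mul(lazy_mul(A, B), C), D);   // A * B * C * D
    (void)e;
    return Depth<decltype(e)>::value;
}

inline int eager_chain_depth() {
    Matrix A, B, C, D;
    auto e = eager_mul(eager_mul(eager_mul(A, B), C), D);
    (void)e;
    return Depth<decltype(e)>::value;
}
```

In the lazy version, each extra factor produces a new Product<...> type that any evaluator template would have to be instantiated for. In the eager version only Product<Matrix, Matrix> ever appears.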
The idea that you propose would indeed be necessary if it were
difficult to know where to introduce temporaries before the whole
expression is known. But is that the case? When we have an expression
A + B*C
then it seems to me that the logic is pretty simple: either B*C is an
outerproduct, in which case we should have neither EvalBeforeNesting
nor EvalBeforeAssigning, or it is NOT an outer product, in which case
we should have the 2 bits.
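A minimal sketch of that local rule, assuming toy types: Mat, Product, the flag values, and the aliases OuterProduct and GeneralProduct are hypothetical and are not Eigen's actual traits. The point is only that the two bits can be chosen from the operand shapes of B*C alone, without seeing the enclosing A + ... expression.

```cpp
#include <cassert>

// Toy flag bits; the names follow this email, the values are arbitrary.
constexpr unsigned EvalBeforeNesting   = 0x1;
constexpr unsigned EvalBeforeAssigning = 0x2;

// Toy fixed-size matrix type that only carries its dimensions.
template <int Rows, int Cols>
struct Mat {
    static constexpr int rows = Rows;
    static constexpr int cols = Cols;
};

// Toy product expression: the flags depend only on the operand shapes.
template <typename Lhs, typename Rhs>
struct Product {
    static_assert(Lhs::cols == Rhs::rows, "inner dimensions must agree");
    // An outer product is (n x 1) * (1 x m): the inner dimension is 1.
    static constexpr bool is_outer = (Lhs::cols == 1);
    static constexpr unsigned flags =
        is_outer ? 0u : (EvalBeforeNesting | EvalBeforeAssigning);
};

using OuterProduct   = Product<Mat<4, 1>, Mat<1, 4>>;   // column vector times row vector
using GeneralProduct = Product<Mat<4, 4>, Mat<4, 4>>;   // ordinary matrix product
```

Here OuterProduct::flags is 0 and GeneralProduct::flags has both bits set, which is the case split described above.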
Am I missing something? I am especially afraid that I am missing
something about the blas_traits and how you implemented that stuff;
you know it better than I do.
> Such an analyzer/evaluator would look like the current ei_blas_traits...
> Some examples of what could be done with such an approach:
> (A + B).block() => (A.block() + B.block())
> E.noalias() += A*B + C*D;  =>
> E.noalias() += A*B;
> E.noalias() += C*D;
> This also offers more parallelization opportunities.
> Sounds good, but of course I'm really scared about compilation times... This
> is why I did not talk that much about that idea so far.
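The second rewrite quoted above can be checked on a toy example. The sketch below uses plain 2x2 arrays rather than Eigen types, and Mat2, add_product, sum_with_temporary and split_accumulate are hypothetical names. It compares evaluating A*B + C*D into a temporary with accumulating the two products straight into E.

```cpp
#include <array>
#include <cassert>

using Mat2 = std::array<double, 4>;      // toy row-major 2x2 matrix

// E += A*B, accumulated directly into E without a temporary for the product.
inline void add_product(Mat2& E, const Mat2& A, const Mat2& B) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                E[2 * i + j] += A[2 * i + k] * B[2 * k + j];
}

// Unsplit form: build T = A*B + C*D in a temporary, then E += T.
inline Mat2 sum_with_temporary(Mat2 E, const Mat2& A, const Mat2& B,
                               const Mat2& C, const Mat2& D) {
    Mat2 T{};                            // explicit temporary, zero-initialized
    add_product(T, A, B);
    add_product(T, C, D);
    for (int i = 0; i < 4; ++i) E[i] += T[i];
    return E;
}

// Split form: two accumulating products into E, no temporary for the sum.
inline Mat2 split_accumulate(Mat2 E, const Mat2& A, const Mat2& B,
                             const Mat2& C, const Mat2& D) {
    add_product(E, A, B);
    add_product(E, C, D);
    return E;
}
```

With small integer-valued inputs both forms give identical results. With general floating-point values the different summation order can change the last bits.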
> On Thu, Feb 4, 2010 at 2:35 PM, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
>> While looking into the "performance degradation" issue from the forum
>> I found out that it is due to temporaries - as Benoit already guessed.
>> I am a little bit afraid that what I once proposed, namely copying
>> expressions by value, is now backfiring. The reason is that initially
>> I assumed expressions to be tiny little objects with close to no copy
>> costs. The issue is related to those expressions holding temporaries.
>> Copying them (e.g. a product expression) means copying all the data
>> including the temporary and that will happen as many times as we nest.
>> The only solution I can think about at the moment is the
>> specialization of ei_nested for those types and to go back to nesting
>> by reference for these heavy weight guys.
>> - Hauke
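The cost described above can be reproduced with a toy model. None of this is Eigen code: Matrix, ProductExpr, WrapByValue, WrapByRef and the copy counter are hypothetical. It assumes a product expression that stores its evaluated result, and counts the deep copies made when that expression is nested three levels deep by value versus by reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

static int g_copies = 0;                 // number of deep copies of matrix data

struct Matrix {
    std::vector<double> data;
    explicit Matrix(std::size_t n) : data(n * n) {}
    Matrix(const Matrix& other) : data(other.data) { ++g_copies; }
};

// Toy product expression that holds its evaluated temporary by value.
struct ProductExpr {
    Matrix result;
    explicit ProductExpr(std::size_t n) : result(n) {}
};

// Wrapper expression that nests its operand by value: copying it copies the temporary.
template <typename Nested>
struct WrapByValue {
    Nested nested;
    explicit WrapByValue(const Nested& n) : nested(n) {}
};

// Wrapper expression that nests its operand by reference: no data is copied.
template <typename Nested>
struct WrapByRef {
    const Nested& nested;
    explicit WrapByRef(const Nested& n) : nested(n) {}
};

inline int copies_when_nesting_by_value() {
    g_copies = 0;
    ProductExpr p(64);
    WrapByValue<ProductExpr> w1(p);
    WrapByValue<WrapByValue<ProductExpr>> w2(w1);
    WrapByValue<WrapByValue<WrapByValue<ProductExpr>>> w3(w2);
    (void)w3;
    return g_copies;                     // one deep copy per nesting level
}

inline int copies_when_nesting_by_reference() {
    g_copies = 0;
    ProductExpr p(64);
    WrapByRef<ProductExpr> w1(p);
    WrapByRef<WrapByRef<ProductExpr>> w2(w1);
    WrapByRef<WrapByRef<WrapByRef<ProductExpr>>> w3(w2);
    (void)w3;
    return g_copies;                     // the temporary is never copied
}
```

In this model the by-value version copies the 64x64 temporary once per nesting level, while the by-reference version copies nothing. This is the trade-off behind specializing the nesting rule for heavyweight expressions.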