Re: [eigen] nesting


*To*: eigen@xxxxxxxxxxxxxxxxxxx
*Subject*: Re: [eigen] nesting
*From*: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
*Date*: Thu, 4 Feb 2010 23:05:20 +0100

On Thu, Feb 4, 2010 at 8:03 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/2/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>
>> arf indeed, anytime an expression is evaluated at nesting time, a matrix
>> gets stored by value and potentially it can be copied multiple times... I
>> think this is one more argument in favor of a top-down evaluation mechanism.
>> In short, the idea behind this is:
>>
>> 1 - remove all evaluation at nesting time, and simply build the complete
>> expression tree
>> 2 - at evaluation time, you send your complete expression to a template
>> evaluator which will recursively analyze your expression from the top,
>> reorder some subexpression, and evaluate some sub expressions when needed.
>
> As you say below, the main concern with this approach is compilation
> times. With this approach, when we see
>
> A * B * C * D
>
> we really construct a deeply nested product of product of...
>
> By contrast, with the current approach, since we evaluate right away
> before continuing constructing the expression tree. So all we are
> dealing with is products of matrices. Thus, this can limit a lot the
> number of different template evaluations that we make. I regard this
> as a quite nifty optimization and would like to retain it!
>
> The idea that you propose would indeed be necessary if it were
> difficult to know where to introduce temporaries before the whole
> expression is known. But is that the case?

yes I think there are many cases where temporaries or useless evaluations
could be avoided, again:

R = A*B + C*D;

or

(A*B).block()

or

(A*B + C).block()

or

R = s*(A*B + C*D)
(2 temporaries initialized to 0 + 2 calls to gemm + one scalar product)
=>
R = s*A*B; R += s*C*D;
(only 2 calls to gemm)

> When we have an expression
> like:
>
> A + B*C
>
> then it seems to me that the logic is pretty simple: either B*C is an
> outerproduct, in which case we should have neither EvalBeforeNesting
> nor EvalBeforeAssigning, or it is NOT an outer product, in which case
> we should have the 2 bits.

that's more complicated:
- large outer products must keep EvalBeforeNesting for performance reasons
- removing the EvalBeforeNesting bit here has a very very low impact on
  the performances.

Our real problem is the following:

A*B + A1 + A2 + A3 + A4 + A5

creates 5 temporary matrices, the result of A*B is copied 4 times....

gael

>
> Am I missing something? I am especially afraid of being missing
> something about the blas_traits and how you implemented that stuff ---
> you know better than me.
>
> Benoit
>
>> Such an analyzer/evaluator would look like the current ei_blas_traits...
>> Some examples of what could be done with such an approach:
>>
>> (A + B).block() => (A.block() + B.block())
>>
>> E.noalias() += A*B + C*D;
>>
>> =>
>>
>> E.noalias() += A*B;
>> E.noalias() += C*D;
>>
>> This also offers more parallelization opportunities.
>>
>> Sounds good, but of course I'm really scared about compilation times... This
>> is why I did not talk that much about that idea so far.
>>
>> gael.
>>
>>
>> On Thu, Feb 4, 2010 at 2:35 PM, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hi,
>>>
>>> While looking into the "performance degradation" issue from the forum
>>> I found out that it is due to temporaries - as Benoit already guessed.
>>>
>>> I am a little bit afraid, that what I once proposed, namely copying
>>> expressions by value, is now backfiring. The reason is that initially
>>> I assumed expressions to be tiny little objects with close to no copy
>>> costs. The issue is related to those expressions holding temporaries.
>>> Copying them (e.g. a product expression) means copying all the data
>>> including the temporary and that will happen as many times as we nest
>>> expressions.
>>>
>>> The only solution I can think about at the moment is the
>>> specialization of ei_nested for those types and to go back to nesting
>>> by reference for these heavy weight guys.
>>>
>>> - Hauke
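
The copy counts quoted in this message (A*B + A1 + A2 + A3 + A4 + A5 ending up with 5 temporary matrices, the product being copied 4 times) can be reproduced with a toy model. What follows is only a minimal C++ sketch and not Eigen code: `Mat`, `Prod`, `ProdPlus`, `SumByValue`, `SumByRef`, `Result` and `build_chain` are hypothetical names invented for illustration. It assumes that the product is evaluated into a temporary the first time it is nested in a sum, and that every further sum stores its left operand either by value or by reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy square matrix that counts its deep copies (hypothetical; not Eigen's Matrix).
struct Mat {
    static int copies;  // deep copies made since the last reset
    std::size_t n;
    std::vector<double> d;
    Mat(std::size_t n_, double v) : n(n_), d(n_ * n_, v) {}
    Mat(const Mat& o) : n(o.n), d(o.d) { ++copies; }
    double operator()(std::size_t i, std::size_t j) const { return d[i * n + j]; }
};
int Mat::copies = 0;

// Lightweight product expression: holds references only, no data.
struct Prod {
    const Mat& a;
    const Mat& b;
};

// (a*b) + rhs, with the product evaluated at nesting time into a temporary
// that is stored by value inside the expression object.
struct ProdPlus {
    Mat tmp;
    const Mat& rhs;
    ProdPlus(const Prod& p, const Mat& r) : tmp(p.a.n, 0.0), rhs(r) {
        assert(p.a.n == p.b.n && p.a.n == r.n);
        const std::size_t n = p.a.n;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                for (std::size_t k = 0; k < n; ++k)
                    tmp.d[i * n + j] += p.a(i, k) * p.b(k, j);
    }
    double coeff(std::size_t i, std::size_t j) const { return tmp(i, j) + rhs(i, j); }
};

// lhs + rhs with the left expression nested by value: copying this object
// copies everything lhs holds, including any temporary matrix.
template <class L>
struct SumByValue {
    L lhs;
    const Mat& rhs;
    SumByValue(const L& l, const Mat& r) : lhs(l), rhs(r) {}
    double coeff(std::size_t i, std::size_t j) const { return lhs.coeff(i, j) + rhs(i, j); }
};

// Same sum with the left expression nested by reference: no copy is made.
template <class L>
struct SumByRef {
    const L& lhs;
    const Mat& rhs;
    SumByRef(const L& l, const Mat& r) : lhs(l), rhs(r) {}
    double coeff(std::size_t i, std::size_t j) const { return lhs.coeff(i, j) + rhs(i, j); }
};

struct Result {
    int copies;    // deep copies of the product temporary
    double value;  // coefficient (0,0) of the full expression
};

// Builds A*B + A1 + A2 + A3 + A4 + A5 one nesting level at a time.
template <template <class> class Sum>
Result build_chain() {
    Mat A(4, 1.0), B(4, 2.0), Ai(4, 1.0);  // Ai stands in for each of A1..A5
    Mat::copies = 0;
    Prod p = {A, B};
    ProdPlus s1(p, Ai);  // evaluates A*B once, in place
    Sum<ProdPlus> s2(s1, Ai);
    Sum<Sum<ProdPlus> > s3(s2, Ai);
    Sum<Sum<Sum<ProdPlus> > > s4(s3, Ai);
    Sum<Sum<Sum<Sum<ProdPlus> > > > s5(s4, Ai);
    Result r = {Mat::copies, s5.coeff(0, 0)};
    return r;
}
```

In this sketch `build_chain<SumByValue>()` reports 4 deep copies of the product temporary (5 temporaries counting the initial evaluation), while `build_chain<SumByRef>()` reports 0; both evaluate coefficient (0,0) to 13. Nesting by reference is only safe here because every intermediate is a named object that outlives the chain.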

