2009/8/15 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> On Sat, Aug 15, 2009 at 10:48 PM, Benoit Jacob<jacob.benoit.1@xxxxxxxxx> wrote:
>> Mostly, indeed. Initially we didn't know to which expressions it would
>> apply. Indeed it turns out that it's almost only product expressions.
>> There are exceptions too: at least the Random expression (more
>> generally any non-repeatable nullaryExpr, but afaik it's the only one
>> so far).
>
> hm, right I forgot about that, but Random only has the
> EvalBeforeNestingBit flag, so do we still need a way to remove that
> flag? If so .lazy() would be perfect but if we keep .lazy() for that
> purpose we'll have strong backward compatibility issues. Unless we
> overload .lazy() in CwiseNullaryOp...

Oh I wasn't advocating keeping lazy(), even for that. Probably nobody
cares that much about the performance of random matrices. The use case
is very niche. If by now we haven't found a better use case, that
probably means that it's OK to remove lazy().

>> I don't really understand the difference between c1 and c2, indeed
>> lazy() used to do 2 things, remove EvalBeforeAssigning and
>> EvalBeforeNesting flags, i guess that's what you meant. Anyway it
>> doesn't matter.
>
> yes that's what I meant, and basically .lazy() would be good at
> removing EvalBeforeNestingBit only (e.g., for Random) and .noalias()
> to bypass EvalBeforeAssigning. Our mistake was to use .lazy() for
> both.

OK. You decide if you want to keep such a lazy() or remove it
altogether (keeping it deprecated for now), I don't have a strong
preference.

>>> So to summarize, I'd be in favor of removing .lazy(), replacing the
>>> EvalBeforeAssignBit flag by a MightAliasBit flag, and adding a no-alias
>>> mechanism on the result side.
>>
>> OK so a.noalias() = xpr would be the way to obtain what you called the
>> "optimal" solution, right?
>
> hm, not really because:
>
> D.noalias() = A*B + C;
>
> it is the same as:
>
> D.lazyAssign( (A*B) + C);
>
> which is the same as:
>
> D = (A*B).eval() + C;
>
> But here you can avoid a temporary and still have an efficient
> evaluation of the product by doing:
>
> D = C;
> D.noalias() += A*B;
>
> (best for dynamic sizes)
>
> or:
>
> D.noalias() += A*B;
> D += C;
>
> (best for small fixed sizes)

Hm. Rather fascinating! And do you have a plan to make Eigen decide
that automatically?

>> But I have a question. since a+=b expands to a=a+b, as soon as b has
>> the EvalBeforeNestingBit, there can be no aliasing problem. b will
>> just be evaluated into a temporary c which won't have any
>> EvalBefore... bit, and from there on we have a=a+c which will be
>> lazy. On the other hand, noalias() allows to ignore
>> EvalBeforeAssigning on the sum expression. So basically the only case
>> where noalias() might be useful in a.noalias()+=b, is a case where b
>> has EvalBeforeAssigning and NOT EvalBeforeNesting. I don't think
>> there's any example of that??? So basically i'm saying that I don't
>> see any use case for a.noalias()+=b. Do you see any?
>
> this is because we have overloads for += <product> and -= <product> so
> that you can have:
>
> c += (a*b).lazy()
>
> oops, now you should rather write:
>
> c.noalias() += a*b;
>
> and evaluate it without temporary because this is what the underlying
> optimized routine does.

I see. So you're saying that in this case, a.noalias()+=b is not
equivalent to a.noalias()=a+b, it also has the effect of ignoring the
EvalBeforeNesting bit on b? And that's not a general rule but rather a
property of the underlying optimized routine?

I'm OK with this behavior but I think that it needs to be a little
more predictable. Perhaps it could be explicitly said that
a.noalias()+=b assumes that there is no aliasing between a and b.
In this way, the decision of evaluating b or not becomes purely an
optimization issue and we're free to do what we want in the backyard
:)

Benoit

>
> And yes, if you write:
>
> c = c + a*b;
>
> then you lose this optimization, and a*b is evaluated into a temporary.
>
> However without noalias():
>
> c += a*b;
>
> is the same as:
>
> c += (a*b).eval();
>
>
> Perhaps I should also say that:
>
> 2 * (a * b)
>
> actually returns a product expression, and not a CwiseUnaryOp because
> the optimized routine performs this scaling anyway.
>
> ah, and another detail: when you write c.noalias() = a*b (with c
> large enough), it is actually evaluated as:
>
> c.setZero();
> c.noalias() += 1 * a * b;
>
>
> Gael
>
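
For readers following the aliasing argument, here is a minimal
stand-alone sketch of why a product is evaluated into a temporary by
default, and what a no-alias promise buys. It is not Eigen code: Mat2,
product_noalias, product_eval_first and add_product_noalias are
hypothetical names made up for this illustration, and the 2x2 size is
only to keep it short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 2x2 matrix type, for illustration only (not an Eigen type).
using Mat2 = std::array<std::array<double, 2>, 2>;

// Writes a*b straight into dst, one coefficient at a time, with no
// temporary. Only correct if dst aliases neither a nor b, which is the
// promise a no-alias mechanism asks the caller to make.
void product_noalias(Mat2& dst, const Mat2& a, const Mat2& b) {
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 0; j < 2; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < 2; ++k) s += a[i][k] * b[k][j];
            dst[i][j] = s;  // may overwrite an input that is still needed
        }
    }
}

// Evaluates a*b into a temporary first and copies afterwards, so the
// result is correct even when dst is the same object as a or b.
void product_eval_first(Mat2& dst, const Mat2& a, const Mat2& b) {
    Mat2 tmp{};
    product_noalias(tmp, a, b);
    dst = tmp;
}

// Accumulates a*b into dst without a temporary (dst += a*b), again
// assuming dst aliases neither a nor b.
void add_product_noalias(Mat2& dst, const Mat2& a, const Mat2& b) {
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 0; j < 2; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < 2; ++k) s += a[i][k] * b[k][j];
            dst[i][j] += s;
        }
    }
}
```

With these helpers, the "D = C; then accumulate the product" pattern
quoted above corresponds to copying C into D and calling
add_product_noalias(D, A, B): no temporary is needed as long as D is a
distinct object from A and B. Calling product_noalias(A, A, B), on the
other hand, overwrites coefficients of A while they are still being
read, which is the failure the default temporary protects against.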

