Re: [eigen] nesting

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


I might have an idea to fix the issue without introducing a complex topdown evaluator.

Let's first consider dynamic sized matrices only. For them we could create a Temporary class inheriting DenseMatrixBase which would behave like a shared pointer with automatic deletion. So instead of copying the whole matrix data, you would only copy the sizes, the pointer and increment a counter. Get it?

Now, of course this concept does not work with fixed size objetcs. For them you would have to nest them in a kind of shared-pointer-_expression_, and so create the matrix objects using operator new... that is even worse :(

So unless someone see how to solve this latter issue, what Hauke is trying to do could work. But instead of specialzing ei_ref_selector for all expressions I would suggest a different approach using new bit Flags. The rules would be as follow:
- if the expr has the NestByReference flag, then nest by reference, otherwise nest by value
- by default, only Matrix and Array would have the NestByReference flag
- the NestByReference flag is not inherited
- an _expression_ nesting a temporary would have the NestParentByReference flag
- the NestParentByReference flag is not inherited, but if one of the children of an _expression_ has the NestParentByReference flag, then the current _expression_ will have the NestByReference flag. If we make sure that NestParentByReferenceBit == NestByReference>>1, then we simply have to do: Flags = ... | ((Lhs::Flags|Rhs::Flags)&NestParentByReference) << 1), or we could also use a macro: Flags = ... | EIGEN_PROPAGATE_NESTING_BIT(Lhs::Flags|Rhs::Flags).

As an implication, we could get rid of ei_ref_selector. Everything will be done by ei_nesting and the ei_traits, so fewer template instanciations.

To detect if a child generates a temporary, ei_nesting could expose an Evaluate boolean.

gael.

On Thu, Feb 4, 2010 at 11:35 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2010/2/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> On Thu, Feb 4, 2010 at 11:22 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> 2010/2/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>> On Thu, Feb 4, 2010 at 11:11 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>> Just one thing that I don't follow:
>>>>
>>>> 2010/2/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>>>> Our real problem is the following:
>>>>>
>>>>> A*B + A1 + A2 + A3 + A4 + A5
>>>>>
>>>>> creates 5 temporary matrices, the result of A*B is copied 4 times....
>>>>
>>>> Why is it so? After A*B has evaluated into a temporary matrix, isn't
>>>> it the same as
>>>>
>>>> tmp + A1 + A2 + A3 + A4 + A5
>>>>
>>>> ? That doesn't evaluate at every step ...!
>>>
>>> this is the problem that Hauke is talking about: since we nest by
>>> value, tmp is stored in the _expression_ of tmp + A1, so sizeof(tmp+A1)
>>> = big, so does (tmp+A1)+A2, etc. I did not check but I think that the
>>> way it works.
>>>
>>> on the other hand if you write:
>>>
>>> (A*B).eval() + A1 + A2 + A3 + A4 + A5
>>>
>>> then it is fine fine because the temporary is explicitly created on
>>> the stack and stored by reference by the binary expressions...
>>
>> Ah OK and can't we make an exception to the rule that we nest
>> expressions by value? Couldn't we say: we nest by value EXCEPT plain
>> matrices which we nest by reference? Then nesting by value an
>> _expression_ referring it would just copy a reference.
>
> we cannot store implicit temporaries by reference, it has to live
> throughout the _expression_, and so it has to be stored by the returned
> _expression_.

ok, some day I should learn to be able to decide that myself... :(

then I'm starting to understand how the change that you are proposing
here is an inevitable consequence of the new nest-by-value design. I'm
crossing fingers that it turns out to be a change for the better!
(that is, not too bad compilation times).

Benoit


>
> gael.
>
>>
>> I usually say stupid things when I'm talking about references and
>> lifetime of temporaries :)
>>
>> Benoit
>>
>>
>>
>>>
>>> gael
>>>
>>>
>>>> Benoit
>>>>
>>>>>
>>>>> gael
>>>>>
>>>>>>
>>>>>> Am I missing something? I am especially afraid of being missing
>>>>>> something about the blas_traits and how you implemented that stuff ---
>>>>>> you know better than me.
>>>>>>
>>>>>> Benoit
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Such an analyzer/evaluator would look like the current ei_blas_traits...
>>>>>>> Some examples of what could be done with such an approach:
>>>>>>>
>>>>>>> (A + B).block() => (A.block() + B.block())
>>>>>>>
>>>>>>> E.noalias() += A*B + C*D;
>>>>>>>
>>>>>>> =>
>>>>>>>
>>>>>>> E.noalias() += A*B;
>>>>>>> E.noalias() += C*D;
>>>>>>>
>>>>>>> This also offers more parallelization opportunities.
>>>>>>>
>>>>>>> Sounds good, but of course I'm really scared about compilation times... This
>>>>>>> is why I did not talk that much about that idea so far.
>>>>>>>
>>>>>>> gael.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 4, 2010 at 2:35 PM, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> While looking into the "performance degradation" issue from the forum
>>>>>>>> I found out that it is due to temporaries - as Benoit already guessed.
>>>>>>>>
>>>>>>>> I am a little bit afraid, that what I once proposed, namely copying
>>>>>>>> expressions by value, is now backfiring. The reason is that initially
>>>>>>>> I assumed expressions to be tiny little objects with close to no copy
>>>>>>>> costs. The issue is related to those expressions holding temporaries.
>>>>>>>> Copying them (e.g. a product _expression_) means copying all the data
>>>>>>>> including the temporary and that will happen as many times as we nest
>>>>>>>> expressions.
>>>>>>>>
>>>>>>>> The only solution I can think about at the moment is the
>>>>>>>> specialization of ei_nested for those types and to go back to nesting
>>>>>>>> by reference for these heavy weight guys.
>>>>>>>>
>>>>>>>> - Hauke
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>







Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/