Re: [eigen] Proposal: optimal return by value

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Hi,

Sorry for the long delay in replying, I'm still in math-land...

Your idea is very virtuoso and indeed a very nice solution. Certainly
useful. The only risk is to add another layer of template abstraction,
so I think this must be done on a case-by-case basis, where it's worth
it. I think it's worth implementing at least for a few funcs (product,
or inverse(), ...) and see how it works out. I agree to make it a
requirement to honor the eval-before-nesting flag (since as you say it
already is for product). For the rest, only experience can tell, and
by now you don't need me to agree before you do a big change ;) So do
as you feel is right.

Thanks,
Benoit

2009/2/27 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Hi,
>
> many functions cannot be written as an expression and/or need actual storage
> for the result to perform the computation. Such functions include large
> product, cross, unitOrthogonal, inverse, and many others... For most of them
> we currently simply return a plain matrix by value that is not very
> efficient because of the useless memory alloc and copy. The useless copy can
> be avoided using swap() or by automatically calling swap() by "tagging" the
> return type...
>
> However, this solution still suffer from three limitations:
> 1 - useless memory alloc
> 2 - only works if the return type is exactly the same than the destination
> type
> 3 - sometimes the optimal compuation might depend on the target (e.g.,
> storage order)
>
> Let's foo be such a function returning a plain vector by value, for
> instance:
>
> mat.col(i) = foo(...);
>
> could not be optimized this way.
>
> So what you really want is to give "mat.col(i)" to the foo function. One
> solution is a less elegent C like API:
>
> foo_cstyle(..., mat.col(i));
>
> This might be ok sometime, but here is a proposal to get the advantage of
> both methods (ease of use + efficiency). The drawback will be a little more
> effort to write the function foo.
>
> So the idea is to add a generic template wrapper class allowing to
> automatically call the efficient "foo_cstyle" function:
>
> template<typename Func>
> struct wrapper
> {
>   inline wrapper(const Func& f) : func(f) {}
>   Func func;
> };
>
> and then, in MatrixBase (and Matrix) we add the respective operator=:
>
> template<typename Derived>
> template<typename Func>
> Derived& MatrixBase<Derived>::operator=(const ei_recall<Func>& other)
> {
>   other.func(derived());
>   return derived();
> }
>
>
> Finally, foo can be implemented like that:
>
> struct foo_impl
> {
>   template<typename Derived>
>   inline void operator() (MatrixBase<Derived>& res) const
>   {
>
>   }
>
>   Param1 p1;
>   Param2 p2;
>   ....
> };
>
> typedef Matrix<double,1000,1> BMatrix;
>
> wrapper<foo_impl> foo(Param1 p1, Param2, p2, ....)
> {
>   foo_impl res;
>   res.p1 = p1;
>   res.p2 = p2;
>   ...
>   return res;
> }
>
>
> That's it ! Well, not exactly because now what about:
>
> mat.col(i) += 2 * foo(...);
>
> Here the solution is to make wrapper inherit MatrixBase with
> EvalBeforeNestingFlags and a default plain matrix as the nesting type (and
> no coeff() functions to prevent wrong use). The implementation of wrapper
> would become:
>
> template<typename Func, typename EvalType>
> struct ei_traits<wrapper<Func,EvalType> > : ei_traits<EvalType>
> {
>   enum {
>     Flags = ei_traits<EvalType>::Flags | EvalBeforeNestingBit
>   };
> };
>
> template<typename Func, typename EvalType> struct wrapper : public
> MatrixBase<wrapper<Func,EvalType> >
> {
>   EIGEN_GENERIC_PUBLIC_INTERFACE(wrapper)
>   inline wrapper(const Func& f) : m_func(f) {}
>   Func m_func;
> };
>
> here EvalType correspond to the return type of our initial function foo.
> This will only work if we make sure EvalBeforeNestingBit is honored
> everywhere. In particular that means a user function taking as argument a
> generic MatrixBase<Derived> will have to copy the parameter to a
> Derived::Nested local variable. Note that we already have this requirement
> for Product, before if you don't do that while the input is a (large)
> product, then you will have a big performance issue !
>
>
> so,  very useless ? useless ? why not ? useful ? very useful ?
>
> gael.
>
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/