[eigen] Proposal: optimal return by value

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Hi,

many functions cannot be written as an _expression_ and/or need actual storage for the result to perform the computation. Such functions include large product, cross, unitOrthogonal, inverse, and many others... For most of them we currently simply return a plain matrix by value that is not very efficient because of the useless memory alloc and copy. The useless copy can be avoided using swap() or by automatically calling swap() by "tagging" the return type...

However, this solution still suffer from three limitations:
1 - useless memory alloc
2 - only works if the return type is exactly the same than the destination type
3 - sometimes the optimal compuation might depend on the target (e.g., storage order)

Let's foo be such a function returning a plain vector by value, for instance:

mat.col(i) = foo(...);

could not be optimized this way.

So what you really want is to give "mat.col(i)" to the foo function. One solution is a less elegent C like API:

foo_cstyle(..., mat.col(i));

This might be ok sometime, but here is a proposal to get the advantage of both methods (ease of use + efficiency). The drawback will be a little more effort to write the function foo.

So the idea is to add a generic template wrapper class allowing to automatically call the efficient "foo_cstyle" function:

template<typename Func>
struct wrapper
{
  inline wrapper(const Func& f) : func(f) {}
  Func func;
};

and then, in MatrixBase (and Matrix) we add the respective operator=:

template<typename Derived>
template<typename Func>
Derived& MatrixBase<Derived>::operator=(const ei_recall<Func>& other)
{
  other.func(derived());
  return derived();
}


Finally, foo can be implemented like that:

struct foo_impl
{
  template<typename Derived>
  inline void operator() (MatrixBase<Derived>& res) const
  {
   
  }

  Param1 p1;
  Param2 p2;
  ....
};

typedef Matrix<double,1000,1> BMatrix;

wrapper<foo_impl> foo(Param1 p1, Param2, p2, ....)
{
  foo_impl res;
  res.p1 = p1;
  res.p2 = p2;
  ...
  return res;
}


That's it ! Well, not exactly because now what about:

mat.col(i) += 2 * foo(...);

Here the solution is to make wrapper inherit MatrixBase with EvalBeforeNestingFlags and a default plain matrix as the nesting type (and no coeff() functions to prevent wrong use). The implementation of wrapper would become:

template<typename Func, typename EvalType>
struct ei_traits<wrapper<Func,EvalType> > : ei_traits<EvalType>
{
  enum {
    Flags = ei_traits<EvalType>::Flags | EvalBeforeNestingBit
  };
};

template<typename Func, typename EvalType> struct wrapper : public MatrixBase<wrapper<Func,EvalType> >
{
  EIGEN_GENERIC_PUBLIC_INTERFACE(wrapper)
  inline wrapper(const Func& f) : m_func(f) {}
  Func m_func;
};

here EvalType correspond to the return type of our initial function foo. This will only work if we make sure EvalBeforeNestingBit is honored everywhere. In particular that means a user function taking as argument a generic MatrixBase<Derived> will have to copy the parameter to a Derived::Nested local variable. Note that we already have this requirement for Product, before if you don't do that while the input is a (large) product, then you will have a big performance issue !


so,  very useless ? useless ? why not ? useful ? very useful ?

gael.



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/