Re: [eigen] Expression Template Question



Yep, that did the trick. The 'virtual' was a leftover from me doing it the naive way the first time.

Thanks. Oh, and I didn't know about the publications page. I'll look into it!

Dalon

On 10/21/2014 01:08 AM, Gael Guennebaud wrote:
This is because you declared operator[] as virtual. You have to rely on static polymorphism to enable inlining and full optimization, and this is in fact what you are already doing here:

static_cast<OtherDerived const &>(B).operator[](i)

Adding a derived() method wrapping the static_cast will make the code a bit easier to read.
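The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the original poster's code and not Eigen's implementation: the std::vector storage, the size() members, and the free-function operator+ are assumptions made to keep the example self-contained. The key points are that no member is virtual and that the static_cast lives in a single derived() helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: no virtual functions, so every call is resolved at compile time.
template <typename T, typename Derived>
class ArrayBase {
public:
    // Wraps the static_cast in one place, as suggested in the reply.
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
    Derived& derived() { return static_cast<Derived&>(*this); }
};

// Expression node for lhs + rhs; it stores references and computes lazily.
template <typename T, typename Lhs, typename Rhs>
class AA_Add : public ArrayBase<T, AA_Add<T, Lhs, Rhs> > {
    const Lhs& lhs_;
    const Rhs& rhs_;
public:
    AA_Add(const Lhs& lhs, const Rhs& rhs) : lhs_(lhs), rhs_(rhs) {}
    std::size_t size() const { return rhs_.size(); }
    // Non-virtual, so the compiler is free to inline the whole chain.
    T operator[](std::size_t i) const { return lhs_[i] + rhs_[i]; }
};

// Concrete array; std::vector storage is an assumption for this sketch.
template <typename T>
class Array1d : public ArrayBase<T, Array1d<T> > {
    std::vector<T> data_;
public:
    explicit Array1d(std::size_t n, T value = T()) : data_(n, value) {}
    std::size_t size() const { return data_.size(); }
    T operator[](std::size_t i) const { return data_[i]; }
    T& operator[](std::size_t i) { return data_[i]; }

    // Evaluates any expression element by element in a single loop.
    template <typename OtherDerived>
    Array1d& operator=(const ArrayBase<T, OtherDerived>& other) {
        const OtherDerived& src = other.derived();
        assert(src.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = src[i];
        return *this;
    }
};

// Builds an expression node instead of computing a temporary array.
template <typename T, typename Lhs, typename Rhs>
AA_Add<T, Lhs, Rhs> operator+(const ArrayBase<T, Lhs>& lhs,
                              const ArrayBase<T, Rhs>& rhs) {
    return AA_Add<T, Lhs, Rhs>(lhs.derived(), rhs.derived());
}
```

With these definitions, a statement such as a = b + c + d; on four Array1d<double> objects of equal size goes through the single loop in operator=. The expression nodes hold references, so they should only be used within the full expression that creates them.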

By the way, I guess you have already seen that we have some material explaining expression templates at http://eigen.tuxfamily.org/index.php?title=Publications. The demo.tgz file includes a self-contained ET example that is as small as possible.

cheers,
Gael

On Tue, Oct 21, 2014 at 4:30 AM, Dalon Work <dwwork@xxxxxxxxx> wrote:
I'm teaching myself the principles of how Eigen works by writing a small series of classes to copy its easier features, and have run into an issue that I'm not sure how to resolve. I am hoping for a little help understanding what is happening, and thought I should ask the experts.

I have an ArrayBase class defined as:

template <typename T, typename Derived> class ArrayBase;

which contains the following assignment operator:

template<typename OtherDerived>
inline const Derived& operator = (const ArrayBase<T,OtherDerived> &B)
{
  for(int i=0;i<_size;i++){
    static_cast<Derived*>(this)->operator[](i) =
      static_cast<OtherDerived const &>(B).operator[](i);
  }
  return static_cast<Derived const &>(*this);
}


and addition operator:

template <typename OtherDerived>
inline const AA_Add<T,Derived,OtherDerived > operator + (const OtherDerived &rhs) const

where the expression template class is defined as:

template<typename T,typename Lhs,typename Rhs>
class AA_Add : public ArrayBase<T,AA_Add<T,Lhs,Rhs> >
{
  protected:
  const Lhs &_lhs;
  const Rhs &_rhs;

  public:
  inline AA_Add(const Lhs &lhs,const Rhs &rhs): _lhs(lhs), _rhs(rhs)
  {
    this->_size = rhs.size();
  }
  inline AA_Add(const AA_Add &A) : _lhs(A._lhs), _rhs(A._rhs){
    this->_size = A._size;
  }

  inline virtual const T operator [] (const unsigned &i) const{
    return _lhs[i] + _rhs[i];
  }
};

The derived class is defined as:

template <typename T> class Array1d : public ArrayBase<T,Array1d<T> >

And has assignment operator:

template<typename OtherDerived>
inline const Derived& operator = (const ArrayBase<T,OtherDerived> &B){
  return ArrayBase<T,Derived >::operator = (B);
}

and an indexing operator, similar to the addition class:
inline virtual const T operator [] (const unsigned &i) const
{
  return this->_A[i];
}

Sorry for the large chunks of code, but I wanted to include all the important features here. The issue is not with the code per se, because it actually works and allows addition operators to be chained through the [] operator. The problem is that it doesn't seem to be inlining anything. I was expecting to get a loop like

for(int i=0;i<_size;i++){
 A[i] = B[i] + C[i] + D[i] + ...;
}

Not knowing how else to check that it's working, I benchmarked the two and found, to my great surprise, that the operator overloading leads to runtimes ~10x as long as the above loop. That's with g++ and the -O3 flag.
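A timing comparison of this kind can be sketched as below. The helper name best_time_seconds, the repeat count, and the use of std::vector for the reference loop are illustrative assumptions, not the code that was actually benchmarked. Taking the best of several runs reduces noise from the first, cache-cold pass.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs f() `repeats` times and returns the
// shortest wall-clock time observed, in seconds.
template <typename F>
double best_time_seconds(F&& f, int repeats = 5) {
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        f();
        const auto stop = std::chrono::steady_clock::now();
        const double s = std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || s < best) best = s;
    }
    return best;
}

// Hand-written reference loop that the expression-template version
// is expected to match once everything is inlined.
inline void add3_loop(std::vector<double>& a,
                      const std::vector<double>& b,
                      const std::vector<double>& c,
                      const std::vector<double>& d) {
    for (std::size_t i = 0; i < a.size(); ++i) a[i] = b[i] + c[i] + d[i];
}
```

Passing a lambda that calls add3_loop, and a second one that performs the overloaded-operator assignment on the same data, gives two numbers to compare. The results should be read or printed afterwards so the optimizer cannot discard the work. Another way to check inlining is to have g++ emit assembly with -O3 -S and look for remaining call instructions in the loop.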

I am not sure why this would be happening. Obviously the layered function calls are not getting stripped away, and I wonder whether it is because I am using the [] operator (Eigen is hard to get through, but it doesn't look like it uses it), because I'm returning the addition class by value, or because I have stripped too much of the necessary complexity away.

Thanks in advance for your help.

Dalon














