[eigen] Eigen is slow with complex matrices

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

We're using eigen on a large astronomical project; it's great -- no need to find the proper combination du jour of optimised blas and lapack.

Unfortunately, in the context of complex matrices we are having to reconsider the use of BLAS. The reason appears to be in the complex class itself (see below) Is there any way that Eigen can work around this?


It occurs to me that you might be interested in the reason Eigen is slow for complex calculations compared to BLAS. It isn't really their fault. It's the fault of the implementers of the complex class in the std library.

One of Scott Meyer's recommendations for writing operator+ and operator+= is to always implement operator+ _in terms of_ operator +=. (Likewise for the other operators.)

In general, this is a good practice to follow. It helps make sure that the two operations are consistent with each other, and so avoids unexpected behavior.

However, it turns out that multiplying to complex numbers is a case where the opposite is true. operator*= should instead be implemented in terms of operator*. Look at the common implementation of the two functions in the standard library:

complex<_Tp>::operator*=(const complex<_Up>& __z)
   const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
   _M_imag = _M_real * __z.imag() + _M_imag * __z.real();
   _M_real = __r;
   return *this;

inline complex<_Tp>
operator*(const complex<_Tp>& __x, const complex<_Tp>& __y)
   complex<_Tp> __r = __x;
   __r *= __y;
   return __r;

The implementation for *= is probably fine for itself. But doing * in terms of it requires an extra temporary, and, more importantly, it moves the memory around in a way that no compiler I've tried has been able to pipeline effectively.

A better implementation would be to have op* directly calculate both components of the result, and return it as a new complex object:
   return complex<_Tp>(
       _x.real() * _y.real() - _x.imag() * _y.imag(),
       _x.real() * _y.imag() + _x.imag() * _y.real());

If you wanted to follow the spirit of Scott Meyer's rule, you could then implement op*= in terms of op*:
{ *this = *this * _z; return *this; }

But I haven't found any stl implementation that does it this way.

Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/