It occurs to me that you might be interested in the reason Eigen is
slow for complex calculations compared to BLAS. It isn't really
their fault. It's the fault of the implementers of the complex
class in the std library.
One of Scott Meyer's recommendations for writing operator+ and
operator+= is to always implement operator+ _in terms of_ operator
+=. (Likewise for the other operators.)
In general, this is a good practice to follow. It helps make sure
that the two operations are consistent with each other, and so
avoids unexpected behavior.
However, it turns out that multiplying to complex numbers is a case
where the opposite is true. operator*= should instead be
implemented in terms of operator*. Look at the common
implementation of the two functions in the standard library:
complex<_Tp>&
complex<_Tp>::operator*=(const complex<_Up>& __z)
{
const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
_M_imag = _M_real * __z.imag() + _M_imag * __z.real();
_M_real = __r;
return *this;
}
inline complex<_Tp>
operator*(const complex<_Tp>& __x, const complex<_Tp>& __y)
{
complex<_Tp> __r = __x;
__r *= __y;
return __r;
}
The implementation for *= is probably fine for itself. But doing *
in terms of it requires an extra temporary, and, more importantly,
it moves the memory around in a way that no compiler I've tried has
been able to pipeline effectively.
A better implementation would be to have op* directly calculate both
components of the result, and return it as a new complex object:
{
return complex<_Tp>(
_x.real() * _y.real() - _x.imag() * _y.imag(),
_x.real() * _y.imag() + _x.imag() * _y.real());
}
If you wanted to follow the spirit of Scott Meyer's rule, you could
then implement op*= in terms of op*:
{ *this = *this * _z; return *this; }
But I haven't found any stl implementation that does it this way.