Re: [eigen] Issues regarding Quaternion-alignment and const Maps


Benoit Jacob wrote:
I have made a patch letting ei_pset1 use _mm_loaddup_pd when we have SSE3:

template<> EIGEN_STRONG_INLINE Packet2d ei_pset1<double>(const double&  from) {
#ifdef EIGEN_VECTORIZE_SSE3
  return _mm_loaddup_pd(&from);
#else
  Packet2d res = _mm_set_sd(from);
  return ei_vec2d_swizzle1(res, 0, 0);
#endif
}

But guess what? It's actually not faster (perhaps even a bit slower)
than our ei_vec2d_swizzle1!

So let's just forget about it.

Christoph, is _mm_loaddup_pd the only SSE3 intrinsic your code is
using? If so, you can make your code work on SSE2 by using ei_pset1
instead of _mm_loaddup_pd!

I guess the most important SSE3 instruction is _mm_addsub_pd, which subtracts in the first (low) element and adds in the second (high) one. If there is a way to negate just one element of a packet, it could be replaced by a plain add.

Googling a bit suggests that the SSE way to do it is to XOR with
{-0.0, 0.0} (or the other way around). I will try that ...
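
A minimal sketch of that XOR trick, using SSE2 intrinsics only. The helper name addsub_pd_sse2 is hypothetical and not part of Eigen; it assumes the goal is to reproduce what _mm_addsub_pd computes, i.e. { a[0] - b[0], a[1] + b[1] }:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper (not an Eigen function): SSE2-only stand-in for the
// SSE3 intrinsic _mm_addsub_pd. Returns { a[0] - b[0], a[1] + b[1] }.
static inline __m128d addsub_pd_sse2(__m128d a, __m128d b) {
  // _mm_set_pd takes the high element first, so this mask is
  // { low = -0.0, high = 0.0 }: only the sign bit of the low element is set.
  const __m128d sign_mask = _mm_set_pd(0.0, -0.0);
  // XOR flips the sign of b's low element; a plain add then gives
  // a subtraction in the low lane and an addition in the high lane.
  return _mm_add_pd(a, _mm_xor_pd(b, sign_mask));
}
```

Swapping the two arguments of _mm_set_pd negates the high element instead, which is the "other way around" case. Whether this is as fast as the native SSE3 instruction would have to be measured.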

Christoph

--
----------------------------------------------
Dipl.-Inf. Christoph Hertzberg
Cartesium 0.051
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: (+49) 421-218-64252
----------------------------------------------


