Re: [eigen] Issues regarding Quaternion-alignment and const Maps
Benoit Jacob wrote:
I have made a patch letting ei_pset1 use _mm_loaddup_pd when we have SSE3:

    template<> EIGEN_STRONG_INLINE Packet2d ei_pset1<double>(const double& from)
    {
    #ifdef EIGEN_VECTORIZE_SSE3
      return _mm_loaddup_pd(&from);
    #else
      Packet2d res = _mm_set_sd(from);
      return ei_vec2d_swizzle1(res, 0, 0);
    #endif
    }

But guess what? It's actually not faster (perhaps even a bit slower) than our ei_vec2d_swizzle1! So let's just forget about it.

Christoph, is _mm_loaddup_pd the only SSE3 intrinsic your code is using? If yes, by using ei_pset1 instead of _mm_loaddup_pd, you can make your code work on SSE2!
I guess the most important SSE3 instruction is _mm_addsub_pd, which subtracts in the first (low) element and adds in the second (high) element. If there is a way to negate just one element, _mm_addsub_pd could be replaced by that negation followed by a plain add.
A bit of googling suggests that the SSE way to do it is to XOR with {-0.0, 0.0} (or the other way around). I will try that ...

Christoph

--
----------------------------------------------
Dipl.-Inf. Christoph Hertzberg
Cartesium 0.051
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen
Tel: (+49) 421-218-64252
----------------------------------------------
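
The XOR idea from the reply above can be sketched with SSE2 intrinsics only. This is a minimal illustration, not code from Eigen or from the patch under discussion: the helper names negate_low_pd, addsub_pd_sse2 and example_usage are hypothetical. It assumes the intended behaviour is that of _mm_addsub_pd, i.e. {a0 - b0, a1 + b1}; XOR with -0.0 flips only the sign bit of the chosen lane.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics only, no SSE3 required
#include <cassert>

// Hypothetical helper: negate only the low (first) element of v.
// -0.0 has just the sign bit set, so XOR flips the sign of that lane
// and leaves the other lane (XOR with +0.0) unchanged.
static inline __m128d negate_low_pd(__m128d v)
{
  // _mm_set_pd takes its arguments as (high, low).
  const __m128d sign_mask = _mm_set_pd(0.0, -0.0);
  return _mm_xor_pd(v, sign_mask);
}

// Hypothetical SSE2 stand-in for _mm_addsub_pd: returns {a0 - b0, a1 + b1}.
static inline __m128d addsub_pd_sse2(__m128d a, __m128d b)
{
  return _mm_add_pd(a, negate_low_pd(b));
}

// Small usage check: a = {3, 4}, b = {1, 2} (low element listed first).
static inline void example_usage()
{
  const __m128d a = _mm_set_pd(4.0, 3.0);
  const __m128d b = _mm_set_pd(2.0, 1.0);
  double out[2];
  _mm_storeu_pd(out, addsub_pd_sse2(a, b));
  assert(out[0] == 2.0);  // 3 - 1
  assert(out[1] == 6.0);  // 4 + 2
}
```

Swapping the two arguments of _mm_set_pd in the mask negates the high element instead, which corresponds to the "or the other way around" case. Whether this is as fast as the SSE3 instruction would have to be measured.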