Re: [eigen] Issues regarding Quaternion-alignment and const Maps



Benoit Jacob wrote:
2010/7/9 Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx>:
Benoit Jacob wrote:
Wow, very good work.

I can indeed confirm the 2x speed improvement, and once I moved your
benchmarking code to a non-inlinable function called from main(), it
even got a bit higher (indeed, GCC fails to properly optimize code in
the main() function).
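
For readers who want to reproduce that setup, here is a minimal sketch of a non-inlinable benchmark function, assuming GCC or Clang (the `noinline` attribute is compiler-specific). The name `sum_of_squares` and its body are hypothetical stand-ins, not the benchmarking code discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel being timed.
 * noinline (GCC/Clang) keeps the loop in its own function,
 * so it is not compiled as part of main(). */
__attribute__((noinline))
float sum_of_squares(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += data[i] * data[i];
    return acc;
}
```

Calling this from main() and timing around the call keeps the hot loop out of main() itself.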

Could you make a patch against the development branch? (We're not
going to add features to 2.0 at this point).
I think I can do that, but most likely not before Monday/Tuesday.

http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Generating_a_patch

Also, I didn't know about that loaddup instruction in SSE3. It's
great! I'll have a look at using it in ei_pset1 when SSE3 is
available.
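
A rough sketch of what "use loaddup when SSE3 is available" could look like, assuming the compiler defines `__SSE3__` when SSE3 code generation is enabled (GCC and Clang do so with -msse3). The helper `broadcast_pd` is hypothetical and is not Eigen's ei_pset1; it only illustrates the compile-time switch between `_mm_loaddup_pd` and the plain SSE2 `_mm_set1_pd`.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>   /* SSE2: __m128d, _mm_set1_pd, _mm_storeu_pd */
#ifdef __SSE3__
#include <pmmintrin.h>   /* SSE3: _mm_loaddup_pd (MOVDDUP) */
#endif

/* Hypothetical helper: copy the double at p into both lanes. */
static inline __m128d broadcast_pd(const double *p)
{
    assert(p != NULL);
#ifdef __SSE3__
    return _mm_loaddup_pd(p);   /* load and duplicate via the SSE3 intrinsic */
#else
    return _mm_set1_pd(*p);     /* SSE2 fallback, same result */
#endif
}
```

Both branches return the same vector, so callers do not need to know which one was compiled in.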
It's actually a pity that there is no complete list with *just* all
SSE-instructions (not mixed with every other x86-instruction), including a
short description, maybe a usage example, and intrinsics for some common
compilers. At least I didn't find any ...

Yes, I've been trying to see if there is a single-precision equivalent
for MOVDDUP and I still don't know...


I just searched every <*mmintrin.h> for float and found in <xmmintrin.h>:

/* Create a vector with all four elements equal to *P.  */
extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_load1_ps (float const *__P)
{
  return _mm_set1_ps (*__P);
}

but looking at _mm_set1_ps, it doesn't really look like this is actually an SSE instruction ...
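
For reference, a small sketch of using `_mm_load1_ps` to broadcast a float; the wrapper name `broadcast_ps` is hypothetical. As far as I know, plain SSE has no single instruction for this, so compilers commonly emit a scalar load followed by a shuffle (for example MOVSS plus SHUFPS); a one-instruction float broadcast from memory only came later with AVX (VBROADCASTSS). The exact code generated depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* SSE: __m128, _mm_load1_ps, _mm_storeu_ps */

/* Hypothetical helper: copy the float at p into all four lanes. */
static inline __m128 broadcast_ps(const float *p)
{
    assert(p != NULL);
    return _mm_load1_ps(p);
}
```

Storing the result with `_mm_storeu_ps` into a float[4] gives four copies of the input value.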


--
----------------------------------------------
Dipl.-Inf. Christoph Hertzberg
Cartesium 0.051
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: (+49) 421-218-64252
----------------------------------------------


