Re: [eigen] SSE questions



2010/2/1 Radu Bogdan Rusu <rusu@xxxxxxxxxxxxxxxx>:
> Hi all,
>
> I have a few questions regarding the use of SSE instructions in the Eigen
> 2.x branch (2.0.11 to be more exact). I've looked at the generated assembly
> for some of them, but I just want to double check this with the Eigen
> developers.
>
> 1) Why isn't a Vector4f constructor converted into an _mm_set_ps on an SSE
> platform? Looking through Core/arch/SSE, I did not find any reference to
> _mm_set_ps.

Good question. For now, the Vector4f constructor taking 4 coordinates
copies them without SSE, and _mm_set_ps is indeed what we need here.
I understand that it could give a real improvement when the Vector4f
thus constructed is used right away in an expression. Patches
welcome :)
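
For anyone attempting that patch, here is a minimal sketch of the
intrinsic in isolation. It is not Eigen code: make_packet4f and
store_packet4f are hypothetical helper names made up for illustration.
The one thing to watch is that _mm_set_ps takes its arguments from the
highest lane down to the lowest.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_set_ps, _mm_storeu_ps

// Hypothetical helper (not part of Eigen): pack four coordinates into one
// SSE register. _mm_set_ps takes its arguments from the highest lane to the
// lowest, so the order is reversed here to place x in lane 0.
inline __m128 make_packet4f(float x, float y, float z, float w) {
    return _mm_set_ps(w, z, y, x);
}

// Hypothetical helper: write the four lanes to memory, lane 0 first.
// The unaligned store is used so that `out` needs no particular alignment.
inline void store_packet4f(float* out, __m128 p) {
    _mm_storeu_ps(out, p);
}
```

Whether this beats four scalar stores followed by an aligned load would
have to be measured in the expression where the vector is actually used.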

>
> 2) Is there any interest in having a specialized 3x3 covariance matrix
> estimation method for the SSE case?

At this stage I wouldn't make such heavy changes in 2.0, but we can
discuss this for the development branch. I'm not sure how you would
work around the alignment issues at runtime. By copying the matrix
into a temporary 4x4 matrix?
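
To make that question concrete, a rough sketch of the copy-into-a-
temporary idea, written with plain arrays rather than Eigen types.
Padded4x4, pad3x3 and load_column are hypothetical names, and
column-major storage is assumed. It also relies on alignas, so it
needs a C++11 compiler.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_load_ps

// Hypothetical padded storage (not an Eigen type): 4 columns of 4 floats,
// 16-byte aligned so that every column can be read with an aligned load.
struct alignas(16) Padded4x4 {
    float data[16];
};

// Copy a column-major 3x3 matrix (9 floats) into the top-left corner of a
// zero-filled, column-major 4x4. The source pointer may be unaligned.
inline Padded4x4 pad3x3(const float* m3) {
    Padded4x4 out = {};  // value-initialise: all 16 floats start at zero
    for (int col = 0; col < 3; ++col)
        for (int row = 0; row < 3; ++row)
            out.data[col * 4 + row] = m3[col * 3 + row];
    return out;
}

// Aligned load of one padded column. This is valid because each column
// starts at a multiple of 16 bytes inside the aligned struct.
inline __m128 load_column(const Padded4x4& m, int col) {
    return _mm_load_ps(m.data + col * 4);
}
```

The cost is the extra copy of 9 floats per matrix, which may or may not
be worth it depending on how much work is done on the padded copy.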

> Not sure how many people use this, but
> using _mm_shuffle_ps and a few pointer tricks reduced the number of
> operations drastically.
>
> 3) What is the status of MapAligned in 2.x? Looking at Matrix.h, I see:
>   * \warning Do not use MapAligned in the Eigen 2.0. Mapping aligned arrays will be fully
>   * supported in Eigen 3.0 (already implemented in the development branch)
>
> which makes me believe it's not officially supported? Empirically I found
> out that it works well.

I'd say not officially supported... I have to admit that I don't
remember whether it works or not :/ If that matters to you, the best
you can do is send us a patch against the unit test map.cpp. If it's
tested, then it can be supported.

> 4) Is this the recommended optimized way to get a dot product between a
> VectorXf and a Vector4f ?
>
> float d = ((Eigen::Vector4f)my_vectorxf).start<4>().dot (my_vector4f);

Ouch! Why are you casting my_vectorxf to Vector4f type?

Just do:

my_vectorxf.start<4>().dot(my_vector4f)
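
For reference, the arithmetic that expression asks for (a dot product
over the first four coefficients) can be sketched with raw SSE
intrinsics as below. This only illustrates the computation and is not
the code Eigen generates. dot4 is a hypothetical name, and unaligned
loads are assumed because a dynamically sized vector's data pointer is
not known to be aligned here.

```cpp
#include <xmmintrin.h>  // SSE intrinsics used below

// Hypothetical helper (not Eigen's implementation): dot product of the
// first four floats at `a` with the four floats at `b`.
inline float dot4(const float* a, const float* b) {
    // Lane-wise products p = (a0*b0, a1*b1, a2*b2, a3*b3).
    __m128 p = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    // Swap neighbouring lanes: (p1, p0, p3, p2).
    __m128 shuf = _mm_shuffle_ps(p, p, _MM_SHUFFLE(2, 3, 0, 1));
    // Pairwise sums: (p0+p1, p0+p1, p2+p3, p2+p3).
    __m128 sums = _mm_add_ps(p, shuf);
    // Move the upper pair sum (p2+p3) down into lane 0.
    shuf = _mm_movehl_ps(shuf, sums);
    // Lane 0 now holds p0+p1+p2+p3.
    sums = _mm_add_ss(sums, shuf);
    return _mm_cvtss_f32(sums);
}
```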


Benoit


