Re: [eigen]



2009/10/11 dilas dilas <espiritusantu@xxxxxxx>:
> Hello. I think Eigen is a really good library, but I've found a problem that I can't live with. :)
> I'm on an x86 + SSE2 platform with Visual C++.
> When I use fixed-size matrices whose size is not a multiple of the packet size, Eigen doesn't use vector instructions.
> I tried to force it with the AutoAlign option, but it didn't work.
> For example, for:
>
> Matrix<float,5,1,AutoAlign | ColMajor> a;
> Matrix<float,5,1,AutoAlign | ColMajor> b;
> Matrix<float,5,1,AutoAlign | ColMajor> c = a.cwise() * b;
>
> Visual C++ generates five mulss instructions, but I want one mulps and one mulss.
>
> So I decided to round the first dimension up to a multiple of 4, and it worked. But in some cases, for example when multiplying a 3x3 matrix by a 3x1 one, this doesn't work, and we must round up the second dimension too, which is not acceptable.
>
> So I have two questions:
> 1. Do the developers intend to support the behavior I've described?

In order to efficiently vectorize a Vector5f, we would have to align
its array to a 16-byte boundary. That would make sizeof(Vector5f) grow
from 20 to 32 bytes, so if you allocate an array of N Vector5f's, the
memory usage grows from 20N to 32N bytes. That's why we will never
make that the default behavior. But yes, it would be nice to have it
as a non-default option. Note that in the development branch, in
unsupported/, we already have an AlignedVector3 class that does
something comparable for vectors of size 3, though it is a little
different: it does everything with SIMD instructions, ignoring the
last component. That's all one can do for Vector3f. For Vector5f, it
is better to do as you suggest: 1 packet + 1 scalar. That at least
works in all cases.
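
To make the 20 vs. 32 bytes concrete, here is a small standalone
sketch. It does not use Eigen at all: PackedVec5, AlignedVec5 and
array_bytes are made-up names for illustration, and it assumes a
4-byte float and a C++11 compiler.

```cpp
#include <cstddef>

// Hypothetical stand-ins for a 5-float vector; these are NOT Eigen types.
struct PackedVec5 {
    float data[5];  // 5 * 4 = 20 bytes, aligned like a float
};

struct alignas(16) AlignedVec5 {
    float data[5];  // same payload, but the struct is 16-byte aligned
};

// With 16-byte alignment the size is rounded up to the next multiple of 16.
static_assert(sizeof(PackedVec5) == 20, "5 floats, no padding");
static_assert(sizeof(AlignedVec5) == 32, "padded up to a multiple of 16");
static_assert(alignof(AlignedVec5) == 16, "suitable for aligned SSE loads");

// Bytes taken by a contiguous array of n elements of type T.
template <typename T>
constexpr std::size_t array_bytes(std::size_t n) {
    return n * sizeof(T);
}
```

So array_bytes<PackedVec5>(N) is 20N while array_bytes<AlignedVec5>(N)
is 32N, which is the overhead described above.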

> 2. How can I change the sources myself to get this behavior?

- If you're happy with a quick hack like AlignedVector3, just check
its sources.
- If you want the real solution you're suggesting, hm, there are quite
a few changes to make! Here are some starting points:
 -- in MatrixStorage.h, in ei_matrix_array, always make it align the array.
 -- in all the files that have meta-unrolled loops: our meta-unrollers
aren't usable in your case, so your best bet is either to not use
them (force the non-unrolled paths and cross your fingers that the
compiler auto-unrolls), or to write new meta-unrollers for these
cases.
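
To give an idea of what a "packet + scalar tail" meta-unroller could
look like, here is a rough standalone sketch. It is not Eigen code and
does not touch ei_matrix_array: ForceAlignedVec, PacketLoop, ScalarTail
and cwise_mul are hypothetical names, the packet size is assumed to be
4 floats (SSE), and there is a plain scalar fallback when SSE is not
available.

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

constexpr std::size_t kPacketSize = 4;  // floats per SSE packet

// Hypothetical fixed-size vector whose array is always 16-byte aligned.
template <std::size_t N>
struct alignas(16) ForceAlignedVec {
    float data[N];
};

// Scalar tail: handles elements [I, N) one by one, unrolled at compile time.
template <std::size_t I, std::size_t N, bool Done = (I >= N)>
struct ScalarTail {
    static void run(const float* a, const float* b, float* c) {
        c[I] = a[I] * b[I];
        ScalarTail<I + 1, N>::run(a, b, c);
    }
};
template <std::size_t I, std::size_t N>
struct ScalarTail<I, N, true> {
    static void run(const float*, const float*, float*) {}
};

// Packet part: handles [I, I + 4) while a whole packet still fits,
// then hands the remaining elements to the scalar tail.
template <std::size_t I, std::size_t N, bool HasPacket = (I + kPacketSize <= N)>
struct PacketLoop {
    static void run(const float* a, const float* b, float* c) {
#if SKETCH_HAS_SSE
        // I is a multiple of 4 and the base is 16-byte aligned,
        // so aligned loads and stores are valid here.
        _mm_store_ps(c + I, _mm_mul_ps(_mm_load_ps(a + I), _mm_load_ps(b + I)));
#else
        for (std::size_t k = 0; k < kPacketSize; ++k) c[I + k] = a[I + k] * b[I + k];
#endif
        PacketLoop<I + kPacketSize, N>::run(a, b, c);
    }
};
template <std::size_t I, std::size_t N>
struct PacketLoop<I, N, false> {
    static void run(const float* a, const float* b, float* c) {
        ScalarTail<I, N>::run(a, b, c);
    }
};

// Coefficient-wise product: for N = 5 this is one packet multiply
// followed by one scalar multiply.
template <std::size_t N>
ForceAlignedVec<N> cwise_mul(const ForceAlignedVec<N>& a, const ForceAlignedVec<N>& b) {
    ForceAlignedVec<N> c;
    PacketLoop<0, N>::run(a.data, b.data, c.data);
    return c;
}
```

Whether the compiler really emits one mulps and one mulss for the
N = 5 case is something to check in the generated assembly. The real
work in Eigen would be doing the equivalent inside the existing
unrollers.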

I'd be OK with adding a new non-default matrix option, ForceAlign, in
addition to the existing AutoAlign (default) and DontAlign. Again, the
bulk of the work will be to extend or write new unrolled loops for
Assign, for the products, etc.
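
As a rough sketch of how the three options could drive the storage
alignment: everything below is hypothetical (the Sketch* names, the
enum values and the AlignmentFor trait are invented here and are not
Eigen's actual definitions). It assumes that AutoAlign only aligns
when the array size in bytes is a multiple of 16, and that float is
4 bytes.

```cpp
#include <cstddef>

// Hypothetical option values, for illustration only.
enum SketchOptions {
    SketchAutoAlign  = 0,  // align only if the size is a multiple of 16 bytes
    SketchDontAlign  = 1,  // never request extra alignment
    SketchForceAlign = 2   // always align to 16 bytes, padding the object
};

// Picks the alignment of an array of N floats for a given option.
template <std::size_t N, int Options>
struct AlignmentFor {
    static constexpr std::size_t value =
        (Options == SketchForceAlign) ? 16
      : (Options == SketchDontAlign)  ? alignof(float)
      : (((N * sizeof(float)) % 16 == 0) ? 16 : alignof(float));
};

// Hypothetical storage class; a real change would go into the matrix storage.
template <std::size_t N, int Options>
struct SketchArray {
    alignas(AlignmentFor<N, Options>::value) float data[N];
};

// Only ForceAlign changes the size of a 5-float array.
static_assert(sizeof(SketchArray<5, SketchAutoAlign>) == 20, "not aligned, no padding");
static_assert(sizeof(SketchArray<5, SketchForceAlign>) == 32, "aligned and padded");
static_assert(sizeof(SketchArray<4, SketchAutoAlign>) == 16, "aligned, no padding needed");
```

The storage side is the easy part. The unrolled loops still have to
know that a ForceAlign array can be read with aligned packets and then
finished with a scalar tail.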

Benoit


