Re: [eigen]


2009/10/12 Rohit Garg <rpg.314@xxxxxxxxx>:
> I am not sure regarding the round up of mem size bit. Why can't you
> have a float3 array aligned at 16 byte boundary?

Because aligning it raises its sizeof() from 12 to 16 bytes, which means
a +33% memory overhead for everybody who allocates an array of N such
vectors. Even then the result isn't quite optimal: for example, it
doesn't give you an easy way to vectorize the product of 3x3 matrices
with 3-vectors. For Vector5f the situation is different: vectorization
can always work (you can always fit a packet), but the memory overhead
is even bigger, since sizeof() goes from 20 to 32 bytes and
32/20 = 1.6, i.e. a +60% memory overhead.

Have a look at the example program (attached):

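#include <iostream>

// Holds N floats, with the array forced to 16-byte alignment (GCC syntax).
template<int N> struct foo
{
  __attribute__((aligned(16))) float f[N];
  foo()
  {
    // sizeof(foo) gets rounded up to a multiple of the 16-byte alignment.
    std::cout << "sizeof(foo<" << N << ">) is " << sizeof(foo)
              << " instead of " << N*sizeof(float) << std::endl;
  }
};

int main()
{
  // Each temporary runs the constructor once, printing one line.
  foo<3>();
  foo<5>();
}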


Output:

$ g++ sizeof_aligned.cpp -o s && ./s
sizeof(foo<3>) is 16 instead of 12
sizeof(foo<5>) is 32 instead of 20


Cheers,
Benoit

