Re: [eigen]



This is surprising. I didn't know that the compiler would pad sizeof
up to the next multiple of 4 floats (16 bytes). Even if one aligns a
float3 to 16 bytes, the last 4 bytes should still be usable on the
stack for other variables. It's not as if you typedef'ed a float3 and
aligned the type itself. In that case, if you declare an array of that
type, I can understand that the compiler will pad a w component so
that all elements of the array are aligned.

Then why this behavior?

On Mon, Oct 12, 2009 at 9:17 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2009/10/12 Rohit Garg <rpg.314@xxxxxxxxx>:
>> I am not sure about the rounding up of the memory size. Why can't you
>> have a float3 array aligned at a 16-byte boundary?
>
> Because that makes its sizeof() increase to 16 bytes instead of 12, so
> that incurs a +33% memory overhead for everybody when you allocate an
> array of N such vectors --- and even so, this isn't quite optimal,
> e.g. this doesn't give you a very easy way to vectorize the product of
> 3x3 matrices with 3-vectors. For Vector5f, the situation is different,
> the vectorization can always work (you can always fit a packet) but
> the memory overhead is even bigger: 32/20 = 1.6 so it's a +60% memory
> overhead.
>
> Have a look at the example program (attached):
>
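> #include <iostream>
>
> template<int N> struct foo
> {
>  // The aligned member raises the alignment of foo itself to 16 bytes,
>  // and sizeof(foo) is rounded up to a multiple of that alignment.
>  __attribute__((aligned(16))) float f[N];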
>  foo()
>  {
>    std::cout << "sizeof(foo<" << N << ">) is " << sizeof(foo)
>              << " instead of " << N*sizeof(float) << std::endl;
>  }
> };
>
> int main()
> {
>  foo<3>();
>  foo<5>();
> }
>
>
> Output:
>
> $ g++ sizeof_aligned.cpp -o s && ./s
> sizeof(foo<3>) is 16 instead of 12
> sizeof(foo<5>) is 32 instead of 20
>
>
> Cheers,
> Benoit
>



-- 
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay


