Re: [eigen]


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen]
From: Rohit Garg <rpg.314@xxxxxxxxx>
Date: Mon, 12 Oct 2009 21:45:17 +0530

This is surprising. I didn't know that the compiler would pad the size
up to the next multiple of 4 floats (16 bytes). Even if one aligns a
float3 to 16 bytes, the last 4 bytes should still be usable on the stack
for other variables. It's not as if you typedef'ed a float3 and aligned
the type. If you then declared an array of such a type, I could
understand the compiler padding a w component so that every element of
that array is aligned.

Then why this behavior?

On Mon, Oct 12, 2009 at 9:17 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2009/10/12 Rohit Garg <rpg.314@xxxxxxxxx>:
>> I am not sure about the part on rounding up the memory size. Why
>> can't you have a float3 array aligned at a 16-byte boundary?
>
> Because that makes its sizeof() increase to 16 bytes instead of 12, so
> that incurs a +33% memory overhead for everybody when you allocate an
> array of N such vectors --- and even so, this isn't quite optimal,
> e.g. this doesn't give you a very easy way to vectorize the product of
> 3x3 matrices with 3-vectors. For Vector5f, the situation is different,
> the vectorization can always work (you can always fit a packet) but
> the memory overhead is even bigger: 32/20 = 1.6 so it's a +60% memory
> overhead.
>
> Have a look at the example program (attached):
>
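> #include <iostream>
>
> // aligned(16) on the member raises the alignment of foo<N> itself to
> // 16 bytes, so sizeof(foo<N>) is rounded up to a multiple of 16.
> template<int N> struct foo
> {
>   __attribute__((aligned(16))) float f[N];
>   foo()
>   {
>     std::cout << "sizeof(foo<" << N << ">) is " << sizeof(foo)
>               << " instead of " << N*sizeof(float) << std::endl;
>   }
> };
>
> int main()
> {
>   // Temporaries, constructed only so that the constructor prints.
>   foo<3>();
>   foo<5>();
> }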
>
>
> Output:
>
> $ g++ sizeof_aligned.cpp -o s && ./s
> sizeof(foo<3>) is 16 instead of 12
> sizeof(foo<5>) is 32 instead of 20
>
>
> Cheers,
> Benoit
>



-- 
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay
