Re: [eigen] Struct padding with the nvcc compiler.


On 2016-06-02 19:38, Christoph Hertzberg wrote:
On 2016-06-02 18:57, Benoit Steiner wrote:
It should be possible to use something like
#ifdef __CUDA_ARCH__

__CUDA_ARCH__ is defined iff nvcc compiles a CUDA kernel, so this won't
conflict with the host side (and therefore this won't be a problem if
compiling host code with AVX enabled).

I think the problem Gael was referring to is that if the host code is
compiled with AVX (and therefore EIGEN_MAX_STATIC_ALIGN_BYTES=32), the
CUDA part should be compiled with 32 byte alignment as well. Otherwise,
you'll still get different padding.

Actually, this is the same problem as when linking together object files compiled with AVX and with SSE. The user can work around this by pre-defining EIGEN_MAX_STATIC_ALIGN_BYTES to either 16 or 32, but when relying on the default, this _could_ result in very hard to find errors (more or less depending on what you put inside your structures and where they are accessed).
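
One way to turn such a mismatch into a compile-time error rather than a hard-to-find runtime bug is to pin the alignment in one place and assert the resulting layout. The following is only a minimal sketch: `Vec4` and `Payload` are hypothetical stand-ins (not Eigen types), and it assumes the same `EIGEN_MAX_STATIC_ALIGN_BYTES` value is passed to every compiler involved in the build.

```cpp
#include <cstddef>

// Assumption: the build passes the same value to the host compiler and to nvcc,
// e.g. -DEIGEN_MAX_STATIC_ALIGN_BYTES=16. The fallback below exists only so that
// this sketch compiles on its own.
#ifndef EIGEN_MAX_STATIC_ALIGN_BYTES
#define EIGEN_MAX_STATIC_ALIGN_BYTES 16
#endif

// Hypothetical stand-in for a fixed-size Eigen vector; not Eigen's actual type.
struct alignas(EIGEN_MAX_STATIC_ALIGN_BYTES) Vec4 {
  float data[4];
};

// A struct shared between host and device code.
struct Payload {
  float x, y;
  Vec4 v;
};

// Layout agreed on for 16-byte alignment (assumes a 4-byte float). A translation
// unit built with a different alignment fails to compile here instead of
// silently reading members at the wrong offsets.
static_assert(alignof(Vec4) == 16, "unexpected alignment of Vec4");
static_assert(offsetof(Payload, v) == 16, "unexpected offset of Payload::v");
static_assert(sizeof(Payload) == 32, "unexpected size of Payload");
```

Placing the assertions in a header included by both the .cpp and the .cu files means each side checks itself against the same expected numbers.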


On Thu, Jun 2, 2016 at 4:42 AM, Gael Guennebaud

Thanks, so that means that once passed to "cudacc", Eigen does not try to
align at all. We could enable 16 bytes alignment by default, but this
would still fail if the host code is compiled with AVX and 32 bytes alignment.

Does anyone see a way to automatically pass to the cuda part the
that the host part will be compiled with some given


On Thu, Jun 2, 2016 at 10:54 AM, Jon E. A. Lund <jonealund@xxxxxxxxx>

Hello again,

Both suggestions fixed the problem. They both work on their own, and
together. Thanks!

On Wed, Jun 1, 2016 at 9:55 PM, Gael Guennebaud <
gael.guennebaud@xxxxxxxxx> wrote:


there are two possibilities: either Eigen doesn't request alignment when
compiled with nvcc, or nvcc does not comply with our alignment request. To
investigate the issue, you can try to #define
EIGEN_MAX_STATIC_ALIGN_BYTES to 16 before including any Eigen header (or pass
-DEIGEN_MAX_STATIC_ALIGN_BYTES=16 to the nvcc compiler). You can also try to
add EIGEN_ALIGN_TO_BOUNDARY(16) to any Eigen member requiring alignment, e.g.:

struct bar {
   Vector4f v1;
   float x, y;
   EIGEN_ALIGN_TO_BOUNDARY(16) Matrix4f m1;
};

Finally, you might also consider re-ordering your members to put the
largest object first so that no padding (and thus waste of memory)
will be

Let us know about the outcome!
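
The effect of the re-ordering suggestion can be checked with a small sketch. The types below are hypothetical stand-ins for Vector4f and Matrix4f (not Eigen's own types), declared once with a 16-byte alignment request and once without, to mimic a compiler that honors the request and one that does not. A 4-byte float is assumed.

```cpp
#include <cstddef>

namespace aligned {  // alignment request honored
struct alignas(16) Vec4 { float data[4]; };
struct alignas(16) Mat4 { float data[16]; };
struct Original  { Vec4 v1; float x, y; Mat4 m1; };
struct Reordered { Mat4 m1; Vec4 v1; float x, y; };
}  // namespace aligned

namespace plain {  // no alignment request
struct Vec4 { float data[4]; };
struct Mat4 { float data[16]; };
struct Original  { Vec4 v1; float x, y; Mat4 m1; };
struct Reordered { Mat4 m1; Vec4 v1; float x, y; };
}  // namespace plain

// Original order: m1 moves depending on whether alignment is honored.
static_assert(offsetof(aligned::Original, m1) == 32, "8 bytes of padding before m1");
static_assert(offsetof(plain::Original, m1) == 24, "no padding before m1");

// Largest member first: every member offset is the same in both variants.
static_assert(offsetof(aligned::Reordered, v1) == 64, "v1 follows m1 directly");
static_assert(offsetof(plain::Reordered, v1) == 64, "v1 follows m1 directly");
static_assert(offsetof(aligned::Reordered, x) == 80, "x follows v1 directly");
static_assert(offsetof(plain::Reordered, x) == 80, "x follows v1 directly");

// sizeof can still differ because of tail padding in the aligned variant.
static_assert(sizeof(aligned::Reordered) == 96, "88 bytes rounded up to 16");
static_assert(sizeof(plain::Reordered) == 88, "no tail padding");
```

In this particular example re-ordering removes the interior padding and makes the member offsets agree, but the aligned variant still carries 8 bytes of tail padding, so arrays of such structs would still be laid out differently.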


On Tue, May 31, 2016 at 2:56 PM, Jon E. A. Lund <jonealund@xxxxxxxxx>


I'm using Eigen in a CUDA program, and want to use a struct with
multiple Eigen members. The problem I am having is that nvcc
apparently doesn't understand how the struct is supposed to be padded. As
far as I have understood, Eigen pads the struct such that every Eigen
member starts on a 16 byte boundary. My C++ compiler (g++-4.9) does this
correctly, but it seems nvcc doesn't pad correctly.

The symptoms I observe are that sizeof() and offsetof() return
different values when run in the .cpp file compared to when run in the .cu
file. In the .cpp file I get padding to 16 byte boundaries, while in the
.cu file I get no padding at all.
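
A check of this kind could look roughly like the sketch below. It does not use Eigen: `AlignedVec4` and `PlainVec4` are hypothetical stand-ins that mimic a 4-float vector with and without a 16-byte alignment request, so that both reported outcomes can be reproduced with a single host compiler. In a real project the same `report` call would be compiled once in a .cpp file and once in a .cu file for the actual struct.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-ins for a 4-float Eigen vector; not Eigen's actual types.
struct alignas(16) AlignedVec4 { float data[4]; };  // alignment request honored
struct PlainVec4 { float data[4]; };                // no alignment request

struct BarAligned { float x, y; AlignedVec4 v; };
struct BarPlain   { float x, y; PlainVec4 v; };

// Prints one line per struct so the output of two builds can be compared.
inline void report(const char* name, std::size_t size, std::size_t offset_of_v) {
  std::printf("%s: sizeof=%zu offsetof(v)=%zu\n", name, size, offset_of_v);
}

inline void report_layouts() {
  report("BarAligned", sizeof(BarAligned), offsetof(BarAligned, v));
  report("BarPlain", sizeof(BarPlain), offsetof(BarPlain, v));
}
```

Assuming a 4-byte float, the aligned variant places `v` at offset 16 with a total size of 32, while the plain variant places it at offset 8 with a total size of 24, which matches the kind of discrepancy described above.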

The problem is independent of whether I am using

I am using an Eigen 3.3 dev snapshot taken May 30th.

For now I am padding my structs manually to ensure correct behavior,
but I am curious if this is a known issue or something that will
be fixed.
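
For illustration, manual padding along these lines might look like the following sketch. The types are hypothetical stand-ins without any alignment request (not Eigen types), the member name `pad_` is made up for this example, and a 4-byte float is assumed.

```cpp
#include <cstddef>

// Plain stand-ins for Vector4f and Matrix4f; no alignment is requested, so the
// layout below does not depend on a compiler honoring alignment attributes.
struct Vec4 { float data[4]; };
struct Mat4 { float data[16]; };

struct BarManual {
  Vec4 v1;        // offset 0
  float x, y;     // offsets 16 and 20
  float pad_[2];  // explicit 8 bytes of padding so that m1 starts at offset 32
  Mat4 m1;        // offset 32
};

// Every vector/matrix member starts on a multiple of 16 within the struct,
// and the struct size is itself a multiple of 16.
static_assert(offsetof(BarManual, v1) % 16 == 0, "v1 not on a 16-byte offset");
static_assert(offsetof(BarManual, m1) == 32, "m1 not at the expected offset");
static_assert(sizeof(BarManual) == 96, "unexpected struct size");
```

Note that this only fixes the offsets inside the struct. Whether an instance itself starts on a 16-byte boundary still depends on how it is allocated.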

Jon E. A. Lund

 Dipl. Inf., Dipl. Math. Christoph Hertzberg

 Universität Bremen
 FB 3 - Mathematik und Informatik
 AG Robotik
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

 Switchboard: +49 421 178 45-6611

 Visiting address of the branch office:
 Robert-Hooke-Straße 5
 28359 Bremen, Germany

 Tel.:      +49 421 178 45-4021
 Reception: +49 421 178 45-6600
 Fax:       +49 421 178 45-4150
 E-mail:    chtz@xxxxxxxxxxxxxxxxxxxxxxxx
