Re: [eigen] Struct padding with the nvcc compiler.


On 2016-06-02 18:57, Benoit Steiner wrote:
It should be possible to use something like
#ifdef __CUDA_ARCH__

__CUDA_ARCH__ is defined iff nvcc compiles a CUDA kernel, so this won't
conflict with the host side (and therefore this won't be a problem if we're
compiling host code with AVX enabled).

I think the problem Gael was referring to is that if the host code is compiled with AVX (and therefore EIGEN_MAX_STATIC_ALIGN_BYTES=32), the CUDA part should be compiled with 32-byte alignment as well. Otherwise, you'll still get different padding.
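
One way to at least detect such a mismatch is a compile-time guard in a header that both the host and the CUDA compilation include. The following is only a minimal sketch, not Eigen code: SKETCH_MAX_ALIGN is a hypothetical stand-in for EIGEN_MAX_STATIC_ALIGN_BYTES, Mat4f and Params are made-up types, and float is assumed to be 4 bytes.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical stand-in for EIGEN_MAX_STATIC_ALIGN_BYTES. The default of 32
// mimics a host build with AVX; building with -DSKETCH_MAX_ALIGN=16 makes the
// static_asserts below fail.
#ifndef SKETCH_MAX_ALIGN
#define SKETCH_MAX_ALIGN 32
#endif

// Made-up 4x4 float matrix whose alignment follows the macro above.
struct alignas(SKETCH_MAX_ALIGN) Mat4f {
  float data[16];
};

// Struct shared between host and device code.
struct Params {
  float scale;  // offset 0
  Mat4f m1;     // offset 16 with 16-byte alignment, 32 with 32-byte alignment
};

// Layout the host build is assumed to produce (32-byte alignment).
constexpr std::size_t kExpectedOffsetM1 = 32;
constexpr std::size_t kExpectedSize = 96;

// Compilation fails in whichever pass disagrees with the expected layout.
static_assert(offsetof(Params, m1) == kExpectedOffsetM1,
              "Params::m1 is not at the offset the host build expects");
static_assert(sizeof(Params) == kExpectedSize,
              "sizeof(Params) differs from what the host build expects");
```

This does not make the two sides agree; it only turns silently different padding into a compile error.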


On Thu, Jun 2, 2016 at 4:42 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

Thanks, so that means that once passed to "cudacc", Eigen does not try to
align at all. We could enable 16-byte alignment by default, but this will
still fail if the host is compiled with AVX and 32-byte alignment.

Does anyone see a way to automatically pass to the CUDA part the knowledge
that the host part will be compiled with some given EIGEN_MAX_STATIC_ALIGN_BYTES?


On Thu, Jun 2, 2016 at 10:54 AM, Jon E. A. Lund <jonealund@xxxxxxxxx> wrote:

Hello again,

Both suggestions fixed the problem. They both work on their own, and also
together. Thanks!

On Wed, Jun 1, 2016 at 9:55 PM, Gael Guennebaud <
gael.guennebaud@xxxxxxxxx> wrote:


There are two possibilities: either Eigen doesn't request alignment when
compiled with nvcc, or nvcc does not comply with our alignment request. To
investigate the issue, you can try to #define
EIGEN_MAX_STATIC_ALIGN_BYTES 16 before including any Eigen header (or pass
-DEIGEN_MAX_STATIC_ALIGN_BYTES=16 to the nvcc compiler). You can also try to
add EIGEN_ALIGN_TO_BOUNDARY(16) before any Eigen member that requires
alignment, e.g.:

struct bar {
   Vector4f v1;
   float x, y;
   EIGEN_ALIGN_TO_BOUNDARY(16) Matrix4f m1;
};

Finally, you might also consider re-ordering your members to put the
largest object first so that no padding (and thus waste of memory) will be
needed.

Let us know about the outcome!
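
To illustrate the re-ordering suggestion, here is a rough sketch using made-up stand-in types rather than Eigen's (Vec4f, Mat4f and their Plain* variants are hypothetical; float is assumed to be 4 bytes). It compares the example struct with a largest-first ordering, with and without a 16-byte alignment request.

```cpp
#include <cstddef>  // offsetof

// Hypothetical stand-ins for fixed-size types that request 16-byte alignment.
struct alignas(16) Vec4f { float data[4]; };
struct alignas(16) Mat4f { float data[16]; };

// The same types without any alignment request (what the .cu side saw).
struct PlainVec4f { float data[4]; };
struct PlainMat4f { float data[16]; };

// Original order: 8 bytes of padding are needed in front of m1.
struct BarOriginal { Vec4f v1; float x, y; Mat4f m1; };

// Largest member first, with and without the alignment request.
struct BarReordered      { Mat4f m1;      Vec4f v1;      float x, y; };
struct BarReorderedPlain { PlainMat4f m1; PlainVec4f v1; float x, y; };

static_assert(offsetof(BarOriginal, m1) == 32, "interior padding before m1");

// After re-ordering, member offsets no longer depend on the alignment request.
static_assert(offsetof(BarReordered, v1) == offsetof(BarReorderedPlain, v1),
              "v1 offset matches");
static_assert(offsetof(BarReordered, x) == offsetof(BarReorderedPlain, x),
              "x offset matches");
static_assert(offsetof(BarReordered, y) == offsetof(BarReorderedPlain, y),
              "y offset matches");
```

In this particular layout the aligned struct is still 96 bytes, because 8 bytes of tail padding replace the interior padding, while the unaligned variant is 88 bytes. So the member offsets agree, but sizeof (and therefore array strides) can still differ.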


On Tue, May 31, 2016 at 2:56 PM, Jon E. A. Lund <jonealund@xxxxxxxxx> wrote:


I'm using Eigen in a CUDA program, and want to use a struct containing
multiple Eigen members. The problem I am having is that the nvcc compiler
apparently doesn't understand how the struct is supposed to be padded. As
far as I have understood, Eigen pads the struct such that every Eigen
member starts on a 16-byte boundary. My C++ compiler (g++-4.9) does this
correctly, but it seems nvcc doesn't pad correctly.

The symptoms I observe are that sizeof() and offsetof() return
different values when run in the .cpp file compared to when run in the .cu
file. In the .cpp file I get padding to 16-byte boundaries, while in the
.cu file I get no padding at all.
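
The effect can be reproduced without Eigen or nvcc. The sketch below uses hypothetical stand-in types (AlignedVec4f, PackedVec4f and so on are not Eigen names) and assumes a 4-byte float. It shows how the same member list yields different sizeof() and offsetof() values depending on whether a 16-byte alignment request is in effect.

```cpp
#include <cstddef>  // offsetof

// Stand-ins with a 16-byte alignment request (the behaviour seen in the .cpp file).
struct alignas(16) AlignedVec4f { float data[4]; };
struct alignas(16) AlignedMat4f { float data[16]; };

// Stand-ins without any alignment request (the behaviour seen in the .cu file).
struct PackedVec4f { float data[4]; };
struct PackedMat4f { float data[16]; };

// Same member list, parameterised on the vector and matrix types.
template <typename Vec, typename Mat>
struct Bar {
  Vec v1;
  float x, y;
  Mat m1;
};

// Aliases keep the comma out of the offsetof macro arguments.
using HostBar = Bar<AlignedVec4f, AlignedMat4f>;
using DeviceBar = Bar<PackedVec4f, PackedMat4f>;

// With the alignment request, m1 is pushed from offset 24 to offset 32.
static_assert(offsetof(HostBar, m1) == 32, "m1 padded to a 16-byte boundary");
static_assert(offsetof(DeviceBar, m1) == 24, "m1 directly follows y");
static_assert(sizeof(HostBar) == 96, "size with padding");
static_assert(sizeof(DeviceBar) == 88, "size without padding");
```

Copying a HostBar byte-for-byte into memory that is read as a DeviceBar would therefore shift m1 by 8 bytes.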

The problem is independent of whether I am using

I am using an Eigen 3.3 dev snapshot taken May 30th.

For now I am padding my structs manually to ensure correct behavior,
but I am curious if this is a known issue or something that will be fixed.
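
For reference, manual padding of this kind could look roughly like the following. This is a guess at the approach, not the actual code: Vec4f, Mat4f, BarManual and pad_ are hypothetical names, the types carry no alignment request at all, and float is assumed to be 4 bytes.

```cpp
#include <cstddef>  // offsetof

// Hypothetical stand-ins with no alignment request, as seen by a compiler
// that ignores (or never receives) the alignment attribute.
struct Vec4f { float data[4]; };
struct Mat4f { float data[16]; };

struct BarManual {
  Vec4f v1;       // bytes 0..15
  float x, y;     // bytes 16..23
  float pad_[2];  // bytes 24..31, explicit padding so m1 starts at 32
  Mat4f m1;       // bytes 32..95
};

// The explicit padding fixes the layout independently of alignment requests.
static_assert(offsetof(BarManual, m1) == 32, "m1 starts at a multiple of 16");
static_assert(sizeof(BarManual) == 96, "size is a multiple of 16");
```

Note that this only fixes offsets within the struct. It does not by itself guarantee that an instance is placed at a 16-byte-aligned address.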

Jon E. A. Lund

 Dipl. Inf., Dipl. Math. Christoph Hertzberg

 Universität Bremen
 FB 3 - Mathematik und Informatik
 AG Robotik
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

