Re: [eigen] Struct padding with the nvcc compiler.



On 2016-06-02 19:38, Christoph Hertzberg wrote:
On 2016-06-02 18:57, Benoit Steiner wrote:
It should be possible to use something like
#ifdef __CUDA_ARCH__
#define EIGEN_MAX_STATIC_ALIGN_BYTES 16
#endif

__CUDA_ARCH__ is defined iff nvcc compiles a CUDA kernel, so this won't
conflict with the host side (and therefore this won't be a problem if
we're compiling host code with AVX enabled).

I think the problem Gael was referring to is that if the host code is
compiled with AVX (and therefore EIGEN_MAX_STATIC_ALIGN_BYTES=32), the
CUDA part should be compiled with 32 byte alignment as well. Otherwise,
you'll still get different padding.

Actually, this is the same problem as when linking object files compiled with AVX and SSE together. The user can work around this by pre-defining EIGEN_MAX_STATIC_ALIGN_BYTES to either 16 or 32, but when relying on the default, this _could_ result in very hard-to-find errors (more or less depending on what you put inside your structures and where they are accessed).
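A minimal sketch of one way to turn such a silent mismatch into a compile-time error: pin the expected layout with static_assert in a header that both the host and the CUDA translation units include. It uses plain C++ rather than Eigen so that it stands alone; MY_MAX_STATIC_ALIGN_BYTES, Vec4 and Payload are hypothetical names standing in for EIGEN_MAX_STATIC_ALIGN_BYTES, a fixed-size Eigen vector, and a user struct, and a 4-byte float is assumed.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical stand-in for EIGEN_MAX_STATIC_ALIGN_BYTES. Every translation
// unit, host or device, has to see the same value for the layouts to agree.
#ifndef MY_MAX_STATIC_ALIGN_BYTES
#define MY_MAX_STATIC_ALIGN_BYTES 16
#endif

// Hypothetical stand-in for a fixed-size Eigen vector of four floats.
struct alignas(MY_MAX_STATIC_ALIGN_BYTES) Vec4 {
    float data[4];
};

// Hypothetical user struct shared between host and device code.
struct Payload {
    float x;  // followed by padding up to the alignment of Vec4
    Vec4 v;
};

// Layout that the rest of the program relies on (16-byte alignment assumed).
constexpr std::size_t kExpectedOffsetOfV = 16;
constexpr std::size_t kExpectedSize = 32;

// These fire in any translation unit that was built with a different
// alignment setting, instead of letting it use a different layout.
static_assert(offsetof(Payload, v) == kExpectedOffsetOfV,
              "Payload::v is not at the expected offset");
static_assert(sizeof(Payload) == kExpectedSize,
              "Payload does not have the expected size");
```

Building one translation unit with -DMY_MAX_STATIC_ALIGN_BYTES=32 would make both assertions fail in that unit, which is easier to diagnose than corrupted data at run time.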





Christoph



On Thu, Jun 2, 2016 at 4:42 AM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx>
wrote:

Thanks, so that means that once passed to "cudacc", Eigen does not try to
align at all. We could enable 16 bytes alignment by default, but this will
still fail if compiled with AVX and 32 bytes alignment.

Does anyone see a way to automatically pass to the CUDA part the knowledge
that the host part will be compiled with some given
EIGEN_MAX_STATIC_ALIGN_BYTES value?

gael

On Thu, Jun 2, 2016 at 10:54 AM, Jon E. A. Lund <jonealund@xxxxxxxxx>
wrote:

Hello again,

Both suggestions fixed the problem. They both work on their own, and also
together. Thanks!

On Wed, Jun 1, 2016 at 9:55 PM, Gael Guennebaud <
gael.guennebaud@xxxxxxxxx> wrote:

Hi,

there are two possibilities: either Eigen doesn't request alignment when
compiled with nvcc, or nvcc does not comply with our alignment request. To
investigate the issue, you can try to #define
EIGEN_MAX_STATIC_ALIGN_BYTES 16 before including any Eigen header (or pass
-DEIGEN_MAX_STATIC_ALIGN_BYTES=16 to the nvcc compiler). You can also try
to add EIGEN_ALIGN_TO_BOUNDARY(16) before any Eigen member requiring
alignment, e.g.:

struct bar {
   Vector4f v1;
   float x, y;
   EIGEN_ALIGN_TO_BOUNDARY(16) Matrix4f m1;
};

Finally, you might also consider re-ordering your members to put the
largest object first so that no padding (and thus waste of memory) will be
required.
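To illustrate the re-ordering, here is a rough sketch in plain C++ that compares the member order of struct bar with a largest-first order. Vec4 and Mat4 are hypothetical stand-ins for Vector4f and Matrix4f, given an explicit alignas(16); the numbers assume a 4-byte float.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical stand-ins for Vector4f and Matrix4f with 16-byte alignment.
struct alignas(16) Vec4 { float data[4]; };
struct alignas(16) Mat4 { float data[16]; };

// Same member order as struct bar above.
struct BarOriginal {
    Vec4 v1;     // offset 0
    float x, y;  // offsets 16 and 20
    Mat4 m1;     // must start on a 16-byte boundary, so offset 32
};

// Largest object first.
struct BarReordered {
    Mat4 m1;     // offset 0
    Vec4 v1;     // offset 64
    float x, y;  // offsets 80 and 84
};

// Padding inserted between y and m1 in the original order.
constexpr std::size_t kInteriorPadding =
    offsetof(BarOriginal, m1) - (offsetof(BarOriginal, y) + sizeof(float));

// Padding after the last member in the reordered struct.
constexpr std::size_t kTailPadding =
    sizeof(BarReordered) - (offsetof(BarReordered, y) + sizeof(float));

static_assert(kInteriorPadding == 8, "8 bytes of padding before m1");
static_assert(offsetof(BarReordered, v1) == sizeof(Mat4),
              "no padding between m1 and v1");
static_assert(offsetof(BarReordered, x) == sizeof(Mat4) + sizeof(Vec4),
              "no padding between v1 and x");
static_assert(kTailPadding == 8, "padding moved to the end of the struct");
```

In this particular struct the total size stays at 96 bytes in both orders, because sizeof is rounded up to a multiple of the 16-byte alignment. What changes is that the member offsets no longer depend on padding between members, so they would match even under a compiler that ignores the alignment request; sizeof (and therefore array strides) could still differ.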

Let us know about the outcome!

gael



On Tue, May 31, 2016 at 2:56 PM, Jon E. A. Lund <jonealund@xxxxxxxxx>
wrote:

Hi,

I'm using Eigen in a CUDA program, and want to use a struct containing
multiple Eigen members. The problem I am having is that the nvcc compiler
apparently doesn't understand how the struct is supposed to be padded. As
far as I have understood, Eigen pads the struct such that every Eigen
member starts on a 16 byte boundary. My C++ compiler (g++-4.9) does this
correctly, but it seems nvcc doesn't pad correctly.

The symptoms I observe are that sizeof() and offsetof() return
different values when run in the .cpp file compared to when run in the
.cu file. In the .cpp file I get padding to 16 byte boundaries, while in
the .cu file I get no padding at all.
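The symptom can be reproduced without Eigen or nvcc. The sketch below is only an illustration: AlignedVec4 models a member that carries a 16-byte alignment request (what the .cpp file sees), PlainVec4 models the same member without one (what the .cu file appears to see), and all type and function names are hypothetical.

```cpp
#include <cstddef>  // offsetof, std::size_t
#include <cstdio>   // std::printf

// Hypothetical stand-ins for a four-float Eigen vector, with and without
// an alignment request.
struct alignas(16) AlignedVec4 { float data[4]; };
struct PlainVec4 { float data[4]; };

struct WithAlignment { float x; AlignedVec4 v; };
struct WithoutAlignment { float x; PlainVec4 v; };

// The two quantities the report compares between the .cpp and .cu files.
struct Layout {
    std::size_t size;
    std::size_t offset_of_v;
};

constexpr Layout kAligned = {sizeof(WithAlignment),
                             offsetof(WithAlignment, v)};
constexpr Layout kPlain = {sizeof(WithoutAlignment),
                           offsetof(WithoutAlignment, v)};

// Prints one layout; calling this from both translation units makes a
// mismatch visible immediately.
inline void print_layout(const char* label, const Layout& layout) {
    std::printf("%s: sizeof = %zu, offsetof(v) = %zu\n",
                label, layout.size, layout.offset_of_v);
}
```

Assuming a 4-byte float, the aligned variant has v at offset 16 and a size of 32, while the plain variant has v at offset 4 and a size of 20, which is the kind of disagreement described above.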

The problem is independent of whether I am using
the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro or not.

I am using an Eigen 3.3 dev snapshot taken May 30th.

For now I am padding my structs manually to ensure correct behavior,
but I am curious if this is a known issue or something that will be fixed.

Best,
Jon E. A. Lund








--
 Dipl. Inf., Dipl. Math. Christoph Hertzberg

 Universität Bremen
 FB 3 - Mathematik und Informatik
 AG Robotik
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

 Zentrale: +49 421 178 45-6611

 Besuchsadresse der Nebengeschäftsstelle:
 Robert-Hooke-Straße 5
 28359 Bremen, Germany

 Tel.:    +49 421 178 45-4021
 Empfang: +49 421 178 45-6600
 Fax:     +49 421 178 45-4150
 E-Mail:  chtz@xxxxxxxxxxxxxxxxxxxxxxxx

 Weitere Informationen: http://www.informatik.uni-bremen.de/robotik


