Re: [eigen] Struct padding with the nvcc compiler.
On 2016-06-02 19:38, Christoph Hertzberg wrote:
On 2016-06-02 18:57, Benoit Steiner wrote:
It should be possible to use something like

    #ifdef __CUDA_ARCH__
    #define EIGEN_MAX_STATIC_ALIGN_BYTES 16
    #endif

__CUDA_ARCH__ is defined iff nvcc compiles a CUDA kernel, so this won't conflict with the host side (and therefore this won't be a problem if we're compiling host code with AVX enabled).

I think the problem Gael was referring to is that if the host code is compiled with AVX (and therefore EIGEN_MAX_STATIC_ALIGN_BYTES=32), the CUDA part should be compiled with 32-byte alignment as well. Otherwise, you'll still get different padding.
Actually, this is the same problem as when linking together object files compiled with AVX and with SSE. The user can work around this by pre-defining EIGEN_MAX_STATIC_ALIGN_BYTES to either 16 or 32, but when relying on the default, this _could_ result in very hard-to-find errors (more or less severe depending on what you put inside your structures and where they are accessed).
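To make the mismatch concrete, here is a minimal sketch in plain C++ that does not use Eigen at all. `Mat4` and `Payload` are hypothetical stand-ins invented for illustration: `Mat4<N>` mimics a 64-byte fixed-size matrix whose static alignment is capped at N bytes, which is the role EIGEN_MAX_STATIC_ALIGN_BYTES plays. The offsets in the comments assume a typical ABI with 4-byte `float`.

```cpp
#include <cstddef>

// Pin the limit once, before any Eigen header, in every translation unit
// (host and device alike). In real code the Eigen includes would follow here.
#ifndef EIGEN_MAX_STATIC_ALIGN_BYTES
#define EIGEN_MAX_STATIC_ALIGN_BYTES 16
#endif

// Hypothetical stand-in for a 64-byte fixed-size matrix (NOT Eigen's class).
// MaxAlign models the alignment that the library requests for the type.
template <std::size_t MaxAlign>
struct alignas(MaxAlign) Mat4 {
    float data[16];
};

// The same user struct, parameterised by the alignment setting.
template <std::size_t MaxAlign>
struct Payload {
    float scale;          // offset 0
    Mat4<MaxAlign> m;     // offset 16 for MaxAlign=16, offset 32 for MaxAlign=32
};

// Two translation units that disagree on the setting disagree on the layout.
static_assert(offsetof(Payload<16>, m) != offsetof(Payload<32>, m),
              "16-byte and 32-byte builds place the matrix differently");

// With the limit pinned to 16 above, the layout is the same everywhere.
using PinnedPayload = Payload<EIGEN_MAX_STATIC_ALIGN_BYTES>;
static_assert(sizeof(PinnedPayload) == 80, "unexpected layout for pinned build");
```

The point of the sketch is only that the struct layout is a function of the alignment setting, so passing such a struct between objects built with different settings silently reads the wrong bytes.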
Christoph

On Thu, Jun 2, 2016 at 4:42 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
Thanks, so that means that once passed to "cudacc", Eigen does not try to align at all. We could enable 16-byte alignment by default, but this will still fail if compiled with AVX and 32-byte alignment. Does anyone see a way to automatically pass to the CUDA part the knowledge that the host part will be compiled with some given EIGEN_MAX_STATIC_ALIGN_BYTES value?

gael

On Thu, Jun 2, 2016 at 10:54 AM, Jon E. A. Lund <jonealund@xxxxxxxxx> wrote:
Hello again,

Both suggestions fixed the problem. They both work on their own, and also together. Thanks!

On Wed, Jun 1, 2016 at 9:55 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
Hi,

there are two possibilities: either Eigen doesn't request alignment when compiled with nvcc, or nvcc does not comply with our alignment request. To investigate the issue, you can try to #define EIGEN_MAX_STATIC_ALIGN_BYTES 16 before including any Eigen header (or pass it to the nvcc compiler, e.g. with -DEIGEN_MAX_STATIC_ALIGN_BYTES=16). You can also try to add EIGEN_ALIGN_TO_BOUNDARY(16) before any Eigen member requiring alignment, e.g.:

    struct bar {
      Vector4f v1;
      float x, y;
      EIGEN_ALIGN_TO_BOUNDARY(16) Matrix4f m1;
    };

Finally, you might also consider re-ordering your members to put the largest object first so that no padding (and thus no waste of memory) will be required.

Let us know about the outcome!

gael

On Tue, May 31, 2016 at 2:56 PM, Jon E. A. Lund <jonealund@xxxxxxxxx> wrote:
Hi,

I'm using Eigen in a CUDA program, and want to use a struct containing multiple Eigen members. The problem I am having is that the nvcc compiler apparently doesn't understand how the struct is supposed to be padded. As far as I have understood, Eigen pads the struct such that every Eigen member starts on a 16-byte boundary. My C++ compiler (g++-4.9) does this correctly, but it seems nvcc doesn't pad correctly.
The symptoms I observe are that sizeof() and offsetof() return different values when run in the .cpp file compared to when run in the .cu file. In the .cpp file I get padding to 16-byte boundaries, while in the .cu file I get no padding at all. The problem is independent of whether I am using the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro or not. I am using an Eigen 3.3 dev snapshot taken May 30th. For now I am padding my structs manually to ensure correct behavior, but I am curious if this is a known issue or something that will be fixed.

Best,
Jon E. A. Lund
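The symptom described in this message, and the re-ordering suggestion made in reply, can be reproduced without Eigen. The sketch below is only an illustration under stated assumptions: `Vec4`, `Mat4`, `PlainVec4`, `PlainMat4` and the `Bar*` structs are hypothetical stand-ins (16 and 64 bytes of `float` data), not Eigen's actual classes, and the numbers assume a typical ABI with 4-byte `float`.

```cpp
#include <cstddef>

// Hypothetical stand-ins, NOT Eigen types.
struct alignas(16) Vec4 { float data[4]; };    // alignment requested
struct alignas(16) Mat4 { float data[16]; };   // alignment requested
struct PlainVec4 { float data[4]; };           // no alignment requested
struct PlainMat4 { float data[16]; };          // no alignment requested

// What the host compiler sees when alignment is requested.
struct BarAligned {
    Vec4 v1;       // offset 0
    float x, y;    // offsets 16 and 20
    Mat4 m1;       // offset 32: 8 bytes of padding inserted after y
};

// What a compiler pass sees when no alignment is requested.
struct BarUnaligned {
    PlainVec4 v1;  // offset 0
    float x, y;    // offsets 16 and 20
    PlainMat4 m1;  // offset 24: no padding at all
};

// Largest member first: no padding is needed between members.
struct BarReordered {
    Mat4 m1;       // offset 0
    Vec4 v1;       // offset 64
    float x, y;    // offsets 80 and 84
};

// Layout guards like these, placed in a header shared by the .cpp and .cu
// files, turn a silent mismatch into a compile error.
static_assert(offsetof(BarAligned, m1) == 32, "m1 must start on a 16-byte boundary");
static_assert(sizeof(BarAligned) == 96, "unexpected size for BarAligned");
```

Note that re-ordering removes the padding between members, but a struct with 16-byte alignment is still rounded up to a multiple of 16 at its end, so `sizeof` can still differ between an aligned and an unaligned build even when every member offset agrees.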
--
Dipl. Inf., Dipl. Math. Christoph Hertzberg
Universität Bremen
FB 3 - Mathematik und Informatik
AG Robotik
Robert-Hooke-Straße 1
28359 Bremen, Germany
Switchboard: +49 421 178 45-6611
Visiting address of the branch office:
Robert-Hooke-Straße 5
28359 Bremen, Germany
Tel.: +49 421 178 45-4021
Reception: +49 421 178 45-6600
Fax: +49 421 178 45-4150
E-Mail: chtz@xxxxxxxxxxxxxxxxxxxxxxxx
Further information: http://www.informatik.uni-bremen.de/robotik