Re: [eigen] Alignment issues



Speaking of alignment issues, according to http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html the glibc malloc always returns addresses aligned on 16 bytes when running on 64-bit systems. However in order to reduce fragmentation some memory allocators such as tcmalloc (http://goog-perftools.sourceforge.net/doc/tcmalloc.html) relax this rule when allocating fewer than 16 bytes.

There is no way to issue a pload or pstore when processing matrices containing fewer than 4 floats or 2 doubles, so this behavior doesn't affect the correctness of the code. However, it triggers a few eigen_assert failures unless EIGEN_NO_DEBUG is defined when the code is compiled.
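A minimal sketch for checking what a given allocator does with small blocks. It is plain C++ with no Eigen involved, and the names lowest_set_bit, min_malloc_alignment and print_small_block_report are made up for this illustration. It measures whatever malloc the program is linked against, so linking a different allocator is needed to compare the two cases described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Value of the lowest set bit of the address, i.e. the largest power of two
// that divides it. Returns 0 for a null pointer.
inline std::size_t lowest_set_bit(const void* p) {
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::size_t>(a & (~a + 1));
}

// Smallest alignment observed over `count` malloc blocks of `size` bytes.
// All blocks stay alive until the end so that the addresses differ.
// A failed allocation shows up as a result of 0.
inline std::size_t min_malloc_alignment(std::size_t size, int count) {
    assert(count > 0);
    std::vector<void*> blocks;
    std::size_t min_align = static_cast<std::size_t>(-1);
    for (int i = 0; i < count; ++i) {
        void* p = std::malloc(size);
        blocks.push_back(p);
        const std::size_t a = lowest_set_bit(p);
        if (a < min_align) min_align = a;
    }
    for (void* p : blocks) std::free(p);
    return min_align;
}

// Prints one line per request size, including sizes below 16 bytes.
inline void print_small_block_report() {
    const std::size_t sizes[] = {1, 4, 8, 12, 16, 32};
    for (std::size_t s : sizes) {
        std::printf("malloc(%zu): at least %zu bytes aligned\n",
                    s, min_malloc_alignment(s, 16));
    }
}
```

Calling print_small_block_report() from main prints the observed minimum per size. The numbers are only a lower bound seen in this run, not a guarantee made by the allocator.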

Benoit




On Thu, Feb 13, 2014 at 2:36 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

I tried the following: http://ideone.com/XpgXyG

and with clang 3.3 on 64-bit Mac OS X, I get:

Address 0x7fff5274aa60 is 32 bytes aligned.

Address 0x7fff5274a980 is 128 bytes aligned.

Address 0x7fb9d9403940 is 64 bytes aligned.

Address 0x7fb9d94039c0 is 64 bytes aligned.

Address 0x7fb9d9801200 is 512 bytes aligned.

Address 0x7fb9d9801800 is 1024 bytes aligned.


meaning that std::vector seems to honor the alignas directive while new does not. On ideone it's even worse:



Address 0xffffffffbfbf2780 is 128 bytes aligned.
Address 0xffffffffbfbf2800 is 1024 bytes aligned.
Address 0xa047008 is 8 bytes aligned.
Address 0xa047090 is 16 bytes aligned.
Address 0xa047198 is 8 bytes aligned.
Address 0xa0476a0 is 32 bytes aligned.



gael


On Thu, Feb 13, 2014 at 9:43 PM, Hauke Heibel <hauke.heibel@xxxxxxxxx> wrote:
Hi,

I tested with this small program (http://ideone.com/rqkjuo) on MSVC 2012 (VC11) in 32-bit and 64-bit modes. The results are:

32-bit:
Address 0x1cfd70 is 16 bytes aligned.
Address 0x3e59e0 is 32 bytes aligned.
Address 0x3e7810 is 16 bytes aligned.
Address 0x3e7958 is 8 bytes aligned.

64-bit:
Address 0x2df830 is 16 bytes aligned.
Address 0x4e2d30 is 16 bytes aligned.
Address 0x4e77d0 is 16 bytes aligned.
Address 0x4e8b50 is 16 bytes aligned.

I think the results are the same with MSVC 2010 (VC10). That means MSVC users who build 64-bit code are fine and need to do nothing special (this is what I have been doing for quite some time now). The stack alignment is the minimal value from a few runs. I did not test std::allocator since std::vector uses it anyway, if I am not mistaken. The 32-byte alignment for a single struct on the heap in the 32-bit build is consistent over multiple runs.

Regards,
Hauke


On Thu, Feb 13, 2014 at 7:39 PM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
On 13.02.2014 19:04, Nicola Gigante wrote:
I can’t test that snippet because on OS X memory is always aligned to at least 16 bytes.

Could you try new, new[] and std::vector with (e.g.) a 1024-byte aligned structure? It could be that starting from certain OS versions, or on 64-bit systems, malloc (and new) is simply 16-byte aligned by default (but still doesn't actually care about proper alignment).
My malloc man page only says "[...] memory that is suitably aligned for any kind of variable.", which seems only to be implemented as 8 bytes.


Please note that, especially regarding C++11 support, clang 3.1 is too old, and that gcc 4.7.0 has terrible bugs, so
everybody has to use at least 4.7.1 anyway.

I do have gcc4.7.1 here (sorry for the inaccuracy), but clang 3.1 is all I have on this machine. I have clang 3.3 on another (which however also is 64bit), I could try out what happens there.


What I’d suggest is to make a survey of the behaviors of various compiler/library/OS combinations, and clarify in the documentation
what is needed with a fully standard-compliant platform (in C++11, I suppose actually nothing) and what instead are workarounds for specific
compiler bugs or limitations.

Yes, I would also assume that the C++11 standard requires new, new[] and std::allocator to meet the alignment requirements.

Could we maybe write a small test program that runs all the tests? For example, it could call new, new[] and std::allocator (anything else? maybe test stack alignment somehow?) several times for structs with increasing alignment requirements and find the lowest bit set in the addresses.
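A rough sketch of what such a test program could look like. This is not the code behind the ideone links in this thread; Padded, Survey, survey and print_survey are names invented here. std::vector stands in for std::allocator, and alignas requires a C++11 compiler. Before C++17, new on an over-aligned type is only conditionally supported, so a compiler may warn, and the measured value may be lower than requested, which is exactly what the survey is meant to expose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Value of the lowest set bit of the address: the alignment it actually has.
inline std::size_t alignment_of_address(const void* p) {
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::size_t>(a & (~a + 1));
}

// Hypothetical test struct requesting N-byte alignment (N a power of two).
template <std::size_t N>
struct alignas(N) Padded {
    char data[N];
};

// Minimum alignment observed for each way of obtaining storage.
struct Survey {
    std::size_t on_stack, single_new, array_new, in_vector;
};

template <typename T>
Survey survey(int runs) {
    assert(runs > 0);
    const std::size_t none = static_cast<std::size_t>(-1);
    Survey s = {none, none, none, none};
    std::vector<T*> singles, arrays;       // kept alive so addresses differ
    std::vector<std::vector<T> > vectors;
    for (int i = 0; i < runs; ++i) {
        T local;                           // stack object, only its address is used
        s.on_stack = std::min(s.on_stack, alignment_of_address(&local));
        singles.push_back(new T);
        s.single_new = std::min(s.single_new, alignment_of_address(singles.back()));
        arrays.push_back(new T[3]);
        s.array_new = std::min(s.array_new, alignment_of_address(arrays.back()));
        vectors.push_back(std::vector<T>(3));
        s.in_vector = std::min(s.in_vector,
                               alignment_of_address(vectors.back().data()));
    }
    for (T* p : singles) delete p;
    for (T* p : arrays) delete[] p;
    return s;
}

// Prints one summary line for type T.
template <typename T>
void print_survey(const char* label, int runs) {
    const Survey s = survey<T>(runs);
    std::printf("%s: stack %zu, new %zu, new[] %zu, std::vector %zu\n",
                label, s.on_stack, s.single_new, s.array_new, s.in_vector);
}
```

From main one would call, for instance, print_survey<Padded<16> >("16", 8), then repeat with 64 and 1024, and compare each reported minimum against the requested alignment on every compiler/library/OS combination of interest.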


Christoph

--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
----------------------------------------------






