2016-02-01 17:03 GMT-05:00 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
Hi,
This email is mostly intended to Eigen's developers, but any users
concerned about Eigen's dynamic aligned memory allocation mechanism is
welcome to share its thoughts and expertise.
As discussed in bugs 761 and 779, our dynamic memory allocation mechanism
would deserve some cleaning.
Wouldn't both of these bugs be averted if we dropped
EIGEN_MALLOC_ALREADY_ALIGNED ?
I understand that it tends to perform best because it makes us hit the
most common case in the memory allocator. But like I tried to express in
this thread,
http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2014/12/msg00059.html
it relies on the unsafe assumption that the memory allocator is a property
of the platform and can thus be inferred from preprocessor checks.
That's not the case: many large applications use their own memory
allocator, and that's a runtime property, not something that Eigen could
detect at compile time.
(Here I need to pause to acknowledge responsibility in this problem that
I'm describing!)
So while I understand that for many users EIGEN_MALLOC_ALREADY_ALIGNED is
in fact safe and performs best, it is on the whole an unsafe optimization.
Maybe it should not be the default? How about keeping it around (if you
wish) as a non-default optimization option.
Another thought: do any performance differences here actually matter
outside of micro-benchmarks, and if yes, is that a sign of a problem that
we could try to address at a different level, trying to change either
implementation of interface to reduce allocator overhead?
Cheers,
Benoit
As you can see there [3], the current logic is rather complicated and
thus prone to bugs and it is difficult to predict what will be used under
the hood.
I did some testing on both Linux and OSX, and I found that both
posix_memalign and _mm_malloc are slightly slower than our
handmade_aligned_malloc which is as fast as the non-aligned std::malloc
(probably thanks to inlining). I observed almost a factor 2 for 32 bytes
alignment.
Regarding memory overhead, there is no winners for 32 bytes alignment,
but for 16 byes alignment, std::malloc is clearly better as it is already
aligned on most systems.
With that in mind, we can remove both the posix_memalign and _mm_malloc
paths without any regrets.This means that on Unix, std::malloc will never
be bypassed anymore.
It remains the question of windows and its _aligned_malloc routine that I
did not benched. For consistency with unix systems, I would propose to
remove that path too.
Then, I also propose to keep EIGEN_MALLOC_ALREADY_ALIGNED as it is
clearly better regarding memory consumption. In case of trouble, one can
still compile with -DEIGEN_MALLOC_ALREADY_ALIGNED=0 to bypass the default
assumptions (e.g. if using a custom non-aligned malloc). Nonetheless, to be
future proof, I propose to extend this notion of
EIGEN_MALLOC_ALREADY_ALIGNED to something more general as
EIGEN_MALLOC_DEFAULT_ALIGNMENT whose value would be the default alignment
in bytes (e.g., 8, 16, 32?).
Finally, to reduce the overhead of handmade_aligned_malloc for tiny
buffers, I propose to disable the alignment of buffers smaller than the
requested alignment. This can easily be done by passing the buffer size
to aligned_free().
Those changes should be the latest major ones before Eigen 3.3 beta2 and
the RCs...
cheers,
Gael.
[1] http://eigen.tuxfamily.org/bz/show_bug.cgi?id=761
[2] http://eigen.tuxfamily.org/bz/show_bug.cgi?id=779
[3]
https://bitbucket.org/eigen/eigen/src/d27219db093c1dc9a09aac0e429b90212318d75c/Eigen/src/Core/util/Memory.h?at=default&fileviewer=file-view-default#Memory.h-212