Re: [eigen] AVX support and matrix product updates

2014-04-23 7:35 GMT-04:00 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:

Hi,

The support for AVX in Benoit Steiner's branch [1] is now well advanced, and I think it's time to merge it back into the main devel branch. See below for some details. Before doing so, it would be nice if one or two warriors could test it on their code base to check that there are no obvious regressions.

This AVX branch also includes our recent efforts to improve the dense matrix product kernel for recent architectures, and the results are promising ;) [2,3]. Based on early benchmarks, the regression on old architectures (e.g., Core2) should be quite limited, but I have no clue about ARM/NEON and I cannot test this platform myself. AltiVec is likely to be broken because it has not been tested at all.

Regarding AVX, Eigen currently does not know how to automatically fall back to smaller packets when they would be advantageous. Therefore, enabling AVX disables the vectorization of, for instance, Vector4f or Vector2d. This is only a limitation for people mixing small and large matrices in the same compilation unit. Fixing this before evaluators are in place would be pointless, and I think this behavior is fine as a first step.
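To make the missing fallback concrete, here is a minimal sketch of what choosing a smaller packet could look like. The helper name pick_packet_size and the halving rule are assumptions made for illustration; this is not code from Eigen or from the AVX branch.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Eigen: starting from the widest packet the
// target offers (counted in scalars), halve the width until it evenly divides
// the fixed vector size. A result of 1 means plain scalar code.
constexpr std::size_t pick_packet_size(std::size_t vector_size,
                                       std::size_t widest_packet) {
    return (widest_packet <= 1)                 ? 1
         : (vector_size % widest_packet == 0)   ? widest_packet
         : pick_packet_size(vector_size, widest_packet / 2);
}

// With AVX the widest float packet holds 8 scalars and the widest double
// packet holds 4. A fallback of this kind would select 128-bit packets for
// the two types mentioned above.
static_assert(pick_packet_size(4, 8) == 4, "Vector4f: one 4-float packet");
static_assert(pick_packet_size(2, 4) == 2, "Vector2d: one 2-double packet");
static_assert(pick_packet_size(3, 8) == 1, "3 floats: no packet fits evenly");
```

Without a selection step like this, the packet width is fixed by the enabled instruction set, which is why a 4-float vector ends up unvectorized once 8-float packets become the default.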

Let me also say that with gcc's default ABI, __m128 and __m256 are mangled identically, making it impossible to have overloads for both types. As gcc tells us in its error message, the solution is to compile with both "-mavx" and "-fabi-version=4". However, I preferred to work around this annoying error by introducing a small wrapper class for packet types (see https://bitbucket.org/benoitsteiner/eigen/commits/5b40cf4672801826f2bb7b816528a00811cacc9d#chg-Eigen/src/Core/arch/SSE/PacketMath.h)
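The wrapper idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked commit: the name PacketWrapper, the Tag parameter, and the packet_size overloads are hypothetical, and std::array stands in for __m128 and __m256 so that the sketch compiles without "-mavx".

```cpp
#include <array>

// Stand-ins for the raw SIMD types (assumption: real code would use __m128
// and __m256 from <immintrin.h>).
typedef std::array<float, 4> Raw128;
typedef std::array<float, 8> Raw256;

// Hypothetical wrapper: a distinct class type around a raw vector. Overloads
// then resolve on the class type; the extra Tag keeps the two instantiations
// distinct even if the raw types were indistinguishable to the compiler.
template <typename Raw, int Tag>
struct PacketWrapper {
    Raw value;
    constexpr PacketWrapper() : value() {}
    constexpr PacketWrapper(const Raw& v) : value(v) {}
    // Implicit conversions so the wrapper can be used where the raw type is expected.
    operator Raw&() { return value; }
    operator const Raw&() const { return value; }
};

typedef PacketWrapper<Raw128, 4> Packet4f;
typedef PacketWrapper<Raw256, 8> Packet8f;

// Two overloads that coexist because the wrapper types are distinct classes.
constexpr int packet_size(const Packet4f&) { return 4; }
constexpr int packet_size(const Packet8f&) { return 8; }

static_assert(packet_size(Packet4f()) == 4, "128-bit overload selected");
static_assert(packet_size(Packet8f()) == 8, "256-bit overload selected");
```

The trade-off is a thin layer of conversions in exchange for not requiring users to pass an extra ABI flag on their compile line.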

cheers,
Gael

[1] AVX branch: https://bitbucket.org/benoitsteiner/eigen
[2] sgemm on Haswell (AVX+FMA): http://download.tuxfamily.org/eigen/gemm_float_Haswell.pdf
[3] dgemm on Haswell (AVX+FMA): http://download.tuxfamily.org/eigen/gemm_double_Haswell.pdf