[eigen] 32 byte alignment for avx


On 11/26/2011 05:50 PM, Benoit Jacob wrote:
> We never align to more than 16 bytes, so requiring higher alignment than that is wrong.

Has anyone talked about aligning to 32 byte boundaries for AVX SIMD?

For those who don't know: starting with Sandy Bridge, Intel processors have a new flavor of SIMD, AVX, which uses 256-bit registers. This lets us apply the same operation to 8 floats at once, versus 4 floats at once with SSE through SSE4.2. The instruction set is structured so that future processor generations may use 512 bits or beyond.
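
To make the arithmetic concrete: a 256-bit register is 32 bytes, which holds 8 single-precision floats, and the aligned AVX load/store intrinsics (_mm256_load_ps, _mm256_store_ps) expect addresses that are multiples of 32. Below is a rough sketch of what that looks like in C++. The names Packet8f, is_aligned and add are made up for illustration and are not Eigen types or functions; the intrinsics are only used when the compiler defines __AVX__ (e.g. with -mavx), otherwise a plain loop stands in.

```cpp
#include <cstddef>
#include <cstdint>
#ifdef __AVX__
#include <immintrin.h>  // AVX intrinsics, only available when AVX is enabled
#endif

// 256 bits = 32 bytes = 8 floats per AVX register.
constexpr std::size_t kAvxBytes = 32;
constexpr std::size_t kFloatsPerAvx = kAvxBytes / sizeof(float);

// Hypothetical packet type (not part of Eigen): 8 floats on a 32-byte boundary.
struct alignas(32) Packet8f {
    float v[8];
};

// True if p is a multiple of n bytes.
inline bool is_aligned(const void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Element-wise sum of two packets.
inline Packet8f add(const Packet8f& a, const Packet8f& b) {
    Packet8f r;
#ifdef __AVX__
    // Aligned loads/stores: these require 32-byte aligned addresses.
    _mm256_store_ps(r.v, _mm256_add_ps(_mm256_load_ps(a.v), _mm256_load_ps(b.v)));
#else
    // Scalar fallback so the sketch still builds without AVX.
    for (std::size_t i = 0; i < kFloatsPerAvx; ++i) {
        r.v[i] = a.v[i] + b.v[i];
    }
#endif
    return r;
}
```

With data that is only 16-byte aligned, the aligned intrinsics above would not be safe to use; the unaligned variants (_mm256_loadu_ps, _mm256_storeu_ps) would be needed instead.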

I've done a bit of AVX work for my job, and the speedups are impressive: not quite the 2x that the doubled register width would suggest, but often 30-60% on CPU-intensive problems.

-- Mark
