Re: [eigen] Matrix multiplication much slower on MSVC than on g++/clang


Hi Patrik,

On 2/8/2018 3:08 PM, Patrik Huber wrote:
I think this is incorrect though. I think /fp:fast is not needed for MSVC to generate FMA code. Also, gcc and clang can generate FMA code without -ffast-math (which I guess is sort of equivalent to /fp:fast).

Using /fp:fast is not necessary for the intrinsics, but without it, I can't get this to generate a vfmadd instruction:
// Test with: cl /Fa /O2 /arch:AVX2 /fp:fast foo.cpp
// Generates foo.exe and foo.asm

float mul_add(float a, float b, float c) {
    // With /fp:fast the compiler may contract this into a single vfmadd.
    return a*b + c;
}

int main() {
    return 0;
}
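
For comparison, here is a sketch of a variant that asks for the fused operation explicitly instead of relying on the floating-point mode. The name mul_add_fused is made up for illustration. std::fma from <cmath> guarantees a single rounding of a*b + c, but whether a given compiler turns it into an inline vfmadd rather than a library call is an assumption you would have to check in the generated .asm:

```cpp
// Compile only and emit a listing with: cl /c /Fa /O2 /arch:AVX2 fma.cpp
// (no /fp:fast here; inspect fma.asm to see what was actually generated)
#include <cmath>

// Hypothetical helper, same shape as mul_add above.
// std::fma computes a*b + c with one rounding at the end,
// independent of the compiler's floating-point model.
float mul_add_fused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

Note that this is not identical to the a*b + c version: the fused result can differ in the last bit from the separately rounded multiply and add, which is presumably why the compiler does not contract the expression on its own under the default floating-point model.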

Best regards,
