Re: [eigen] Matrix multiplication much slower on MSVC than on g++/clang

Hi Patrik,

On 2/8/2018 3:08 PM, Patrik Huber wrote:
> I think this is incorrect though. I think /fp:fast is not needed for MSVC to generate FMA code. Also, gcc and clang can generate FMA code without -ffast-math (which I guess is sort of equivalent to /fp:fast).


Using /fp:fast is not necessary for the intrinsics, but without it, I can't get this to generate a vfmadd instruction:
=========
//foo.cpp
//
// Test with: cl /Fa /O2 /arch:AVX2 /fp:fast foo.cpp
// Generates foo.exe and foo.asm

float mul_add(float a, float b, float c) {
    return a*b + c;
}

int main()
{
    return 0;
}
=========

Best regards,
-Edward


