Re: [eigen] Matrix multiplication much slower on MSVC than on g++/clang
Hi Patrik,
On 2/8/2018 3:08 PM, Patrik Huber wrote:
> I think this is incorrect though. I think /fp:fast is not needed for MSVC to generate FMA code. Also, gcc and clang can generate FMA code without -ffast-math (which I guess is sort-of equivalent to /fp:fast).
Using /fp:fast is not necessary for the intrinsics, but without it I can't get this to generate a vfmadd instruction:
=========
// foo.cpp
//
// Test with: cl /Fa /O2 /arch:AVX2 /fp:fast foo.cpp
// Generates foo.exe and foo.asm
float mul_add(float a, float b, float c) {
    return a*b + c;
}

int main()
{
    return 0;
}
=========
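For comparison, here is a minimal sketch of requesting the fused operation explicitly instead of relying on the compiler to contract a*b + c. It uses std::fma from <cmath> (standard since C++11); the file name and the function name mul_add_fused are hypothetical and only for illustration. Whether std::fma compiles to a single vfmadd instruction or to a library call depends on the compiler and flags, so the generated assembly would still need to be checked.

```cpp
// fma_sketch.cpp (hypothetical file name, for illustration only)
#include <cmath>

// Unfused form, as in foo.cpp: the compiler may or may not contract
// this into a single FMA, depending on the floating-point flags.
float mul_add(float a, float b, float c) {
    return a * b + c;
}

// Explicitly fused form: std::fma computes a*b + c with a single
// rounding at the end, regardless of the contraction settings.
float mul_add_fused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

The two forms are not numerically identical, which is presumably why a compiler is cautious about contracting on its own. With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24, so the fused version returns 2^-24, while a version that rounds the product to float before adding returns 0.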
Best regards,
-Edward