Re: [eigen] Matrix multiplication much slower on MSVC than on g++/clang



On 2/9/2018 3:16 AM, Gael Guennebaud wrote:

    That works! For detection, the documentation at
    https://msdn.microsoft.com/en-us/library/b0084kay.aspx suggests that
    perhaps this will work:

    #if defined(_MSC_VER) && defined(__AVX2__)
    #define __FMA__
    #endif

To implement that, we need to make sure that AVX2 => FMA holds on all architectures. This seems to be true for Intel processors, but I'm not sure about AMD.
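For illustration only, here is a minimal sketch of a more defensive variant of the detection above. It leaves the compiler-reserved __FMA__ untouched and records the result in a separate macro, so a compiler that already defines __FMA__ does not see a redefinition. EXAMPLE_HAS_FMA and compiled_with_fma() are hypothetical names invented for this sketch, not Eigen identifiers, and the MSVC branch still relies on the AVX2 => FMA assumption discussed here.

```cpp
// EXAMPLE_HAS_FMA is a hypothetical, project-local macro (not part of Eigen).
#if defined(__FMA__)
// g++ and clang define __FMA__ themselves when FMA code generation is enabled.
#define EXAMPLE_HAS_FMA 1
#elif defined(_MSC_VER) && defined(__AVX2__)
// MSVC with /arch:AVX2: ASSUMES that every AVX2-capable target CPU also has FMA.
#define EXAMPLE_HAS_FMA 1
#else
#define EXAMPLE_HAS_FMA 0
#endif

// Compile-time query usable from ordinary C++ code.
constexpr bool compiled_with_fma() { return EXAMPLE_HAS_FMA != 0; }
```

Code that wants the FMA path would then test EXAMPLE_HAS_FMA (or call compiled_with_fma()) instead of testing __FMA__ directly.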


According to https://stackoverflow.com/questions/16348909/how-do-i-know-if-i-can-compile-with-fma-instruction-sets , all AMD processors which support AVX2 also support FMA. Unfortunately, I couldn't easily confirm this through official online resources. The Wikipedia page on Advanced_Vector_Extensions notes that only AMD Excavator processors (and up) support AVX2, and those definitely support FMA (double-checked at https://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf).
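As a sanity check on a particular machine, the AVX2 => FMA claim can also be tested at run time. The sketch below is only an illustration under stated assumptions: CpuFeatures, query_cpu_features() and avx2_implies_fma() are names made up for this example. It relies on the documented CPUID feature bits (leaf 1, ECX bit 12 for FMA; leaf 7, subleaf 0, EBX bit 5 for AVX2) via the MSVC intrinsics __cpuid/__cpuidex, and on __builtin_cpu_supports with g++/clang. The MSVC branch reads the raw CPU bits only and does not check whether the OS has enabled AVX state. On non-x86 targets it reports both features as absent.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>  // __cpuid, __cpuidex
#endif

// Hypothetical helper type for this sketch.
struct CpuFeatures {
  bool avx2;
  bool fma;
};

// Query the running CPU; returns {false, false} on unsupported targets.
inline CpuFeatures query_cpu_features() {
  CpuFeatures f = {false, false};
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  int regs[4];  // EAX, EBX, ECX, EDX
  __cpuid(regs, 0);
  const int max_leaf = regs[0];
  if (max_leaf >= 1) {
    __cpuid(regs, 1);
    f.fma = (regs[2] & (1 << 12)) != 0;   // leaf 1, ECX bit 12
  }
  if (max_leaf >= 7) {
    __cpuidex(regs, 7, 0);
    f.avx2 = (regs[1] & (1 << 5)) != 0;   // leaf 7, subleaf 0, EBX bit 5
  }
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
  f.avx2 = __builtin_cpu_supports("avx2") != 0;
  f.fma = __builtin_cpu_supports("fma") != 0;
#endif
  return f;
}

// True unless the CPU reports AVX2 without FMA.
inline bool avx2_implies_fma(const CpuFeatures& f) {
  return !f.avx2 || f.fma;
}
```

Running avx2_implies_fma(query_cpu_features()) on a range of AMD and Intel machines would give empirical support for the assumption, though it cannot prove it for every processor.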

-Edward


