Re: [eigen] Matrix multiplication much slower on MSVC than on g++/clang
On 2/9/2018 3:16 AM, Gael Guennebaud wrote:
That works! For detection, the documentation at https://msdn.microsoft.com/en-us/library/b0084kay.aspx suggests that perhaps this will work:

    #if defined(_MSC_VER) && defined(__AVX2__)
    #define __FMA__
    #endif

To implement that we need to make sure that on all architectures AVX2 => FMA. This seems to be true for Intel's ones, but I'm not sure about AMD.
According to https://stackoverflow.com/questions/16348909/how-do-i-know-if-i-can-compile-with-fma-instruction-sets , all AMD processors that support AVX2 also support FMA. Unfortunately, I couldn't easily confirm this through official online resources. The Wikipedia page on Advanced_Vector_Extensions notes that only AMD Excavator processors (and up) support AVX2, and those definitely support FMA (double-checked at https://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf).
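For illustration, here is a minimal sketch of how the detection could be written without defining the compiler-reserved __FMA__ macro directly. The names EXAMPLE_HAS_FMA and example_has_fma are hypothetical and are not part of Eigen; the sketch simply assumes, as discussed above, that AVX2 implies FMA when building with MSVC.

```cpp
// Hypothetical detection macro (EXAMPLE_HAS_FMA is an illustrative name only).
// g++ and clang define __FMA__ themselves when FMA code generation is enabled
// (e.g. with -mfma). For MSVC, this sketch ASSUMES that AVX2 implies FMA and
// keys off __AVX2__, which MSVC defines under /arch:AVX2.
#if defined(__FMA__)
  #define EXAMPLE_HAS_FMA 1
#elif defined(_MSC_VER) && defined(__AVX2__)
  #define EXAMPLE_HAS_FMA 1
#else
  #define EXAMPLE_HAS_FMA 0
#endif

// Compile-time query, usable in static_assert or to select a code path.
constexpr bool example_has_fma() { return EXAMPLE_HAS_FMA != 0; }
```

A dedicated macro avoids redefining an identifier that the compiler itself owns, at the cost of having to test the new name wherever __FMA__ is tested today.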
-Edward