Re: [eigen] 3.3-beta2 released!



Thank you for the detailed numbers, that's very helpful.

It is quite surprising that unaligned vectorization is slowing down the execution, though. What's your CPU? And what happens if you completely disable explicit vectorization (-DEIGEN_DONT_VECTORIZE)?

gael

On Wed, Jul 27, 2016 at 7:35 PM, <Daniel.Vollmer@xxxxxx> wrote:
Hello,

a small update: the slowdown from 3.2.9 to 3.3-beta2 in my case seems to be entirely down to the use of unaligned vectorisation. If I turn that off with -DEIGEN_UNALIGNED_VECTORIZE=0, then 3.3 performs the same as (or very, very slightly faster than) 3.2.9. Compile times for 3.3 did increase noticeably, though (e.g. our code takes 1m52.533s to build with 3.2.9 and 2m17.825s with 3.3).
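To put the two build times on a common scale, here is a minimal sketch that converts `time`-style durations to seconds and computes the relative increase. The helper names `to_seconds` and `relative_increase` are hypothetical and only serve this illustration; the two durations are the ones quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts a `time`-style duration such as
// "1m52.533s" into seconds. Assumes the "<minutes>m<seconds>s" layout.
inline double to_seconds(const std::string& text) {
    const std::size_t m = text.find('m');
    assert(m != std::string::npos);
    const double minutes = std::stod(text.substr(0, m));
    // std::stod stops parsing at the trailing 's'.
    const double seconds = std::stod(text.substr(m + 1));
    return 60.0 * minutes + seconds;
}

// Hypothetical helper: relative increase of `after` over `before`,
// as a fraction (0.10 means 10% longer).
inline double relative_increase(double before, double after) {
    return (after - before) / before;
}
```

Called as `relative_increase(to_seconds("1m52.533s"), to_seconds("2m17.825s"))`, this gives a build that is roughly 22% longer (112.533 s versus 137.825 s).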


Best regards

Daniel Vollmer

--------------------------
Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
German Aerospace Center
Institute of Aerodynamics and Flow Technology | Lilienthalplatz 7 | 38108 Braunschweig | Germany

Daniel Vollmer | AS C²A²S²E
www.DLR.de

________________________________________
From: Vollmer, Daniel
Sent: Wednesday, 27 July 2016 17:48
To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: RE: [eigen] 3.3-beta2 released!

Hi,

thanks for everyone's efforts. The detailed changelog and release notes are very helpful.

I've tried out our code with Eigen 3.3-beta2, and (with some fixes to unsupported/AutoDiffScalar and some massaging around clang) it now compiles. :)

Using Eigen 3.3-beta2 instead of 3.2.9 results in a slow-down of about 15% with g++-6.1 and of about 10% with clang (Apple LLVM version 7.3.0, clang-703.0.31). Both were compiled with -Ofast and -DNDEBUG.

We don't do anything fancy in our CFD code: mainly small, fixed-size (e.g. 5x5 / 5x1) matrix and vector products, occasionally hard-coded specific matrix decompositions, and a fair amount of direct element access (either single coeff, or row / col / segment based).
Unfortunately, I find it quite difficult to extract helpful (or actionable) profiles to see what sort of changes may be causing the differences for us. Our code (like many C++ codes) is quite sensitive to inlining decisions by the compiler.
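When whole-program profiles are hard to read, one option is to time a single small kernel in isolation and rebuild it against each Eigen version and flag set. The following is only a sketch under stated assumptions: it uses `std::array` stand-ins so that it builds without Eigen, the helper `seconds_per_call` is hypothetical, and `double` as the scalar type is an assumption. With Eigen, the stand-ins would be replaced by `Eigen::Matrix<double, 5, 5>` and `Eigen::Matrix<double, 5, 1>`.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Stand-in types (assumption: double scalars). With Eigen these would be
// Eigen::Matrix<double, 5, 5> and Eigen::Matrix<double, 5, 1>.
using Mat5 = std::array<std::array<double, 5>, 5>;
using Vec5 = std::array<double, 5>;

// Plain 5x5 * 5x1 product, the kind of small fixed-size kernel in question.
inline Vec5 multiply(const Mat5& a, const Vec5& x) {
    Vec5 y{};
    for (std::size_t i = 0; i < 5; ++i)
        for (std::size_t j = 0; j < 5; ++j)
            y[i] += a[i][j] * x[j];
    return y;
}

// Hypothetical helper: average wall-clock seconds per call of `kernel`.
// The kernel receives the iteration index so the caller can vary its
// inputs, and its result is accumulated in a volatile sink so that the
// result stays observable to the optimizer.
template <class Kernel>
double seconds_per_call(Kernel&& kernel, int iterations) {
    assert(iterations > 0);
    volatile double sink = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + kernel(i);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / iterations;
}
```

A kernel such as `[&](int i) { Vec5 x; x.fill(double(i)); return multiply(a, x)[0]; }` can then be timed once per build configuration. Numbers for kernels this small are noisy and still depend on inlining, so they are a hint rather than a replacement for measuring the full code.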


Best regards

Daniel Vollmer




