RE: [eigen] 3.3-beta2 released!
• To: <eigen@xxxxxxxxxxxxxxxxxxx>
• Subject: RE: [eigen] 3.3-beta2 released!
• From: <Daniel.Vollmer@xxxxxx>
• Date: Thu, 28 Jul 2016 10:46:58 +0000

```
Hi Gael,

> With float I get a nearly x2 speedup for the above 5x5 matrix-vector
> products (compared to 3.2), and x1.4 speedup with double.

I tried out this version (ca9bd08); the results are below.
Note: the explicit solver does almost nothing but residual evaluations,
whereas the implicit solver does a residual evaluation, followed by a
Jacobian computation (using AutoDiffScalar), and then a block-based
Gauss-Jacobi iteration (with 5x5 matrix blocks) that approximately
solves a linear system built from the Jacobian and the residual.

Explicit solver:
----------------
eigen-3.3-ca9bd08                 10.9s =>  9% slower
eigen-3.3-beta2                   11.1s => 11% slower
eigen-3.3-beta2 UNALIGNED_VEC=0   10.0s => same as baseline
eigen-3.2.9                       10.0s => baseline

Implicit solver:
----------------
eigen-3.3-ca9bd08                 34.2s =>  6% faster
eigen-3.3-beta2                   37.5s =>  3% slower
eigen-3.3-beta2 UNALIGNED_VEC=0   38.2s =>  5% slower
eigen-3.2.9                       36.5s => baseline

So the change definitely helps the implicit solver (which does lots
of 5x5 by 5x1 double multiplies), but for the explicit solver the
overhead of unaligned vectorization doesn't pay off. Maybe the use of
3D vectors (which are used for geometric normals and coordinates) is
problematic because it's such a borderline case for vectorization?

What I don't quite understand is the difference between 3.2.9 (which
doesn't vectorize the given matrix sizes) and 3.3-beta2 without
unaligned vectorization (UNALIGNED_VEC=0) in the implicit solver
(38.2s vs. 36.5s): something in 3.3 is slower under those conditions.
It may not be the matrix-vector multiplies, though; AutoDiffScalar
could also have become slower.

Best regards

Daniel Vollmer

--------------------------
Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
German Aerospace Center
Institute of Aerodynamics and Flow Technology | Lilienthalplatz 7 | 38108 Braunschweig | Germany

Daniel Vollmer | AS C²A²S²E
www.DLR.de

```
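
The block-based Gauss-Jacobi iteration mentioned in the message can be sketched roughly as follows. This is a minimal illustration in plain C++ (no Eigen, so it compiles on its own) and not the solver's actual code. The dense block storage, the names `Vec5`, `Mat5`, `solve5` and `block_jacobi_sweep`, and the use of Gaussian elimination for the diagonal blocks are all assumptions made for the example; a real flow solver would most likely store only the nonzero blocks.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t B = 5;  // block size, matching the 5x5 blocks in the post
using Vec5 = std::array<double, B>;
using Mat5 = std::array<std::array<double, B>, B>;
using BlockVector = std::vector<Vec5>;
using BlockMatrix = std::vector<std::vector<Mat5>>;  // dense storage (assumption)

// y -= A * x for a single 5x5 block and 5x1 vector.
inline void sub_mul(const Mat5& A, const Vec5& x, Vec5& y) {
    for (std::size_t i = 0; i < B; ++i)
        for (std::size_t j = 0; j < B; ++j)
            y[i] -= A[i][j] * x[j];
}

// Solve A x = b for one block by Gaussian elimination with partial pivoting.
// A and b are taken by value because they are overwritten during elimination.
inline Vec5 solve5(Mat5 A, Vec5 b) {
    for (std::size_t k = 0; k < B; ++k) {
        std::size_t p = k;  // row with the largest pivot candidate in column k
        for (std::size_t i = k + 1; i < B; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        assert(A[p][k] != 0.0 && "singular diagonal block");
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < B; ++i) {
            const double f = A[i][k] / A[k][k];
            for (std::size_t j = k; j < B; ++j) A[i][j] -= f * A[k][j];
            b[i] -= f * b[k];
        }
    }
    Vec5 x{};
    for (std::size_t i = B; i-- > 0;) {  // back substitution
        double s = b[i];
        for (std::size_t j = i + 1; j < B; ++j) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
    return x;
}

// One block Jacobi sweep:
//   x_new[i] = A_ii^{-1} * (b[i] - sum over j != i of A_ij * x_old[j])
// Only values from the previous iterate x_old are used (Jacobi, not Gauss-Seidel).
inline BlockVector block_jacobi_sweep(const BlockMatrix& A,
                                      const BlockVector& b,
                                      const BlockVector& x_old) {
    const std::size_t n = b.size();
    BlockVector x_new(n);
    for (std::size_t i = 0; i < n; ++i) {
        Vec5 rhs = b[i];
        for (std::size_t j = 0; j < n; ++j)
            if (j != i) sub_mul(A[i][j], x_old[j], rhs);
        x_new[i] = solve5(A[i][i], rhs);
    }
    return x_new;
}
```

Calling `block_jacobi_sweep` a fixed number of times gives the kind of approximate solve described in the message. The inner loop is dominated by 5x5 by 5x1 products, which is why the speed of small fixed-size matrix-vector multiplies matters for the implicit solver.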

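The percentages in the timing tables appear to be relative differences to the 3.2.9 baseline, rounded to whole percent. A small sketch of that arithmetic follows; the helper names `percent_vs_baseline` and `describe` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <string>

// Relative change of a timing against a baseline, in percent.
// Positive means slower than the baseline, negative means faster.
inline double percent_vs_baseline(double seconds, double baseline_seconds) {
    assert(baseline_seconds > 0.0);
    return 100.0 * (seconds - baseline_seconds) / baseline_seconds;
}

// Round to whole percent and phrase the result like the table entries.
inline std::string describe(double seconds, double baseline_seconds) {
    const long pct = std::lround(percent_vs_baseline(seconds, baseline_seconds));
    if (pct == 0) return "same as baseline";
    return std::to_string(std::labs(pct)) + (pct > 0 ? "% slower" : "% faster");
}
```

For example, `describe(34.2, 36.5)` evaluates (34.2 - 36.5) / 36.5, about -6.3%, and reports it as 6% faster, which matches the implicit-solver entry for ca9bd08.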