Re: [eigen] Optimization advice for a specific expression


On 2016-02-05 13:36, Alberto Luaces wrote:
| Eigen version | General algorithm | Hand-coded algorithm |
| 3.2.7         | 0.10s             | 0.04s                |
| 3.3-beta1     | 0.21s             | 0.15s                |

I am attaching a minimal test case for reference.  The bottleneck lies
in the function InertiaTensor::addFace().  The data in the table were
computed with the compilation flags "-O3 -DNDEBUG".  Eigen 3.3-beta1
reports its version as "3.2.92".

That regression definitely does not look good. On my machine, I can only confirm the regression for the hand-coded version, however. The reason appears to be a call to
which is not inlined. I was able to fix that by adding lots of EIGEN_STRONG_INLINE in src/Core/AssignEvaluator.h

@Gael, can you confirm? Or is it better to use EIGEN_ALWAYS_INLINE here?
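
For readers unfamiliar with the distinction behind that question, here is a minimal sketch of a plain inlining hint versus a forced-inline request. The macros SKETCH_HINT_INLINE and SKETCH_FORCE_INLINE and both functions are hypothetical names made up for this illustration; they are not Eigen's definitions of EIGEN_STRONG_INLINE or EIGEN_ALWAYS_INLINE, which should be checked in Eigen's own Macros.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macros (assumption, not copied from Eigen): a plain hint that
// the compiler may ignore, and a request that forces inlining where supported.
#if defined(_MSC_VER)
  #define SKETCH_HINT_INLINE  inline
  #define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
  #define SKETCH_HINT_INLINE  inline
  #define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
  #define SKETCH_HINT_INLINE  inline
  #define SKETCH_FORCE_INLINE inline
#endif

// Small fixed-size kernel: dst[i] += a[i] * b[i] for N coefficients.
// If a kernel like this is left as a real call inside a hot loop, the call
// overhead can dominate, so inlining is requested explicitly.
template <std::size_t N>
SKETCH_FORCE_INLINE void assign_coeffs(double* dst, const double* a,
                                       const double* b) {
  for (std::size_t i = 0; i < N; ++i) dst[i] += a[i] * b[i];
}

// Hot loop standing in for something like addFace(): calls the kernel once
// per face and returns the sum of the three accumulated coefficients.
SKETCH_HINT_INLINE double accumulate_faces(const double* a, const double* b,
                                           std::size_t faces) {
  assert(a != nullptr && b != nullptr);
  double acc[3] = {0.0, 0.0, 0.0};
  for (std::size_t f = 0; f < faces; ++f)
    assign_coeffs<3>(acc, a + 3 * f, b + 3 * f);
  return acc[0] + acc[1] + acc[2];
}
```

Whether the forced variant actually helps has to be measured; comparing the generated assembly (for example with -S) for both macro choices shows whether the kernel call disappears from the loop.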

Other than that, your code is still not optimal regarding vectorization. Partially that is Eigen's "fault", but it is quite hard to decide automatically what can be vectorized efficiently.
G.template leftCols<3>() * G.template leftCols<3>().transpose() + w * w.transpose();
gets vectorized, whereas the following does not:
G.template leftCols<3>() * G.template topRightCorner<3,3>().transpose() + w * w.template head<3>().transpose();
OTOH, your version is not vectorizable (without making the vectorization logic extremely complicated), since G.block<3,3>() will not be accessed packet-wise.
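
One possible way to picture the packet-wise access problem is sketched below. It assumes, purely for illustration, that G is stored column-major with 4 rows and that a SIMD packet holds 2 doubles; neither value is stated in this thread, and all names (kRows, kPacket, offset, column_fits_whole_packets) are hypothetical. Under those assumptions a full column of 4 coefficients splits into whole packets, while a column of a 3-row block does not.

```cpp
#include <cassert>
#include <cstddef>

// Assumptions for this sketch only (not taken from the thread):
constexpr std::size_t kRows = 4;    // rows of the column-major storage of G
constexpr std::size_t kPacket = 2;  // doubles per SIMD packet (e.g. SSE2)

// Linear offset of coefficient (i, j) in column-major storage with kRows rows.
constexpr std::size_t offset(std::size_t i, std::size_t j) {
  return i + j * kRows;
}

// True if a column segment of `rows` coefficients can be traversed using
// whole packets only, i.e. without a scalar remainder in every column.
constexpr bool column_fits_whole_packets(std::size_t rows) {
  return rows % kPacket == 0;
}

// Usage checks for the two block shapes discussed above.
inline void check_layout_sketch() {
  // leftCols<3>() keeps all 4 rows: each column is 2 whole packets.
  assert(column_fits_whole_packets(4));
  // block<3,3>() keeps 3 rows: one packet plus one leftover coefficient.
  assert(!column_fits_whole_packets(3));
  // Consecutive block columns start kRows apart, so the 3-row columns are
  // also not contiguous in memory (offsets 0..2, then 4..6).
  assert(offset(0, 1) - offset(2, 0) == 2);
}
```

This is only a model of the memory layout; it does not reproduce how Eigen's assignment logic decides on vectorization.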


 Dipl. Inf., Dipl. Math. Christoph Hertzberg

 Universität Bremen
 FB 3 - Mathematik und Informatik
 AG Robotik
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

 Switchboard: +49 421 178 45-6611

 Visiting address of the branch office:
 Robert-Hooke-Straße 5
 28359 Bremen, Germany

 Tel.:      +49 421 178 45-4021
 Reception: +49 421 178 45-6600
 Fax:       +49 421 178 45-4150
 E-Mail:  chtz@xxxxxxxxxxxxxxxxxxxxxxxx

