Re: [eigen] Optimization advice for a specific expression



You can avoid one temporary with:

I4x4.noalias() += ...;

The slowdown might then come from the compiler doing a bad job at inlining. Which compiler and version are you using? What are the exact compilation flags?

Also, please paste the complete declaration of call_dense_assignment_loop.


On Thu, Feb 4, 2016 at 2:10 PM, Alberto Luaces <alberto.luaces@xxxxxxxxx> wrote:
Sorry about the subject, since this question is about a very specific expression.

I have written a program that performs only as well as (or even a bit worse
than!) another algorithm that is theoretically expected to be an order of
magnitude slower.

The bottleneck is

I4x4 += G.determinant() * G * E * G.transpose();

where I4x4, G and E are Eigen::Matrix4d.

- E is constant and symmetric.
- G is homogeneous.
- I4x4 is therefore also symmetric.
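
The symmetry claim in the list above can be checked in one line. This uses only the stated symmetry of E, and additionally assumes that I4x4 starts out symmetric (for example, zero):

```latex
\[
\bigl(\det(G)\, G E G^{T}\bigr)^{T}
  = \det(G)\, (G^{T})^{T} E^{T} G^{T}
  = \det(G)\, G E G^{T}
  \qquad \text{since } E^{T} = E .
\]
```

Each added term is therefore symmetric, and a sum of symmetric matrices stays symmetric, so only the 10 entries on and above the diagonal of I4x4 are independent.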

The profiler shows in "release" mode that most of the time is spent in

Eigen::internal::call_dense_assignment_loop<Eigen::Matrix<double, 4, 4, 0, 4, 4>, …

but I guess that "assignment" covers not only the update of I4x4 but also the
evaluation of the whole expression.

Is there any obvious optimization advice that can be applied to improve
the performance?

Thank you!
