Re: [eigen] Calls to copyPacketByOuterInner: when do they happen?


On 25.11.2013 16:25, Martin Felis wrote:
> benchmarking it turns out that my changes provide a nice speedup (2.0X)
> in some algorithms, while in others it is unpleasantly slower (0.5X).

> Profiling (using google perftools) suggests that the code spends ~20%
> more time in the copyPacketByOuterInner when using the newer joint models.

> The question is now: what operations in Eigen result in calling of the
> function and can they be avoided?

It is called whenever an expression is evaluated in two nested loops, where the outer loop runs over the outer index and the inner loop over the inner index (the loops may or may not be unrolled).
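To illustrate the loop structure described above, here is a minimal sketch in plain C++. It is not Eigen's actual implementation: the names copy_packet_by_outer_inner, copy_all, kPacketSize and g_packet_copies are hypothetical, the "packet" is simply a group of 4 floats rather than a real SIMD register, and the storage layout (element (outer, inner) at outer * inner_size + inner) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD packet: 4 floats copied as one unit.
constexpr std::size_t kPacketSize = 4;

// Counts how often the per-packet copy runs (for illustration only).
static std::size_t g_packet_copies = 0;

// Copies the packet that starts at (outer, inner) from src to dst.
// Assumed layout: element (outer, inner) lives at outer * inner_size + inner.
inline void copy_packet_by_outer_inner(const std::vector<float>& src,
                                       std::vector<float>& dst,
                                       std::size_t outer, std::size_t inner,
                                       std::size_t inner_size) {
    const std::size_t offset = outer * inner_size + inner;
    for (std::size_t k = 0; k < kPacketSize; ++k) {
        dst[offset + k] = src[offset + k];
    }
    ++g_packet_copies;
}

// Two nested loops: the outer loop walks the outer index, the inner loop
// walks the inner index one packet at a time.
inline void copy_all(const std::vector<float>& src, std::vector<float>& dst,
                     std::size_t outer_size, std::size_t inner_size) {
    // This sketch only handles inner sizes that are a multiple of the packet size.
    assert(inner_size % kPacketSize == 0);
    assert(src.size() == outer_size * inner_size);
    assert(dst.size() == src.size());
    for (std::size_t outer = 0; outer < outer_size; ++outer) {
        for (std::size_t inner = 0; inner < inner_size; inner += kPacketSize) {
            copy_packet_by_outer_inner(src, dst, outer, inner, inner_size);
        }
    }
}
```

In this sketch the per-packet function runs once per packet of the destination, so the number of calls grows with the number of coefficients that get evaluated. Under that assumption, evaluating an extra temporary of the same size would roughly add another full set of calls.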

Does each call take 20% more time, or is it called 20% more often?
Neither would account for a 50% drop in performance.

I doubt that you want to "avoid" calling copyPacketByOuterInner, since it is usually the fastest method of copying/evaluating. I guess the problem is rather that you do some extra computations in some cases.

> The code in question can be found here:

I don't really understand what's the purpose of
in your code. This will always be true if you have Eigen included -- or does your code run with alternate matrix libraries? Do you need the .eval() calls where they are used? Unnecessary evals could account for some unnecessary copy operations.

Could you link to a diff between the two versions which you profiled? Maybe something obvious pops out.

Finally, I'm sorry for asking the obvious question:
Do you compile with optimization enabled (Release or RelWithDebInfo build type in CMake)? Otherwise Eigen spends a lot of time in methods that should be completely unrolled and inlined.


Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
