Re: [eigen] Performance colwise matrix mult vs. raw loop

On 2018-01-15 11:54, Gael Guennebaud wrote:
> the two codes are not equivalent because one applies the transformation
> in-place, whereas the other one needs to allocate a temporary. [...]

Thanks for the explanation.

> [...] the product cannot be carried out in-place because of
> aliasing issues and a temporary has to be created. Of course, as you
> realized, if we evaluate the result one column at a time, then instead of
> allocating a whole 3xN temporary, it is enough to allocate a 3x1 temporary
> vector, and since its size is known at compile time and is very
> small, it can be "allocated" on the stack and even optimized away by the
> compiler. Unfortunately there is no way for Eigen to figure this out,
> especially at compile-time. [...]

I understand you cannot determine any aliasing issues at compile time.
But I fail to see why there is no way to determine the size of the
temporary. Do you mean this cannot be done in Eigen as it is now or this
can generally not be determined at compile time?

> In your case, we would need some kind of colwise/rowwise noalias [...].
> To be honest I've never thought about that possibility, but [...] this
> might be worth the effort.

Thanks for considering this possibility.

As I've already said, this issue was primarily surprising to me because I
thought colwise/rowwise would essentially boil down to the loop code
I've written manually. Your explanation shed some light on this issue,
but it also clearly demonstrated that I need to understand Eigen
internals better when I care about runtime performance.

Best regards,
