Re: [eigen] Optimization advice for a specific expression


Christoph Hertzberg writes:


> gets vectorized, whereas the following does not:
>   G.template leftCols<3>() * G.template topRightCorner<3,3>().transpose()
>     + w * w.template head<3>().transpose();
> OTOH, your version is not vectorizable (without making the
> vectorization logic extremely complicated), since G.block<3,3>() will
> not be accessed packet-wise.

Thanks a lot, Christoph.  This is very helpful.  Regarding those
assertions: do you have any rule of thumb to know what is getting
vectorized and what is not?

I quickly get swamped by all of those mul... and add... SSE instructions
in the assembler output, and cannot clearly see whether they are
performing scalar or vector operations.  Is it maybe a matter of
checking that mostly "packet" ops are used ("P" prefix)?
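
One rough way to tell the two apart in the assembler output, offered as a
sketch and not as an answer from this thread: SSE floating-point mnemonics
end in "ps"/"pd" for packed (vector) operations and in "ss"/"sd" for scalar
ones, and AVX encodings add a leading "v". Packed integer instructions are
the ones that carry a "p" prefix (paddd, pmulld). The helper below is
hypothetical, is not part of Eigen, and only recognizes the add/sub/mul/div
family mentioned above.

```cpp
#include <string>

// Hypothetical helper, not an Eigen API: classifies an x86 floating-point
// arithmetic mnemonic as packed (vector) or scalar by its suffix.
enum class OpKind { Packed, Scalar, Other };

OpKind classify_mnemonic(std::string m) {
    // AVX encodings carry a leading 'v' (vmulps, vaddsd); drop it.
    if (!m.empty() && m[0] == 'v') m.erase(0, 1);

    // Expect a 3-letter stem followed by a 2-letter suffix, e.g. "mul" + "ps".
    if (m.size() != 5) return OpKind::Other;
    const std::string stem = m.substr(0, 3);
    const std::string suffix = m.substr(3);

    // Assumption: only the basic arithmetic stems are of interest here.
    const char* stems[] = {"add", "sub", "mul", "div"};
    bool known = false;
    for (const char* s : stems) {
        if (stem == s) known = true;
    }
    if (!known) return OpKind::Other;

    if (suffix == "ps" || suffix == "pd") return OpKind::Packed;  // packed single/double
    if (suffix == "ss" || suffix == "sd") return OpKind::Scalar;  // scalar single/double
    return OpKind::Other;
}
```

With this sketch, classify_mnemonic("mulps") gives OpKind::Packed and
classify_mnemonic("vaddsd") gives OpKind::Scalar. A packed instruction in
the output shows only that some vector code was emitted, which may also come
from the compiler's own auto-vectorizer and not from Eigen's packet path.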

