On Tue, Aug 28, 2012 at 5:14 PM, Christoph Hertzberg
<chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Actually, it would be nice to optionally support padding in Eigen as well,
> something like padding to the next multiple of the packet size.
> This would also have the benefit that e.g. col(i) expressions can actually
> return an aligned map.
Yes, for small objects that can offer a real speedup. We could either
extend Matrix, in which case I'd rather add an option bit flag, or add
a new class.
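
To make the padding idea concrete, here is a minimal sketch in plain C++
(no Eigen involved). It assumes column-major storage and a packet size
given in number of scalars; round_up_to_packet and PaddedMatrix are
hypothetical names for illustration, not existing Eigen API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: round n up to the next multiple of packet (packet > 0).
inline std::size_t round_up_to_packet(std::size_t n, std::size_t packet) {
    return (n + packet - 1) / packet * packet;
}

// Hypothetical column-major storage whose column stride is padded.
// Assumption: if the buffer itself starts on an aligned address, every
// column then starts on an aligned address too. std::vector does not
// guarantee packet alignment; a real implementation would need an
// aligned allocator.
struct PaddedMatrix {
    std::size_t rows, cols, stride;  // stride >= rows, multiple of packet
    std::vector<float> data;

    PaddedMatrix(std::size_t r, std::size_t c, std::size_t packet)
        : rows(r), cols(c), stride(round_up_to_packet(r, packet)),
          data(stride * c, 0.0f) {}

    // Offset, in scalars, of the first coefficient of column j.
    std::size_t col_offset(std::size_t j) const { return j * stride; }

    float& operator()(std::size_t i, std::size_t j) {
        return data[j * stride + i];
    }
};
```

With rows = 3 and a packet of 4 floats, the stride becomes 4, so each
column wastes one float but starts at a multiple of the packet size.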
> Hm, I don't think it's a good idea to optimize this automatically without
> warnings/special flags. I can't think of an example right now where it would
> do harm (of course, if v is a matrix as well, it's easy to find examples where
> the first is better than the second), but I would like to be able to
> distinguish between (A*B)*v and A*(B*v).
That's the usual remark, and the usual answer is:
(A*B).parenthesis() * v;
(or .group(), or whatever...)
Also note that any optimized math library like Eigen already
inserts arbitrary parentheses. For instance, a simple v.sum() with
vectorization is not computed as:
v0 + v1 + v2 + v3 + ...
but as
(v0 + v2 + v4 + ...) + (v1 + v3 + ...)
or even:
((v0+v1) + (v2+v3)) + ((v4+v5) + (v6+v7))
when unrolling occurs.
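
As a rough illustration of why the grouping matters, here is a small
sketch in plain C++ (not Eigen's actual reduction code). sum_sequential
and sum_two_lanes are hypothetical names; the second one mimics a
2-wide packet with two interleaved accumulators.

```cpp
#include <cstddef>
#include <vector>

// Plain left-to-right sum: ((v0 + v1) + v2) + ...
inline float sum_sequential(const std::vector<float>& v) {
    float s = 0.0f;
    for (float x : v) s += x;
    return s;
}

// Two interleaved accumulators, as a 2-wide packet would do:
// (v0 + v2 + v4 + ...) + (v1 + v3 + ...)
inline float sum_two_lanes(const std::vector<float>& v) {
    float even = 0.0f, odd = 0.0f;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        even += v[i];
        odd += v[i + 1];
    }
    if (i < v.size()) even += v[i];  // leftover element for odd sizes
    return even + odd;
}
```

For exactly representable inputs such as {1, 2, 3, 4, 5} both give 15.
For {1e8f, 1.0f, -1e8f, 1.0f}, assuming strict IEEE single-precision
evaluation (no fast-math, no extended-precision intermediates), the
sequential sum returns 1 while the two-lane sum returns 2, because
1e8f + 1.0f rounds back to 1e8f.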
Similar arbitrary parentheses occur inside matrix products, so
implicitly computing A*(B*v) instead of (A*B)*v should only
exceptionally be an issue.
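
For reference, the cost difference between the two groupings can be
sketched by counting scalar multiplications, assuming naive products
with A of size m x k, B of size k x n and v of size n x p (p = 1 for a
vector). The function names are hypothetical, and real timings also
depend on memory traffic and vectorization.

```cpp
#include <cstddef>

// Multiplications for (A*B)*v: first A*B (m*k*n), then (m x n) times v.
inline std::size_t mults_product_first(std::size_t m, std::size_t k,
                                       std::size_t n, std::size_t p = 1) {
    return m * k * n + m * n * p;
}

// Multiplications for A*(B*v): first B*v (k*n*p), then A times (k x p).
inline std::size_t mults_vector_first(std::size_t m, std::size_t k,
                                      std::size_t n, std::size_t p = 1) {
    return k * n * p + m * k * p;
}
```

With m = k = n = 100 and a vector v, (A*B)*v needs 1010000
multiplications against 20000 for A*(B*v). If v is itself a matrix the
picture can flip: with m = 1, k = 100, n = 1, p = 100, (A*B)*v needs 200
multiplications and A*(B*v) needs 20000.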