[eigen] Re: Eigen 2 design
On Thursday 26 April 2007 08:42, Benoît Jacob wrote:
> It's true that in eigen-basic, expression templates are the only way
> to achieve fast arithmetic operators, since copy-on-write is not an
> option there (there must be no dynamic memory allocation in
> eigen-basic). Without expression templates, we'd have to do as in
> Eigen1, namely make it clear in the documentation that the arithmetic
> operators are slow and provide alternative C-style methods doing the
> same thing faster (like multiply() vs. operator*).
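For illustration, the expression-template idea from the quoted text could look roughly like this. It is only a minimal sketch under my own assumptions: the names Expr, Sum and Vec are made up here and are not the Eigen or tvmet API, and it covers only element-wise addition of fixed-size float vectors.

```cpp
#include <cassert>
#include <cstddef>

// CRTP base so that operator+ below only matches our own expression types.
template <typename Derived>
struct Expr {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Lazy node for "lhs + rhs": stores references and computes one element on demand.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R> > {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Hypothetical fixed-size vector with in-place storage (no dynamic allocation).
template <std::size_t N>
struct Vec : Expr<Vec<N> > {
    float data[N];
    Vec() : data() {}  // zero-initialize the elements

    float operator[](std::size_t i) const { assert(i < N); return data[i]; }
    float& operator[](std::size_t i) { assert(i < N); return data[i]; }

    // Evaluate any expression element by element, without temporary vectors.
    template <typename E>
    Vec& operator=(const Expr<E>& expr) {
        const E& e = expr.self();
        for (std::size_t i = 0; i < N; ++i) data[i] = e[i];
        return *this;
    }
};

// Builds the expression node; no arithmetic happens until assignment.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With these definitions, "r = a + b + c;" on Vec<3> objects runs a single loop that computes a[i] + b[i] + c[i] for each element, with no intermediate Vec being created.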
Hi! A couple of comments:
* About the expression templates: did you already check out tvmet? It
seems to be quite close to what you have in mind for eigen-basic.
http://tvmet.sourceforge.net/introduction.html
* Some kind of performance test suite would be very useful, especially
if we add optimizations like SSE support and expression templates. It's
very easy to actually make things much slower when changing something.
Creating a good suite is probably very tricky.
* I'm planning on adding SSE support via the SSE intrinsics API. This
should provide performance gains while still being reasonably portable.
I'm not sure whether this can be easily done for arbitrary NxM matrices
and vectors -- sizes that are multiples of 4 are the sweet spot. A few
more things:
a) the matrix array should be aligned to a 16-byte boundary (just
add "__attribute__ ((aligned (16)))"). SSE has both aligned and
unaligned loads/stores, and the unaligned ones seem to be slow.
b) it should be easy to add specialized cases, like <float, 4, 4> etc.
* How should we enable the SSE support? With #ifdef __SSE__ at compile
time, or with runtime checks? Runtime checks would let even generic
distro packages use SSE.
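To make the SSE points above concrete, here is a rough sketch of a specialized <float, 4, 4> addition with 16-byte-aligned storage and an #ifdef __SSE__ fallback. The names Matrix4f, add4x4 and cpu_has_sse are hypothetical, I use the standard alignas(16) in place of the GCC attribute, and the runtime query assumes the GCC/Clang builtin __builtin_cpu_supports on x86.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#ifdef __SSE__
#include <xmmintrin.h>  // SSE intrinsics: _mm_load_ps, _mm_add_ps, _mm_store_ps
#endif

// Hypothetical specialized case for <float, 4, 4>; storage is 16-byte aligned.
struct Matrix4f {
    alignas(16) float data[16];
};

// Runtime query (assumption: GCC/Clang builtin, x86 only; false elsewhere).
inline bool cpu_has_sse() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse") != 0;
#else
    return false;
#endif
}

// out = a + b, using aligned SSE loads/stores when compiled with SSE enabled.
inline void add4x4(const Matrix4f& a, const Matrix4f& b, Matrix4f& out) {
    // Aligned loads/stores require 16-byte-aligned addresses.
    assert(reinterpret_cast<std::uintptr_t>(a.data) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b.data) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(out.data) % 16 == 0);
#ifdef __SSE__
    for (std::size_t i = 0; i < 16; i += 4) {  // four floats per 128-bit register
        __m128 va = _mm_load_ps(a.data + i);
        __m128 vb = _mm_load_ps(b.data + i);
        _mm_store_ps(out.data + i, _mm_add_ps(va, vb));
    }
#else
    for (std::size_t i = 0; i < 16; ++i) out.data[i] = a.data[i] + b.data[i];
#endif
}
```

Note that add4x4 picks its path at compile time only. A real runtime switch would additionally need the SSE kernel built in its own translation unit with SSE enabled, and a call to something like cpu_has_sse() to choose between that kernel and the plain loop.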
Regards,
Tommi Rantala