Re: [eigen] Re: Eigen 2 design
Hello,
> > a) the matrix array should be aligned to 16-byte boundary (just
> > add "__attribute__ ((aligned (16)))" ), SSE has unaligned and aligned
> > loads/stores, and it seems that the unaligned stuff can be slow
Be careful: "__attribute__" is a GCC extension, so it's not portable.
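
To illustrate the portability concern, here is a minimal sketch of how the alignment request could be hidden behind a macro. The names EXAMPLE_ALIGN16, Matrix4f and isAligned16 are made up for this example and are not part of Eigen. On a compiler that matches neither branch the macro expands to nothing, so the array is not guaranteed to be 16-byte aligned there.

```cpp
#include <cstdint>

// Hypothetical macro (not an Eigen name): picks the compiler-specific
// spelling of "align this declaration to 16 bytes".
#if defined(__GNUC__)
#  define EXAMPLE_ALIGN16 __attribute__((aligned(16)))
#elif defined(_MSC_VER)
#  define EXAMPLE_ALIGN16 __declspec(align(16))
#else
#  define EXAMPLE_ALIGN16 /* unknown compiler: no alignment guarantee */
#endif

// Hypothetical fixed-size 4x4 float matrix whose storage is aligned
// for aligned SSE loads/stores (when the macro above is non-empty).
struct Matrix4f {
    EXAMPLE_ALIGN16 float data[16];
};

// Returns true if the pointer sits on a 16-byte boundary.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With GCC, isAligned16(m.data) should hold for any "Matrix4f m;" declared on the stack or as a global. Memory obtained from a general-purpose allocator is a separate question that this sketch does not address.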
> > b) it should be easy to add specialized cases, like <float, 4, 4> etc.
> >
> > * How should we enable the SSE support? #ifdef __SSE__ or runtime
> > checks? Runtime checks could enable even the generic distro packages to
> > have SSE support.
>
> Yeah but runtime checks mean a nasty overhead due to checking in every
> function whether SSE is present, right? I mean, if a function has two
> codepaths, there has to be an "if" to determine which codepath to follow.
> That seems to be a no-go to me, so I'd favor a compile-time approach.
Then you could first check whether -ftree-vectorize already gives a speed
boost, without writing your own SSE code.
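
As a rough sketch of what I mean: the loop below is plain scalar code with no intrinsics, which is the kind of loop GCC's vectorizer is meant to handle. The function names are made up for this example, and "__restrict" is a compiler extension (not standard C++) used here to promise that the arrays do not overlap. Whether the loop actually gets vectorized depends on the GCC version and flags, so it needs to be measured.

```cpp
#include <cstddef>

// Hypothetical example function: c[i] = a[i] + b[i] for i in [0, n).
// Plain scalar code; vectorization, if any, is left to the compiler.
void addArrays(const float* __restrict a,
               const float* __restrict b,
               float* __restrict c,
               std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Compile-time check, as in the "#ifdef __SSE__" proposal: reports
// whether this translation unit was built with SSE enabled.
// There is no runtime branch on CPU features here.
bool builtWithSSE() {
#ifdef __SSE__
    return true;
#else
    return false;
#endif
}
```

It could be built with something like "g++ -O2 -ftree-vectorize -msse example.cpp" and timed against the same build without -ftree-vectorize.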
--
Cyrille Berger