Re: [eigen] Re: Eigen 2 design



On Monday 14 May 2007 21:47:15 Cyrille Berger wrote:
> Hello,
>
> > > a) the matrix array should be aligned to 16-byte boundary (just
> > > add "__attribute__ ((aligned (16)))" ), SSE has unaligned and aligned
> > > loads/stores, and it seems that the unaligned stuff can be slow
>
> be careful, it's not portable.

OK, then we can wrap that in a macro guarded by the appropriate ifdefs, yada, 
yada... Anyway, don't worry, I care very much about Eigen's portability. Thanks 
for pointing that out.
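
To make the macro idea concrete, here is a rough sketch of what such a wrapper might look like. The names MY_ALIGN_16, Vec4f and is_aligned_16 are made up for illustration and are not taken from Eigen; the fallback branch simply gives up on alignment for compilers it does not recognize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// MY_ALIGN_16 is a hypothetical macro name used only for this sketch.
#if defined(__GNUC__)
  // GCC and compatible compilers (e.g. clang) understand this attribute.
  #define MY_ALIGN_16 __attribute__((aligned(16)))
#elif defined(_MSC_VER)
  // MSVC spells the same request differently.
  #define MY_ALIGN_16 __declspec(align(16))
#else
  // Unknown compiler: expand to nothing and accept default alignment.
  #define MY_ALIGN_16
#endif

// Hypothetical fixed-size vector whose storage should start on a
// 16-byte boundary so that aligned SSE loads/stores could be used.
struct Vec4f {
    MY_ALIGN_16 float data[4];
};

// Returns true if the pointer is a multiple of 16 bytes.
inline bool is_aligned_16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Small self-check: on a compiler that honors the macro, the array
// inside a local Vec4f starts on a 16-byte boundary.
inline void self_check() {
    Vec4f v;
    assert(is_aligned_16(v.data));
}
```

Code that relies on the alignment would still have to cope with the fallback case, for example by using unaligned loads when the macro expands to nothing.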

> > Yeah but runtime checks mean a nasty overhead due to checking in every
> > function whether SSE is present, right? I mean, if a function has two
> > codepaths, there has to be an "if" to determine which codepath to follow.
> > That seems to be a no-go to me, so I'd favor a compile-time approach.
>
> then you could check whether -ftree-vectorize gives a speed
> boost without writing your own code for it.

I didn't know about -ftree-vectorize, and after reading the gcc man page I 
still don't understand what it does. I guess it tells the compiler, "use SSE 
or any similar technology when it's present", right?  If yes, that's 
interesting, and yes, before doing low-level optimization it's good to check 
whether the compiler can already do that by itself.
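
For reference, -ftree-vectorize turns on GCC's loop auto-vectorizer, which tries to rewrite plain loops using SIMD instructions; which instruction set it may use depends on the target options (for example -msse2). A minimal sketch of the kind of loop it is designed for follows. The function add_arrays is a hypothetical helper, not part of Eigen, and whether a given loop actually gets vectorized depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// add_arrays is a hypothetical helper used only for this sketch.
// __restrict (a compiler extension accepted by GCC, clang and MSVC)
// promises that the three arrays do not overlap, which makes it
// easier for the vectorizer to prove the loop is safe to transform.
void add_arrays(const float* __restrict a,
                const float* __restrict b,
                float* __restrict out,
                std::size_t n) {
    // A simple element-wise loop with no dependencies between
    // iterations: a typical candidate for auto-vectorization.
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

One way to try it would be to compile the same file with and without the flag, e.g. "g++ -O2 -ftree-vectorize -msse2 file.cpp", and compare timings.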

However, we're a template library, so we don't control compilation. So when we 
find out that certain compiler options speed up the code, the best we can do 
is set up a documentation page explaining that. For instance on the Eigen 
Mainpage I explain that -DNDEBUG is important for performance. I'm not in 
favor of pragmas.
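
As an illustration of why -DNDEBUG matters, here is a small sketch. The accessor checked_get is hypothetical and is not Eigen's API; it only shows the standard behaviour of assert(), which expands to nothing when NDEBUG is defined, so the range check costs nothing in a release build.

```cpp
#include <cassert>
#include <cstddef>

// checked_get is a hypothetical accessor used only for this sketch.
inline float checked_get(const float* data, std::size_t size, std::size_t i) {
    // In a debug build this aborts on an out-of-range index.
    // When compiled with -DNDEBUG the whole check is removed.
    assert(i < size && "index out of range");
    return data[i];
}
```

A user would build with something like "g++ -O2 -DNDEBUG file.cpp" to drop the checks, which is the kind of advice that can only live in the documentation of a template library.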

Cheers,
Benoit



