Re: [eigen] Re: Eigen 2 design
On Monday 14 May 2007 21:47:15 Cyrille Berger wrote:
> Hello,
>
> > > a) the matrix array should be aligned to 16-byte boundary (just
> > > add "__attribute__ ((aligned (16)))" ), SSE has unaligned and aligned
> > > loads/stores, and it seems that the unaligned stuff can be slow
>
> be careful, it's not portable.

OK, then we can have a macro enclosing that in appropriate ifdefs, yada, yada... Anyway, don't worry, I care very much about Eigen portability. Thanks for pointing that out.

> > Yeah but runtime checks mean a nasty overhead due to checking in every
> > function whether SSE is present, right? I mean, if a function has two
> > codepaths, there has to be an "if" to determine which codepath to follow.
> > That seems to be a no-go to me, so I'd favor a compile-time approach.
>
> then you can consider to try if the -ftree-vectorize doesn't give a speed
> boost without writting your own code for it.

I didn't know about -ftree-vectorize, and after reading the gcc man page I still don't understand what it does. I guess it tells the compiler, "use SSE or any similar technology when it's present", right? If yes, that's interesting, and yes, before doing low-level optimization it's good to check whether the compiler is already able to do that.

However, we're a template library, so we don't control compilation. So when we find out that certain compiler options speed up the code, the best we can do is set up a documentation page explaining that. For instance, on the Eigen Mainpage I explain that -DNDEBUG is important for performance. I'm not for pragmas.

Cheers,
Benoit
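
To make the two ideas above concrete (an alignment macro wrapped in ifdefs, and a compile-time rather than runtime choice of code path), here is a minimal sketch. It is not Eigen code: the names EXAMPLE_ALIGN16, EXAMPLE_HAS_ALIGN16, EXAMPLE_USE_SSE, ExampleVec4f and example_add are all hypothetical, and only the GCC-style and MSVC-style alignment spellings are assumed to be available. On any other compiler the macro expands to nothing and the plain loop is used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portability macro: request 16-byte alignment where the
// compiler is known to support it, and expand to nothing elsewhere.
#if defined(__GNUC__)
#  define EXAMPLE_ALIGN16 __attribute__((aligned(16)))
#  define EXAMPLE_HAS_ALIGN16 1
#elif defined(_MSC_VER)
#  define EXAMPLE_ALIGN16 __declspec(align(16))
#  define EXAMPLE_HAS_ALIGN16 1
#else
#  define EXAMPLE_ALIGN16
#  define EXAMPLE_HAS_ALIGN16 0
#endif

// Compile-time selection: the SSE path is only compiled in when the
// compiler targets SSE (__SSE__ is predefined by GCC-compatible
// compilers) and the alignment request above is known to be honoured.
#if EXAMPLE_HAS_ALIGN16 && defined(__SSE__)
#  include <xmmintrin.h>
#  define EXAMPLE_USE_SSE 1
#else
#  define EXAMPLE_USE_SSE 0
#endif

// Hypothetical fixed-size vector whose storage carries the alignment.
struct ExampleVec4f {
    EXAMPLE_ALIGN16 float data[4];
};

// Adds two vectors. There is no runtime "if": the preprocessor keeps
// exactly one of the two bodies.
inline void example_add(const ExampleVec4f& a, const ExampleVec4f& b,
                        ExampleVec4f& out)
{
#if EXAMPLE_USE_SSE
    // Aligned loads/stores require 16-byte aligned addresses.
    assert(reinterpret_cast<std::uintptr_t>(a.data) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b.data) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(out.data) % 16 == 0);
    _mm_store_ps(out.data,
                 _mm_add_ps(_mm_load_ps(a.data), _mm_load_ps(b.data)));
#else
    // Portable fallback: a plain loop, which is the kind of code a
    // compiler's auto-vectorizer may be able to handle on its own.
    for (int i = 0; i < 4; ++i)
        out.data[i] = a.data[i] + b.data[i];
#endif
}
```

As far as I know, -ftree-vectorize turns on GCC's loop auto-vectorizer, while the instruction set it may use is chosen by the target options (for example -msse2), so the fallback loop is where such flags would matter. Since a header-only library cannot set these flags itself, this fits the point above that documenting them is the realistic option.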