Re: [eigen] update


- *To*: eigen@xxxxxxxxxxxxxxxxxxx
- *Subject*: Re: [eigen] update
- *From*: Benoît Jacob <jacob@xxxxxxxxxxxxxxx>
- *Date*: Wed, 10 Oct 2007 08:47:36 +0200

On Monday 08 October 2007 09:54:34 Benoît Jacob wrote:
> 7) Optimization: I reversed the order of some loops (like the inner loop of
> matrix-matrix multiplication) and got a *huge* speedup.
> Here's the result I get with our benchmark (g++ 4.2.1, Intel Core1
> 1.66GHz):
>
> TVMET: 6.1 seconds
> Eigen2 with hand-unrolling of the matrix-product: 5.2 seconds
> Eigen2 with meta-unrolling: 5.5 seconds
> Eigen2 with reversed meta-unrolling: 3.4 seconds

Argh, I was mistaken... the "speedup" was only the consequence of a bug in my loop unrolling, which gave wrong results. I fixed that and improved the metaprograms a bit; the same test now runs in 5.2 seconds, which is the same performance as we got with manual loop unrolling.

The order of loops does matter, but it is the direct order that is faster (by 5-10%). Actually, the first implementation had (unwittingly) reversed loops, which explains its slightly lower speed (5.5s vs. 5.2s).

Oh, and I also ran the same test in Eigen1, using the overloaded operators. Result: 8.2 seconds. Better performance could have been achieved using Eigen1's C-style functions, but that API is ugly.

So the updated table is:

    Eigen1: 8.2s
    TVMET:  6.1s
    Eigen2: 5.2s

By the way, we now have fuzzy compares and random generators, so we can now code _real_ unit-tests, which should prevent that kind of bug from happening again.

Example of the API:

    cout << EiVector4d::random().normalized() << endl; // rand. unit vector
    cout << EiMatrix3d::random() << endl;
    cout << EiMatrixXd::random(8, 12) << endl; // 8 rows, 12 cols, dyn. size

Cheers,
Benoit

>
> Trying to understand this speedup I ran cachegrind (with only 100000
> repetitions) and found this difference:
> without reversing of loops:
> ==9840== D refs: 8,648,008 (6,065,012 rd + 2,582,996 wr)
> with reversing of loops:
> ==9834== D refs: 5,048,066 (2,765,055 rd + 2,283,011 wr)
>
> Anyway, Eigen2 is now almost twice faster than before -- when it already
> was faster than TVMET and Eigen1.
>
> Cheers,
> Benoit
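
For readers who want to see the kind of unit test the message has in mind, here is a minimal sketch in plain C++ that does not use the Eigen2 API at all. It assumes that "direct" and "reversed" refer to the direction in which the inner summation index runs, which the message does not spell out. All names (`Mat`, `productDirect`, `productReversed`, `isApprox`, `randomMat`, `loopOrdersAgree`) and the tolerance are hypothetical, chosen only for illustration. The idea is that a fuzzy comparison on random inputs would have flagged an unrolling bug that produces wrong results, however fast it runs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <random>

constexpr std::size_t N = 3;
using Mat = std::array<double, N * N>;  // row-major N x N matrix (illustrative type)

// Inner loop runs k = 0, 1, ..., N-1 (taken here to be the "direct" order).
Mat productDirect(const Mat& a, const Mat& b) {
    Mat c{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                sum += a[i * N + k] * b[k * N + j];
            c[i * N + j] = sum;
        }
    return c;
}

// Inner loop runs k = N-1, ..., 1, 0 (taken here to be the "reversed" order).
Mat productReversed(const Mat& a, const Mat& b) {
    Mat c{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t k = N; k-- > 0;)
                sum += a[i * N + k] * b[k * N + j];
            c[i * N + j] = sum;
        }
    return c;
}

// Fuzzy comparison: true when the squared distance between x and y is small
// relative to the smaller of their squared norms.
bool isApprox(const Mat& x, const Mat& y, double prec = 1e-12) {
    assert(prec >= 0.0);
    double diff2 = 0.0, x2 = 0.0, y2 = 0.0;
    for (std::size_t i = 0; i < N * N; ++i) {
        const double d = x[i] - y[i];
        diff2 += d * d;
        x2 += x[i] * x[i];
        y2 += y[i] * y[i];
    }
    return diff2 <= prec * prec * std::min(x2, y2);
}

// Matrix with entries drawn uniformly from [-1, 1).
Mat randomMat(std::mt19937& gen) {
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    Mat m{};
    for (double& v : m) v = dist(gen);
    return m;
}

// Unit-test style check: both loop orders must give (approximately) the
// same product on every random trial.
bool loopOrdersAgree(unsigned seed, int trials) {
    std::mt19937 gen(seed);
    for (int t = 0; t < trials; ++t) {
        const Mat a = randomMat(gen);
        const Mat b = randomMat(gen);
        if (!isApprox(productDirect(a, b), productReversed(a, b)))
            return false;
    }
    return true;
}
```

A test driver would call something like `loopOrdersAgree(42u, 1000)` and fail if it returns false. An exact `==` comparison would be too strict here, because changing the summation order can change the last bits of a floating-point result; that is why the comparison is fuzzy.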

