Re: [eigen] update

On Monday 08 October 2007 09:54:34 Benoît Jacob wrote:
> 7) Optimization: I reversed the order of some loops (like the inner loop of
> matrix-matrix multiplication) and got a *huge* speedup.
> Here's the result I get with our benchmark (g++ 4.2.1, Intel Core1
> 1.66GHz):
>
> TVMET: 6.1 seconds
> Eigen2 with hand-unrolling of the matrix-product: 5.2 seconds
> Eigen2 with meta-unrolling: 5.5 seconds
> Eigen2 with reversed meta-unrolling: 3.4 seconds

Argh, I was mistaken... the "speedup" was only the consequence of a bug in my 
loop unrolling, giving wrong results.

I fixed that and improved the metaprograms a bit; the same test now runs in 
5.2 seconds, matching the performance we got with manual loop unrolling. The 
order of the loops does matter, but it is the direct order that is faster (by 
5-10%). In fact, the first implementation had (unwittingly) reversed loops, 
which explains its slightly lower speed (5.5s vs. 5.2s).
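
To make the "direct" vs. "reversed" distinction concrete, here is a minimal 
sketch of a meta-unrolled inner loop for a fixed-size matrix product. It is 
not the actual Eigen2 code: the Mat type, both unroller templates and the 
product() function are hypothetical names made up for illustration, and 
reading "reversed" as "visiting the inner index from last to first" is only 
one possible interpretation. Both orders must of course give the same result; 
only the sequence of memory accesses differs.

```cpp
#include <cstddef>

// Hypothetical fixed-size, row-major matrix (illustration only, not Eigen2's class).
template <int Rows, int Cols>
struct Mat {
    double data[Rows * Cols];
    double& operator()(int r, int c) { return data[r * Cols + c]; }
    double operator()(int r, int c) const { return data[r * Cols + c]; }
};

// Direct order: recurse first, then add term K, so terms are
// accumulated as k = 0, 1, ..., K.
template <int K>
struct DirectUnroller {
    template <int R, int N, int C>
    static void run(const Mat<R, N>& a, const Mat<N, C>& b, int i, int j, double& res) {
        DirectUnroller<K - 1>::run(a, b, i, j, res);
        res += a(i, K) * b(K, j);
    }
};

template <>
struct DirectUnroller<0> {
    template <int R, int N, int C>
    static void run(const Mat<R, N>& a, const Mat<N, C>& b, int i, int j, double& res) {
        res += a(i, 0) * b(0, j);
    }
};

// Reversed order: add term K first, then recurse, so terms are
// accumulated as k = K, K-1, ..., 0.
template <int K>
struct ReversedUnroller {
    template <int R, int N, int C>
    static void run(const Mat<R, N>& a, const Mat<N, C>& b, int i, int j, double& res) {
        res += a(i, K) * b(K, j);
        ReversedUnroller<K - 1>::run(a, b, i, j, res);
    }
};

template <>
struct ReversedUnroller<0> {
    template <int R, int N, int C>
    static void run(const Mat<R, N>& a, const Mat<N, C>& b, int i, int j, double& res) {
        res += a(i, 0) * b(0, j);
    }
};

// Matrix product whose inner loop over k is unrolled at compile time
// by the chosen unroller; the outer loops stay ordinary loops.
template <template <int> class Unroller, int R, int N, int C>
Mat<R, C> product(const Mat<R, N>& a, const Mat<N, C>& b) {
    Mat<R, C> out;
    for (int i = 0; i < R; ++i) {
        for (int j = 0; j < C; ++j) {
            double res = 0.0;
            Unroller<N - 1>::run(a, b, i, j, res);
            out(i, j) = res;
        }
    }
    return out;
}
```

Calling product<DirectUnroller>(a, b) and product<ReversedUnroller>(a, b) on 
the same inputs and comparing the results is exactly the kind of check that 
would have caught the unrolling bug described above.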

Oh, and I also ran the same test with Eigen1, using the overloaded operators. 
Result: 8.2 seconds. Eigen1's C-style functions would have performed better, 
but that API is ugly. So the updated table is:

Eigen1: 8.2s
TVMET:  6.1s
Eigen2: 5.2s

By the way, we now have fuzzy compares and random generators, so we can 
write _real_ unit tests, which should prevent that kind of bug from happening 
again.
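
As a rough illustration of how fuzzy compares and random inputs combine into 
a unit test, here is a small standalone sketch. It does not use the Eigen2 
API: isApprox, randomVector, norm and normalized are hypothetical helpers 
written for this example, and the relative-tolerance formula and default 
precision are assumptions, not necessarily what Eigen2 implements.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical fuzzy compare: a and b are "approximately equal" when their
// difference is small relative to the smaller of their magnitudes.
inline bool isApprox(double a, double b, double prec = 1e-12) {
    return std::abs(a - b) <= std::min(std::abs(a), std::abs(b)) * prec;
}

// Hypothetical random generator: n entries drawn uniformly from [-1, 1).
inline std::vector<double> randomVector(std::size_t n, std::mt19937& gen) {
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    std::vector<double> v(n);
    for (double& x : v) x = dist(gen);
    return v;
}

// Euclidean norm of a vector.
inline double norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += x * x;
    return std::sqrt(sum);
}

// Returns a copy of v scaled to unit length (assumes v is nonzero).
inline std::vector<double> normalized(std::vector<double> v) {
    const double n = norm(v);
    for (double& x : v) x /= n;
    return v;
}
```

A test would then draw a random vector, normalize it, and check with 
isApprox that its norm is 1. An exact == comparison would be too strict 
here, because rounding errors can leave the norm a few ulps away from 1.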

Example of the API:

cout << EiVector4d::random().normalized() << endl; // rand. unit vector
cout << EiMatrix3d::random() << endl;
cout << EiMatrixXd::random(8, 12) << endl; // 8 rows, 12 cols, dyn. size

Cheers,
Benoit

>
> Trying to understand this speedup I ran cachegrind (with only 100000
> repetitions) and found this difference:
> without reversing of loops:
> ==9840== D   refs:       8,648,008  (6,065,012 rd + 2,582,996 wr)
> with reversing of loops:
> ==9834== D   refs:       5,048,066  (2,765,055 rd + 2,283,011 wr)
>
> Anyway, Eigen2 is now almost twice as fast as before -- when it already
> was faster than TVMET and Eigen1.
>
> Cheers,
> Benoit

