Re: [eigen] Transform class performance and inconsistencies

OK, thanks for confirming this. The other strange case that you can see in my benchmark is:

trans = trans1 * trans
float    Isometry     AutoAlign   0.0260s
float    Isometry     DontAlign   0.0207s
double   Isometry     AutoAlign   0.0202s
double   Isometry     DontAlign   0.0207s
float    Projective   AutoAlign   0.0109s
float    Projective   DontAlign   0.0386s
double   Projective   AutoAlign   0.0162s
double   Projective   DontAlign   0.0407s

In the float case for Isometry (which should be very common), AutoAlign is slower than DontAlign.

Both of them are also much slower than Projective with AutoAlign. I could see in the code that the Isometry product has a separate code path that uses only the affine part. In this case I think that prevents it from using vectorization, which would be much faster.
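To make the trade-off concrete, here is a minimal sketch in plain C++ (not Eigen's actual implementation) of the two ways to multiply two isometries stored as 4x4 matrices. The names Mat4, mul_full, mul_affine, make_isometry_z and max_abs_diff are hypothetical and exist only for this illustration. The affine-only version does fewer multiplications (36 instead of 64), but it works on 3-element pieces, which may map less cleanly onto 4-wide SIMD registers than the regular 4x4 product.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical row-major 4x4 matrix type, used only for this illustration.
using Mat4 = std::array<std::array<float, 4>, 4>;

// Full 4x4 product: 64 multiplications with a completely regular pattern.
Mat4 mul_full(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Affine-only product: assumes the last row of both inputs is (0, 0, 0, 1).
// Only the top 3x4 block is computed, using 36 multiplications.
Mat4 mul_affine(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 4; ++j) {
            float s = 0.0f;
            for (int k = 0; k < 3; ++k) s += a[i][k] * b[k][j];
            r[i][j] = s;
        }
        r[i][3] += a[i][3];  // add the translation of the left operand
    }
    r[3] = {0.0f, 0.0f, 0.0f, 1.0f};  // last row is known in advance
    return r;
}

// Builds an isometry: rotation about the z axis plus a translation.
Mat4 make_isometry_z(float angle, float tx, float ty, float tz) {
    const float c = std::cos(angle), s = std::sin(angle);
    Mat4 m{};
    m[0] = {c, -s, 0.0f, tx};
    m[1] = {s, c, 0.0f, ty};
    m[2] = {0.0f, 0.0f, 1.0f, tz};
    m[3] = {0.0f, 0.0f, 0.0f, 1.0f};
    return m;
}

// Largest absolute coefficient difference between two matrices.
float max_abs_diff(const Mat4& x, const Mat4& y) {
    float d = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            d = std::max(d, std::fabs(x[i][j] - y[i][j]));
    return d;
}
```

For two isometries a and b, mul_full(a, b) and mul_affine(a, b) give the same matrix, so the choice between them is purely a performance question. Which one wins depends on whether the compiler or library can vectorize the chosen path.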

cheers,

Jakob

On December 17, 2012 at 11:35 AM Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

ok, thank you.

I implemented my own too, and I get the following results ('u' stands for unaligned):

Float:
Vec3 /Iso3 : 0.000645647
Vec3u/Iso3u: 0.000645586
Vec4 /Iso3 : 0.00126092
Vec4u/Iso3u: 0.000688231
Vec4 /Mat4 : 0.000391356
Vec4u/Mat4u: 0.000981292

Double:
Vec3 /Iso3 : 0.000649383
Vec3u/Iso3u: 0.000649382
Vec4 /Iso3 : 0.00126092
Vec4u/Iso3u: 0.000689219
Vec4 /Mat4 : 0.000475642
Vec4u/Mat4u: 0.000979767

obtained with gcc 4.6, -O2 -DNDEBUG.

The only strange result is the "Vec4 /Iso3" case, which is unexpectedly twice as slow as "Vec4u/Iso3u". Looking at the assembly, the only difference between the two versions is that in the "Vec4 /Iso3" case the result is first written into a temporary (one coefficient at a time), which is then copied to the true result location using 2 movaps. Clearly, the compiler should be able to remove this extra copy, as it does in the "Vec4u/Iso3u" case, but here it does not, and I do not know why.
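As a reminder of what "aligned" and "unaligned" mean at the type level, here is a small sketch in plain C++. The structs below are hypothetical stand-ins, not Eigen's types. A 16-byte aligned 4-float vector can be loaded and stored with movaps, while an unaligned one only guarantees the alignment of float, so the compiler has to use unaligned moves or scalar code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an aligned 4-float vector (AutoAlign-like).
struct alignas(16) Vec4Aligned { float v[4]; };

// Hypothetical stand-in for an unaligned 4-float vector (DontAlign-like).
struct Vec4Unaligned { float v[4]; };

// Embedding each type after a single float shows the layout difference:
// the aligned member is pushed to the next 16-byte boundary, while the
// unaligned member follows the float directly.
struct HolderAligned   { float pad; Vec4Aligned   vec; };
struct HolderUnaligned { float pad; Vec4Unaligned vec; };

// True if p sits on a multiple of `boundary` bytes.
inline bool is_aligned(const void* p, std::size_t boundary) {
    return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}
```

Here offsetof(HolderAligned, vec) is 16, while offsetof(HolderUnaligned, vec) equals sizeof(float), so only the first layout guarantees that an aligned 16-byte store to vec is valid.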

cheers,

gael

On Mon, Dec 17, 2012 at 10:07 AM, Jakob Schwendner <jakob.schwendner@xxxxxxx> wrote:

So I cleaned up my benchmarking a little and added a test to the bench folder, which should be consistent with the rest of the benchmarks there:

https://bitbucket.org/jschwendner/eigen/src/783b9386d33022047b60854c475c00dd9a7b7b0e/bench/benchGeometry.cpp?at=default

I ran the benchmark on two different systems:

http://pastebin.com/AFe6xG0t

I still get better results for DontAlign in situations where I wouldn't expect it. Might be my benchmarking, though...
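One way to reduce noise when small differences like these are in doubt is to time each case several times and keep the best run. The following is a minimal sketch of that idea using only the C++ standard library. The helper name best_of is hypothetical and is not taken from the linked benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Runs `body` `reps` times per try, repeats that `tries` times, and
// returns the fastest try in seconds. Taking the minimum filters out
// runs that were disturbed by other activity on the machine.
template <typename F>
double best_of(int tries, int reps, F&& body) {
    double best = std::numeric_limits<double>::infinity();
    for (int t = 0; t < tries; ++t) {
        const auto start = std::chrono::steady_clock::now();
        for (int r = 0; r < reps; ++r) body();
        const std::chrono::duration<double> dt =
            std::chrono::steady_clock::now() - start;
        best = std::min(best, dt.count());
    }
    return best;
}
```

The timed body should write its result somewhere the compiler cannot discard (for example a volatile variable), otherwise the optimizer may remove the work being measured.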

cheers,

Jakob

--
Jakob Schwendner, M.Sc.
Researcher

DFKI Bremen
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Phone: +49 (0)421 17845-4120
Fax: +49 (0)421 17845-4150
E-Mail: jakob.schwendner@xxxxxxx
