Re: [eigen] Transform class performance and inconsistencies


OK, thanks for confirming this. The other strange case that you can see in my benchmark is:
trans = trans1 * trans
Scalar  Mode        Options    Time
float   Isometry    AutoAlign  0.0260s
float   Isometry    DontAlign  0.0207s
double  Isometry    AutoAlign  0.0202s
double  Isometry    DontAlign  0.0207s
float   Projective  AutoAlign  0.0109s
float   Projective  DontAlign  0.0386s
double  Projective  AutoAlign  0.0162s
double  Projective  DontAlign  0.0407s
In the float case for Isometry (which should be very common), AutoAlign is slower than DontAlign.
Both are also much slower than Projective with AutoAlign. I could see in the code that the Isometry product
has a separate code path that uses just the affine part; in this case, though, I think it prevents vectorization,
which would be much faster.
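
To make the two code paths concrete, here is a minimal sketch in plain C++ (no Eigen) of a full 4x4 product next to an affine-only product. It is not Eigen's implementation: Mat4, at, makeIsometryZ, mulFull, mulAffine and nearlyEqual are hypothetical names chosen for illustration. The affine version assumes the last row of both operands is [0 0 0 1] and needs 36 multiplies instead of 64, but its loops are less regular, which is one plausible reason a compiler or library might not vectorize it as readily.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical 4x4 column-major matrix; a stand-in for illustration, not an Eigen type.
using Mat4 = std::array<float, 16>;

inline float& at(Mat4& m, std::size_t row, std::size_t col) { return m[col * 4 + row]; }
inline float at(const Mat4& m, std::size_t row, std::size_t col) { return m[col * 4 + row]; }

// Builds a rigid transform: rotation by `angle` about the z axis plus a translation.
Mat4 makeIsometryZ(float angle, float tx, float ty, float tz) {
    Mat4 m{};  // zero-initialized
    const float c = std::cos(angle), s = std::sin(angle);
    at(m, 0, 0) = c;  at(m, 0, 1) = -s;
    at(m, 1, 0) = s;  at(m, 1, 1) = c;
    at(m, 2, 2) = 1.0f;
    at(m, 0, 3) = tx; at(m, 1, 3) = ty; at(m, 2, 3) = tz;
    at(m, 3, 3) = 1.0f;
    return m;
}

// Full product: treats both operands as general 4x4 matrices (64 multiplies).
Mat4 mulFull(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < 4; ++k) sum += at(a, row, k) * at(b, k, col);
            at(r, row, col) = sum;
        }
    return r;
}

// Affine-only product: assumes the last row of a and b is [0 0 0 1] (36 multiplies).
Mat4 mulAffine(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 3; ++row) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < 3; ++k) sum += at(a, row, k) * at(b, k, col);
            // Only the translation column picks up a's translation.
            if (col == 3) sum += at(a, row, 3);
            at(r, row, col) = sum;
        }
    at(r, 3, 3) = 1.0f;  // last row is fixed to [0 0 0 1]
    return r;
}

// Element-wise comparison with an absolute tolerance.
bool nearlyEqual(const Mat4& x, const Mat4& y, float tol) {
    for (std::size_t i = 0; i < 16; ++i)
        if (std::fabs(x[i] - y[i]) > tol) return false;
    return true;
}
```

For two isometries a and b, mulAffine(a, b) and mulFull(a, b) should agree up to rounding, so the choice between them is purely a performance question.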

On December 17, 2012 at 11:35 AM Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
ok, thank you.

I implemented my own too, and I get the following results ('u' stands for unaligned):
Vec3 /Iso3 : 0.000645647
Vec3u/Iso3u: 0.000645586
Vec4 /Iso3 : 0.00126092
Vec4u/Iso3u: 0.000688231
Vec4 /Mat4 : 0.000391356
Vec4u/Mat4u: 0.000981292
Vec3 /Iso3 : 0.000649383
Vec3u/Iso3u: 0.000649382
Vec4 /Iso3 : 0.00126092
Vec4u/Iso3u: 0.000689219
Vec4 /Mat4 : 0.000475642
Vec4u/Mat4u: 0.000979767
obtained with gcc 4.6, -O2 -DNDEBUG.
The only strange result is the case "Vec4 /Iso3", which is unexpectedly twice as slow as "Vec4u/Iso3u". Looking at the assembly, the only difference between the two versions is that in the "Vec4 /Iso3" case the result is first copied into a temporary (one coefficient at a time), which is then copied to the true result location using 2 movaps. Clearly, the compiler should be able to remove this extra copy, as it does in the "Vec4u/Iso3u" case, but here it does not, and I do not know why.
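
As background for the aligned/unaligned labels above, here is a small sketch in plain C++ (no Eigen) of what 16-byte alignment changes for a 4-float vector. It assumes that 'u' corresponds to storage with only the natural alignment of the scalar type. Vec4Aligned, Vec4Unaligned, PaddedAligned, PaddedUnaligned and isAligned16 are hypothetical names, not Eigen types. Aligned moves such as movaps require the 16-byte guarantee; without it, unaligned loads and stores have to be used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four floats forced onto a 16-byte boundary (what aligned SSE moves need).
struct alignas(16) Vec4Aligned { float v[4]; };

// Four floats with only the natural alignment of float.
struct Vec4Unaligned { float v[4]; };

// Embedding each vector after a single float shows the layout difference:
// the aligned member is pushed to offset 16, the unaligned one follows immediately.
struct PaddedAligned   { float pad; Vec4Aligned v; };
struct PaddedUnaligned { float pad; Vec4Unaligned v; };

// True if the address is a multiple of 16 bytes.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

The cost of the alignment guarantee is padding: PaddedAligned is larger than PaddedUnaligned even though both hold five floats.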

On Mon, Dec 17, 2012 at 10:07 AM, Jakob Schwendner <jakob.schwendner@xxxxxxx> wrote:
So I cleaned up my benchmarking a little and added a test to the bench folder, which should be consistent with the rest of the benchmarks there:
I ran the benchmark on two different systems:
And I still get better results for DontAlign in situations where I wouldn't expect it. It might be my benchmarking, though...

Jakob Schwendner, M.Sc.

DFKI Bremen
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Phone: +49 (0)421 17845-4120
Fax: +49 (0)421 17845-4150
E-Mail: jakob.schwendner@xxxxxxx

