Re: [eigen] Transform class performance and inconsistencies


On 17.12.2012 12:24, Jakob Schwendner wrote:
> ok, thanks for confirming this. The other strange case, that you can see in my
> benchmark is:

For the record, I can't reproduce this using g++-4.7.1 (on 32-bit Linux). I get the following timings (using Jakob's benchmark):

$ g++ benchGeometry.cpp -O2 -DNDEBUG -march=native -I .. -lrt  -mfpmath=sse && ./a.out
vec = trans * vec
float	 Isometry	 AutoAlign 3 0.0267s
float	 Isometry	 DontAlign 3 0.0284s
float	 Isometry	 AutoAlign 4 0.0189s
float	 Isometry	 DontAlign 4 0.0208s
float	 Projective	 AutoAlign 4 0.0111s
float	 Projective	 DontAlign 4 0.0318s
double	 Isometry	 AutoAlign 3 0.0257s
double	 Isometry	 DontAlign 3 0.0407s
double	 Isometry	 AutoAlign 4 0.0196s
double	 Isometry	 DontAlign 4 0.0206s
double	 Projective	 AutoAlign 4 0.0186s
double	 Projective	 DontAlign 4 0.0258s
vec = trans.matrix() * vec
float	 Isometry	 AutoAlign 4 0.0109s
float	 Isometry	 DontAlign 4 0.0172s
double	 Isometry	 AutoAlign 4 0.0186s
double	 Isometry	 DontAlign 4 0.0272s
trans = trans1 * trans
float	 Isometry	 AutoAlign   0.0619s
float	 Isometry	 DontAlign   0.0765s
double	 Isometry	 AutoAlign   0.0636s
double	 Isometry	 DontAlign   0.0955s
float	 Projective	 AutoAlign   0.0279s
float	 Projective	 DontAlign   0.1029s
double	 Projective	 AutoAlign   0.0565s
double	 Projective	 DontAlign   0.1035s
trans = trans1.matrix() * trans.matrix()
float	 Isometry	 AutoAlign   0.0246s
float	 Isometry	 DontAlign   0.0998s
double	 Isometry	 AutoAlign   0.0569s
double	 Isometry	 DontAlign   0.0941s

I also did not see any strange copies for the vector variants; the aligned variants all end with something similar to:
	mulps	%xmm2, %xmm0
	addps	%xmm1, %xmm0
	movaps	%xmm0, (%eax)
	ret	$4

However, all variants where the result does not fit into a register lead to a copy of the result (with aligned moves where possible, though). So either RVO is not applied optimally there, or it is not possible due to aliasing. This may not occur on 64-bit machines, which have more registers.

Still, the timings show that, e.g., an aligned Isometry * X could be optimized by evaluating it as Projective * X in most cases (except, perhaps, when the last row of the Isometry is not stored explicitly).
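
To illustrate why evaluating an isometry as a general 4x4 product would be safe, here is a minimal sketch in plain C++ (deliberately not using Eigen, so all type and function names below are hypothetical). It assumes the last row (0, 0, 0, 1) is stored explicitly, in which case the full product and the affine-style product give the same result:

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major storage

// General (projective-style) product: all four rows are evaluated.
Vec4 mulFull(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Affine-style product: only the top three rows are evaluated and the
// last row is assumed (not read) to be (0, 0, 0, 1).
Vec4 mulAffine(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    r[3] = v[3];
    return r;
}

// Hypothetical test data: 90 degree rotation about z plus translation
// (1, 2, 3), with the last row stored explicitly.
Mat4 exampleIsometry() {
    Mat4 m{};
    m[0] = {0.0f, -1.0f, 0.0f, 1.0f};
    m[1] = {1.0f,  0.0f, 0.0f, 2.0f};
    m[2] = {0.0f,  0.0f, 1.0f, 3.0f};
    m[3] = {0.0f,  0.0f, 0.0f, 1.0f};
    return m;
}
```

Whether the full product is actually faster depends on how well the compiler vectorizes each variant; the sketch only shows that the two agree when the last row is present.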

As for the non-compiling vec3 = Projective * vec3 from your first mail, I don't think this should be allowed, since for arbitrary Projective matrices you can't make any assumptions about the last component of the result.
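
A small plain C++ sketch of the point about the last component (again not using Eigen; all names and the example matrix are hypothetical). For a general projective matrix the w component of the result is not 1, so a 3-vector can only be recovered after an explicit divide by w. In Eigen, homogeneous() and hnormalized() serve this purpose.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major storage

// Lift the point to homogeneous coordinates (w = 1) and apply the full 4x4.
Vec4 applyHomogeneous(const Mat4& m, const Vec3& p) {
    const Vec4 h = {p[0], p[1], p[2], 1.0f};
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * h[j];
    return r;
}

// Perspective divide; the caller must make sure that w is non-zero.
Vec3 hnormalize(const Vec4& h) {
    return {h[0] / h[3], h[1] / h[3], h[2] / h[3]};
}

// Hypothetical projective matrix whose last row is (0, 0, 1, 0),
// so the resulting w equals the z coordinate of the input point.
Mat4 examplePerspective() {
    Mat4 m{};
    m[0] = {1.0f, 0.0f, 0.0f, 0.0f};
    m[1] = {0.0f, 1.0f, 0.0f, 0.0f};
    m[2] = {0.0f, 0.0f, 1.0f, 0.0f};
    m[3] = {0.0f, 0.0f, 1.0f, 0.0f};
    return m;
}
```

Applied to the point (2, 4, 2), this matrix yields w = 2, so silently dropping the last component would give a different point than the normalized one.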


Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
