Re: [eigen] Transform class performance and inconsistencies
On 17.12.2012 12:24, Jakob Schwendner wrote:
> ok, thanks for confirming this. The other strange case, that you can see in my
> benchmark is:
For the record, I can't reproduce this using g++-4.7.1 (on 32-bit Linux).
I get the following timings (using Jakob's benchmark):
$ g++ benchGeometry.cpp -O2 -DNDEBUG -march=native -I .. -lrt -mfpmath=sse && ./a.out
vec = trans * vec
float Isometry AutoAlign 3 0.0267s
float Isometry DontAlign 3 0.0284s
float Isometry AutoAlign 4 0.0189s
float Isometry DontAlign 4 0.0208s
float Projective AutoAlign 4 0.0111s
float Projective DontAlign 4 0.0318s
double Isometry AutoAlign 3 0.0257s
double Isometry DontAlign 3 0.0407s
double Isometry AutoAlign 4 0.0196s
double Isometry DontAlign 4 0.0206s
double Projective AutoAlign 4 0.0186s
double Projective DontAlign 4 0.0258s
vec = trans.matrix() * vec
float Isometry AutoAlign 4 0.0109s
float Isometry DontAlign 4 0.0172s
double Isometry AutoAlign 4 0.0186s
double Isometry DontAlign 4 0.0272s
trans = trans1 * trans
float Isometry AutoAlign 0.0619s
float Isometry DontAlign 0.0765s
double Isometry AutoAlign 0.0636s
double Isometry DontAlign 0.0955s
float Projective AutoAlign 0.0279s
float Projective DontAlign 0.1029s
double Projective AutoAlign 0.0565s
double Projective DontAlign 0.1035s
trans = trans1.matrix() * trans.matrix()
float Isometry AutoAlign 0.0246s
float Isometry DontAlign 0.0998s
double Isometry AutoAlign 0.0569s
double Isometry DontAlign 0.0941s
I also did not see any strange copies for the vector variants; the
aligned variants all end with something similar to:
mulps %xmm2, %xmm0
addps %xmm1, %xmm0
movaps %xmm0, (%eax)
ret $4
However, all variants where the result does not fit into a register lead
to a copy of the result (with aligned moves where possible, though), so
either RVO (return value optimization) is not optimal there, or it is not
possible due to aliasing. This may not occur on 64-bit machines, since
they have more registers.
Still, the timings show that, e.g., an aligned Isometry * X could be
computed the same way as Projective * X in most cases (except perhaps
when the last row of the Isometry is not stored explicitly).
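
To illustrate why that substitution would be legal, here is a minimal
sketch in plain C++ (deliberately not using Eigen; mulFull, mulAffine and
exampleIsometry are hypothetical names made up for this example). It
assumes the last row of the isometry is stored as (0, 0, 0, 1). Under
that assumption the full 4x4 product and the affine shortcut give the
same result. It says nothing about speed, only about equivalence.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;

// Full 4x4 product, i.e. what a projective transform has to compute.
Vec4 mulFull(const Mat4& m, const Vec4& v) {
    Vec4 r = {{0.0, 0.0, 0.0, 0.0}};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Affine shortcut: only the top three rows are multiplied and the last
// component is copied. Only valid if the last row is (0, 0, 0, 1).
Vec4 mulAffine(const Mat4& m, const Vec4& v) {
    Vec4 r = {{0.0, 0.0, 0.0, 0.0}};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    r[3] = v[3];
    return r;
}

// Example isometry (assumption for this sketch): rotation by 90 degrees
// about the z axis followed by a translation of (1, 2, 3).
Mat4 exampleIsometry() {
    const Mat4 m = {{
        {{0.0, -1.0, 0.0, 1.0}},
        {{1.0,  0.0, 0.0, 2.0}},
        {{0.0,  0.0, 1.0, 3.0}},
        {{0.0,  0.0, 0.0, 1.0}}
    }};
    return m;
}
```

For the homogeneous point (1, 0, 0, 1), both mulFull and mulAffine
return (1, 3, 3, 1) with this matrix.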
As for the non-compiling vec3 = Projective * vec3 from your first mail,
I don't think this should be allowed, since for arbitrary Projective
matrices you can't make any assumptions about the last component of the
result.
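
A small sketch of that argument, again in plain C++ rather than Eigen
(transformPoint, translationExample and perspectiveExample are
hypothetical names, and both matrices are just assumed examples): a 3D
point is lifted to (x, y, z, 1) and multiplied by a 4x4 matrix. With a
last row of (0, 0, 0, 1) the last component stays 1 and can be dropped
safely; with a general projective last row it does not.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;

// Lift (x, y, z) to (x, y, z, 1) and multiply by the 4x4 matrix m.
Vec4 transformPoint(const Mat4& m, double x, double y, double z) {
    const Vec4 v = {{x, y, z, 1.0}};
    Vec4 r = {{0.0, 0.0, 0.0, 0.0}};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Translation by (1, 2, 3): last row is (0, 0, 0, 1), so w stays 1.
Mat4 translationExample() {
    const Mat4 m = {{
        {{1.0, 0.0, 0.0, 1.0}},
        {{0.0, 1.0, 0.0, 2.0}},
        {{0.0, 0.0, 1.0, 3.0}},
        {{0.0, 0.0, 0.0, 1.0}}
    }};
    return m;
}

// Simple perspective-like matrix: last row is (0, 0, 1, 0), so the
// resulting w equals the z coordinate of the input point.
Mat4 perspectiveExample() {
    const Mat4 m = {{
        {{1.0, 0.0, 0.0, 0.0}},
        {{0.0, 1.0, 0.0, 0.0}},
        {{0.0, 0.0, 1.0, 0.0}},
        {{0.0, 0.0, 1.0, 0.0}}
    }};
    return m;
}
```

For the point (2, 4, 2), the translation yields (3, 6, 5, 1), while the
perspective example yields (2, 4, 2, 2). Silently dropping the last
component in the second case would give (2, 4, 2) instead of the
dehomogenized (1, 2, 1), which is why an implicit vec3 result is
ambiguous for Projective.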
Regards,
Christoph
--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen
Tel: +49 (421) 218-64252
----------------------------------------------