[eigen] Transform class performance and inconsistencies


Hi,

I was curious about the performance of the Transform class in the
Geometry module for different cases and did some benchmarking. While
doing that, I noticed some things that I certainly would not have
expected. What I wanted to find out was the performance of the
Transform * Vector and Transform * Transform operations, for both float
and double, aligned as well as unaligned.

First case:
typedef Transform<float, 3, Isometry, AutoAlign> Trans;
typedef Matrix<float, 3, 1, AutoAlign> Vec;
Vec v = Vec::Zero(); Trans t = Trans::Identity();
loop 100000000 (using function call in separate compilation unit):
Vec res = t * v;
will result in 760 kcycles. The same code, but with
typedef Transform<float, 3, Isometry, DontAlign> Trans;
typedef Matrix<float, 3, 1, DontAlign> Vec;

takes 540 kcycles. I have no idea why this would happen, since Vector3f
is not an alignable type, right?

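As a quick sanity check on the alignment question, here is a minimal sketch. It assumes the rule, as I understand it from the Eigen documentation, that a fixed-size type is only aligned and vectorized when its size is a multiple of 16 bytes. It works on plain sizes rather than Eigen types, and `is_16_byte_multiple` is a hypothetical helper name, not part of Eigen.

```cpp
#include <cstddef>

// Hypothetical helper (not an Eigen API): true when a fixed-size vector of
// N scalars of type T occupies a multiple of 16 bytes, which is the assumed
// condition for a fixed-size type to be aligned and vectorized with SSE.
template <typename T, std::size_t N>
constexpr bool is_16_byte_multiple() {
    return (sizeof(T) * N) % 16 == 0;
}

// These checks assume 4-byte float and 8-byte double (typical x86 targets).
static_assert(!is_16_byte_multiple<float, 3>(),  "3 floats  = 12 bytes");
static_assert( is_16_byte_multiple<float, 4>(),  "4 floats  = 16 bytes");
static_assert(!is_16_byte_multiple<double, 3>(), "3 doubles = 24 bytes");
static_assert( is_16_byte_multiple<double, 4>(), "4 doubles = 32 bytes");
```

Under that assumption a 3-component vector, float or double, should behave the same with AutoAlign and DontAlign, which is why the difference measured in the first case is surprising.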
Second case:
Same as first case, but now I use
typedef Matrix<float, 4, 1, AutoAlign> Vec;
and
typedef Matrix<float, 4, 1, DontAlign> Vec;

and that gets me 850 kcycles for the aligned and 560 kcycles for the
unaligned case. Here I would have expected a performance boost, since I
thought Vector4f is vectorizable.

Third case:
Again using
typedef Matrix<float, 4, 1, AutoAlign> Vec;
typedef Matrix<float, 4, 1, DontAlign> Vec;
but now I use
Vec res = t.matrix() * v;

This results in 410 kcycles for the aligned and 1080 kcycles for the
unaligned case. This is more what I would have expected. I guess the
penalty in the unaligned case comes from the fact that the transform is
tagged as an Isometry, so t * v only has to compute the affine part,
whereas t.matrix() * v performs the full matrix product. The aligned
case makes up for it using vectorization. Note: the performance gain can
be seen for both float and double (I leave out the numbers to not add to
the confusion).

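To make the "affine part only" guess concrete, here is a minimal sketch using plain arrays instead of Eigen types. It is not Eigen's actual implementation. The names `full_product` and `affine_product` are hypothetical, and the row-major layout is chosen for readability only.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // m[row][col], for readability

// Full homogeneous product: computes all four rows (16 multiplies).
Vec4 full_product(const Mat4& m, const Vec4& v) {
    Vec4 r{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Affine shortcut: assumes the last row of m is [0 0 0 1], so only the top
// three rows are computed (12 multiplies) and w is passed through unchanged.
Vec4 affine_product(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    r[3] = v[3];
    return r;
}
```

Both functions return the same result whenever the last row is [0 0 0 1]; the shortcut does less arithmetic, but the full product maps more directly onto 4-wide SIMD operations when the data is aligned.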
Fourth case:
This time performing a transform * transform product
typedef Transform<float, 3, Isometry, AutoAlign> Trans;
Trans t1, t2;
Trans res = t1 * t2;
gives 1850 kcycles for aligned and 1730 kcycles for unaligned.
When I do
Trans res = t1.matrix() * t2.matrix();

it's 1150 kcycles for aligned and 3620 kcycles for unaligned. So a
speedup in the aligned case. Notably, though, this speedup is not
visible for double.

Fifth case:
Switching to projective transform
typedef Transform<float, 3, Projective, AutoAlign> Trans;
typedef Matrix<float, 3, 1, AutoAlign> Vec;
Vec v = Vec::Zero(); Trans t = Trans::Identity();
Vec res = t * v;

actually results in a compile error. This is very unexpected. To get it
to compile,
Vec res = (t * v.homogeneous()).head<3>();

is required. When Transform is set to Projective, the cases with the
obvious alignment are faster than the unaligned ones (again not for
transform * transform in the double case).

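For reference, here is a minimal sketch of what the homogeneous workaround computes, again with plain arrays rather than Eigen types. The function names mirror the Eigen calls `homogeneous()`, `head<3>()` and `hnormalized()` but are hypothetical stand-ins written for this example, and `product` is an ordinary 4x4 matrix-vector product.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // m[row][col]

// Stand-in for v.homogeneous(): append w = 1.
Vec4 homogeneous(const Vec3& v) { return {{v[0], v[1], v[2], 1.0f}}; }

// Full 4x4 matrix * 4-vector product.
Vec4 product(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Stand-in for .head<3>(): drop w without dividing by it.
Vec3 head3(const Vec4& h) { return {{h[0], h[1], h[2]}}; }

// Stand-in for .hnormalized(): divide by w, then drop it.
Vec3 hnormalized(const Vec4& h) {
    return {{h[0] / h[3], h[1] / h[3], h[2] / h[3]}};
}
```

The two ways of going back to three components agree only when w stays 1, that is when the last row of the matrix is [0 0 0 1]. With Trans::Identity() as in the benchmark this holds, so head<3>() gives the expected result there.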

A lot of this I did not expect based on the documentation... Maybe
someone can enlighten me?
Test environment: GCC 4.6.3, latest Eigen head, Core i7 CPU. Tests
compiled with -O3.

cheers,
Jakob