Re: [eigen] geometry module
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] geometry module
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Wed, 8 Sep 2010 14:04:11 +0200
On Wed, Sep 8, 2010 at 1:18 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/9/8 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>> On Tue, Sep 7, 2010 at 12:57 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>> Here, "most efficiently" depends on what you're doing. If you want to
>>> apply this transformation to a vector, it's going to be faster if you
>>> have a matrix representation of your transform, as the Transform class
>>> does. This is one of the most performance-critical use cases...
>>
>> Some numbers for transforming N 3D vectors stored in a 3xN column-major
>> matrix, using a 3x3 matrix, a quaternion with the quaternion x single
>> vector product, and a quaternion converted on the fly to a 3x3 matrix.
>> The times are in seconds for 100000 runs (in the last case the
>> quaternion is converted to a matrix 100000 times).
>>
>> N           1          2          3         4         5         6         7         8
>> matrix 3x3  0.0007521  0.0008807  0.001357  0.002339  0.002869  0.003583  0.004301  0.02684
>> quaternion  0.001332   0.002183   0.003098  0.004002  0.004913  0.005945  0.007081  0.007997
>> quat-mat    0.001165   0.00152    0.001822  0.002925  0.003396  0.003964  0.004615  0.02727
>>
>> As expected, the matrix product is significantly faster. What is
>> surprising is that even for transforming a single vector (N=1), it is
>> faster to convert the quaternion to a matrix and then perform the
>> matrix product than to use the optimized quaternion-vector product
>> directly, even though it needs more operations. The respective costs are:
>>
>> 3x3 matrix : 9 mul + 6 add = 15 ops
>> quaternion : 15 mul + 15 add = 30 ops
>> quat-mat : 18 mul + 21 add = 39 ops
>>
>> These numbers come directly from the assembly, where we can see that
>> gcc replaced "2 * v" with "v+v".
>>
>> Also, Daniel, you might be interested to know that this benchmark is in
>> bench/quaternion.cpp (in trunk).
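
The operation counts quoted above can be reproduced with a small counting experiment. This is only a sketch under stated assumptions: the scalar wrapper `S`, the counters `g_mul`/`g_add` and the functions `apply`, `rotate` and `to_matrix` are hypothetical names, not Eigen API, and the formulas are one standard way to write these products (t = 2 (u x v), v' = v + w t + u x t for the quaternion), with "2 * v" written as "v + v" as in the assembly mentioned above. It does not claim to be the code that was benchmarked.

```cpp
// Hypothetical instrumented scalar: every * and every +/- bumps a counter.
static int g_mul = 0, g_add = 0;
struct S { double v; };
inline S operator*(S a, S b) { ++g_mul; return S{a.v * b.v}; }
inline S operator+(S a, S b) { ++g_add; return S{a.v + b.v}; }
inline S operator-(S a, S b) { ++g_add; return S{a.v - b.v}; }

struct Vec3 { S x, y, z; };
struct Mat3 { S m[3][3]; };
struct Quat { S w, x, y, z; };  // assumed to be normalized

// 3x3 matrix * vector: 9 mul + 6 add.
Vec3 apply(const Mat3& R, const Vec3& v) {
    return Vec3{R.m[0][0] * v.x + R.m[0][1] * v.y + R.m[0][2] * v.z,
                R.m[1][0] * v.x + R.m[1][1] * v.y + R.m[1][2] * v.z,
                R.m[2][0] * v.x + R.m[2][1] * v.y + R.m[2][2] * v.z};
}

// Cross product: 6 mul + 3 add.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// Quaternion * vector: 15 mul + 15 add in total.
Vec3 rotate(const Quat& q, const Vec3& v) {
    const Vec3 u{q.x, q.y, q.z};
    Vec3 t = cross(u, v);                       // 6 mul + 3 add
    t = Vec3{t.x + t.x, t.y + t.y, t.z + t.z};  // "2 * t" as "t + t": 3 add
    const Vec3 c = cross(u, t);                 // 6 mul + 3 add
    return Vec3{v.x + q.w * t.x + c.x,          // 3 mul + 6 add
                v.y + q.w * t.y + c.y,
                v.z + q.w * t.z + c.z};
}

// Quaternion -> rotation matrix: 9 mul + 15 add.
Mat3 to_matrix(const Quat& q) {
    const S one{1.0};
    const S tx = q.x + q.x, ty = q.y + q.y, tz = q.z + q.z;  // 3 add
    const S twx = tx * q.w, twy = ty * q.w, twz = tz * q.w;  // 3 mul
    const S txx = tx * q.x, txy = ty * q.x, txz = tz * q.x;  // 3 mul
    const S tyy = ty * q.y, tyz = tz * q.y, tzz = tz * q.z;  // 3 mul
    Mat3 R = {{{one - (tyy + tzz), txy - twz, txz + twy},    // 12 add
               {txy + twz, one - (txx + tzz), tyz - twx},
               {txz - twy, tyz + twx, one - (txx + tyy)}}};
    return R;
}
```

Resetting the counters before each call, `rotate(q, v)` tallies 15 mul + 15 add, and `apply(to_matrix(q), v)` tallies 18 mul + 21 add, of which 9 mul + 6 add belong to the matrix-vector product alone.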
>
> Thanks a lot for these numbers!
>
> Do you think that quaternion*vector3D has room to be improved by
> copying the vector3d into a vector4d and applying the vectorizable
> quaternion*vector4D product? I am worried about the 4th component: if
> it would be required to divide by it, that could kill the benefit.
It is even worse. I simply tried copying the input into a vector4f
and using the vectorized cross3 function. The result is only 5 pmul
and 5 padd, yet it is slower:
N 1
matrix 3x3 0.0007607
quaternion 0.002226
quat-mat 0.001178
The problem is not the copies, which gcc optimizes away well, but
the 5 extra shuffles. Maybe some of the shuffling can be removed by
directly vectorizing the quaternion * vector product (we currently
vectorize quat*quat only).
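
To make the trade-off concrete, here is a rough scalar emulation of the packed version. It is a sketch under assumptions, not the code that was benchmarked: `Packet` stands in for one 4-float SSE register, and `pmul`, `padd`, `psub`, `yzx`, `splat_w`, `cross3` and `rotate4` are hypothetical helpers that only count operations. The cross product uses one common shuffle-based formulation, and the number of shuffles it needs depends on that choice, so it need not match what gcc emitted.

```cpp
#include <array>

// Hypothetical stand-in for one 4-float SSE register.
using Packet = std::array<float, 4>;
static int n_pmul = 0, n_padd = 0, n_shuf = 0;

Packet pmul(const Packet& a, const Packet& b) {
    ++n_pmul;
    return {a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3]};
}
Packet padd(const Packet& a, const Packet& b) {
    ++n_padd;
    return {a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
}
// Packed subtraction, tallied together with packed additions.
Packet psub(const Packet& a, const Packet& b) {
    ++n_padd;
    return {a[0] - b[0], a[1] - b[1], a[2] - b[2], a[3] - b[3]};
}
// Shuffle (x, y, z, w) -> (y, z, x, w).
Packet yzx(const Packet& a) {
    ++n_shuf;
    return {a[1], a[2], a[0], a[3]};
}
// Shuffle that broadcasts lane 3 to all lanes.
Packet splat_w(const Packet& a) {
    ++n_shuf;
    return {a[3], a[3], a[3], a[3]};
}

// Cross product on lanes 0..2: a x b = yzx(a * yzx(b) - yzx(a) * b).
// a_yzx is passed in so that it can be shared between calls.
Packet cross3(const Packet& a, const Packet& a_yzx, const Packet& b) {
    return yzx(psub(pmul(a, yzx(b)), pmul(a_yzx, b)));
}

// q = (x, y, z, w), assumed normalized; v = (vx, vy, vz, 0).
// Same formula as the scalar product: t = 2 (u x v), v' = v + w t + u x t.
Packet rotate4(const Packet& q, const Packet& v) {
    const Packet q_yzx = yzx(q);            // 1 shuffle, reused twice
    Packet t = cross3(q, q_yzx, v);         // 2 pmul + 1 psub + 2 shuffles
    t = padd(t, t);                         // 1 padd
    const Packet c = cross3(q, q_yzx, t);   // 2 pmul + 1 psub + 2 shuffles
    return padd(padd(v, pmul(splat_w(q), t)), c);  // 1 pmul + 2 padd + 1 shuffle
}
```

One call to `rotate4` tallies 5 packed multiplications and 5 packed additions/subtractions, matching the counts above, but in this arrangement it also needs 6 shuffles, and each cross product is a dependent chain of shuffles and arithmetic.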
gael