Re: [eigen] Rigid transformations in eigen: use of dual quaternions



On Sat, Sep 12, 2009 at 10:34 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2009/9/12 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>
>> hi,
>>
>> I don't have much time right now to address all the issues raised in that
>> thread, but at least:
>>
>> - here we are speaking about 4 scalars only, and so there is no
>> advantage in using expression templates (wrt performance). So here
>> returning by value is fine.
>
> really? for simple enough expressions, when there are enough
> registers, OK, but for example, without vectorization, there won't be
> enough registers on x86 to do s1*q1+s2*q2 efficiently, right?

yes, I tried for you and it's even a bit faster.

here is the asm with ET:

	movss	(%rsi), %xmm2
	movss	(%rdi), %xmm3
	mulss	%xmm1, %xmm2
	mulss	%xmm0, %xmm3
	addss	%xmm3, %xmm2
	movss	%xmm2, (%rdx)
	movss	4(%rsi), %xmm2
	movss	4(%rdi), %xmm3
	mulss	%xmm1, %xmm2
	mulss	%xmm0, %xmm3
	addss	%xmm3, %xmm2
	movss	%xmm2, 4(%rdx)
	movss	8(%rsi), %xmm2
	movss	8(%rdi), %xmm3
	mulss	%xmm1, %xmm2
	mulss	%xmm0, %xmm3
	addss	%xmm3, %xmm2
	movss	%xmm2, 8(%rdx)
	mulss	12(%rsi), %xmm1
	mulss	12(%rdi), %xmm0
	addss	%xmm0, %xmm1
	movss	%xmm1, 12(%rdx)

and now with return by value:

	movss	4(%rsi), %xmm4
	movss	4(%rdi), %xmm2
	mulss	%xmm1, %xmm4
	mulss	%xmm0, %xmm2
	movss	8(%rsi), %xmm3
	mulss	%xmm1, %xmm3
	movss	12(%rdi), %xmm5
	mulss	%xmm0, %xmm5
	addss	%xmm2, %xmm4
	movss	8(%rdi), %xmm2
	mulss	%xmm0, %xmm2
	mulss	(%rdi), %xmm0
	addss	%xmm2, %xmm3
	movss	12(%rsi), %xmm2
	mulss	%xmm1, %xmm2
	mulss	(%rsi), %xmm1
	movss	%xmm4, 4(%rdx)
	movss	%xmm3, 8(%rdx)
	addss	%xmm5, %xmm2
	addss	%xmm0, %xmm1
	movss	%xmm2, 12(%rdx)
	movss	%xmm1, (%rdx)

with gcc 4.4.
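
For reference, the two variants being compared have roughly the following shape. This is only a minimal sketch: `Vec4`, `blend_by_value` and `blend_fused` are hypothetical names standing in for the 4 coefficients of a quaternion, not Eigen's actual classes, and the fused loop is just what an expression template would be expected to expand to for s1*q1+s2*q2.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-scalar type standing in for a quaternion's coefficients.
// This is NOT Eigen's Quaternion class.
struct Vec4 {
    std::array<float, 4> c;
};

// Return-by-value operators: each call yields a temporary Vec4.
inline Vec4 operator*(float s, const Vec4& v) {
    return Vec4{{s * v.c[0], s * v.c[1], s * v.c[2], s * v.c[3]}};
}

inline Vec4 operator+(const Vec4& a, const Vec4& b) {
    return Vec4{{a.c[0] + b.c[0], a.c[1] + b.c[1],
                 a.c[2] + b.c[2], a.c[3] + b.c[3]}};
}

// Variant 1: s1*q1 + s2*q2 built from temporaries returned by value.
void blend_by_value(float s1, const Vec4& q1,
                    float s2, const Vec4& q2, Vec4& out) {
    out = s1 * q1 + s2 * q2;
}

// Variant 2: a single coefficient-wise loop with no temporaries,
// i.e. the form an expression template is meant to produce.
void blend_fused(float s1, const Vec4& q1,
                 float s2, const Vec4& q2, Vec4& out) {
    for (std::size_t i = 0; i < 4; ++i) {
        out.c[i] = s1 * q1.c[i] + s2 * q2.c[i];
    }
}
```

Both functions compute the same result; whether the compiler keeps the temporaries of the first variant in registers depends on the compiler, flags and target, so the generated asm should be checked rather than assumed.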

>>
>> - I'm not in favor of adding operator+ and scalar multiplication to the
>> Quaternion class, for the same reasons as Benoit.
>>
>> - Quaternion::operator=(MatrixBase<>) is to convert a rotation matrix
>> to a quaternion, so quat = q1.coeffs() + q2.coeffs(); won't work.
>
> oops, i forgot about that. How about a meta selector to give a
> different behavior depending on IsVectorAtCompileTime ?
>
>>
>> - I don't really like the idea of adding operator() as a shortcut for
>> coeffs(), because, e.g., q1() looks very weird. Currently, the only
>> use case is to write the DualQuaternion class, and here it is
>> perfectly fine to use .coeffs().
>
> OK
>
> Benoit
>
>
>
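
The meta selector on IsVectorAtCompileTime suggested in the quoted text could look something like the sketch below. All names here (`Coeffs4`, `Rot3x3`, `Quat`, `AssignPath`, `assign`) are hypothetical stand-ins, not Eigen types, and the rotation-matrix conversion itself is left out; the sketch only shows how the compile-time flag would pick between the two behaviors.

```cpp
#include <type_traits>

// Hypothetical stand-ins for Eigen expressions. Only the compile-time
// flag mirrors the IsVectorAtCompileTime idea from the thread.
struct Coeffs4 {
    enum { IsVectorAtCompileTime = 1 };
    float v[4];
};

struct Rot3x3 {
    enum { IsVectorAtCompileTime = 0 };
    float m[3][3];
};

// Reports which behavior was selected, so the dispatch can be observed.
enum class AssignPath { CopyCoeffs, FromRotation };

// Hypothetical quaternion-like type, not Eigen's Quaternion.
struct Quat {
    float c[4];

    // Entry point: select the implementation from the compile-time flag.
    template <typename Expr>
    AssignPath assign(const Expr& e) {
        typedef std::integral_constant<
            bool, bool(Expr::IsVectorAtCompileTime)> IsVector;
        return assign_impl(e, IsVector());
    }

private:
    // Vector expression: copy the 4 coefficients directly.
    template <typename Expr>
    AssignPath assign_impl(const Expr& e, std::true_type) {
        for (int i = 0; i < 4; ++i) {
            c[i] = e.v[i];
        }
        return AssignPath::CopyCoeffs;
    }

    // Matrix expression: treat it as a rotation matrix. The actual
    // matrix-to-quaternion conversion is omitted in this sketch.
    template <typename Expr>
    AssignPath assign_impl(const Expr&, std::false_type) {
        return AssignPath::FromRotation;
    }
};
```

Only the selected overload is instantiated for a given expression type, so the vector path can access members that the matrix type does not have, and vice versa.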


