Re: [eigen] Patch for quaternion normalization and cross product for Vector4f
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Patch for quaternion normalization and cross product for Vector4f
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Tue, 10 Mar 2009 11:43:30 +0100
Hi,
for your information, normalize was already automagically vectorized,
because it basically compiles to something like:
q.coeffs() /= ei_sqrt(q.coeffs().cwise().abs2().sum());
and all these operators are vectorized by Eigen, regardless of the
scalar type (float, double, int), provided the size is either Dynamic
or a multiple of the vector size. With -msse3 and -ffast-math it
compiles to:
movaps (%rdi), %xmm1
movaps %xmm1, %xmm0
mulps %xmm1, %xmm0
haddps %xmm0, %xmm0
haddps %xmm0, %xmm0
sqrtss %xmm0, %xmm2
movss .LC1(%rip), %xmm0
divss %xmm2, %xmm0
shufps $0, %xmm0, %xmm0
mulps %xmm1, %xmm0
movaps %xmm0, (%rdi)
and with only -msse2:
movaps (%rdi), %xmm1
movaps %xmm1, %xmm2
mulps %xmm1, %xmm2
movaps %xmm2, %xmm0
movhlps %xmm2, %xmm0
addps %xmm2, %xmm0
movaps %xmm0, %xmm2
shufps $1, %xmm0, %xmm2
addss %xmm2, %xmm0
movaps %xmm0, %xmm2
movss .LC1(%rip), %xmm0
sqrtss %xmm2, %xmm2
divss %xmm2, %xmm0
shufps $0, %xmm0, %xmm0
mulps %xmm1, %xmm0
movaps %xmm0, (%rdi)
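For readers who find intrinsics easier to follow than assembly, the SSE2 listing above can be approximated by hand as follows. This is only a sketch, not Eigen code: the helper name normalize4 is hypothetical, it uses unaligned loads and stores where the listing uses movaps, and it falls back to a scalar loop when SSE is not available.

```cpp
#include <cmath>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper (not part of Eigen): normalize four packed floats
// in place, i.e. v *= 1 / sqrt(v0^2 + v1^2 + v2^2 + v3^2).
inline void normalize4(float* v) {
#if defined(__SSE__)
    __m128 x   = _mm_loadu_ps(v);              // load 4 floats (unaligned, unlike movaps)
    __m128 sq  = _mm_mul_ps(x, x);             // mulps: squares of each lane
    __m128 hi  = _mm_movehl_ps(sq, sq);        // movhlps: lanes 2,3 moved to lanes 0,1
    __m128 s2  = _mm_add_ps(sq, hi);           // addps: lane0 = sq0+sq2, lane1 = sq1+sq3
    __m128 s1  = _mm_shuffle_ps(s2, s2, 1);    // shufps $1: bring lane1 into lane0
    __m128 sum = _mm_add_ss(s2, s1);           // addss: lane0 = full squared norm
    __m128 inv = _mm_div_ss(_mm_set_ss(1.0f),  // divss: 1 / sqrt(sum)
                            _mm_sqrt_ss(sum)); // sqrtss
    inv = _mm_shuffle_ps(inv, inv, 0);         // shufps $0: broadcast lane0 to all lanes
    _mm_storeu_ps(v, _mm_mul_ps(x, inv));      // mulps, then store the result
#else
    // Scalar fallback with the same arithmetic.
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) sum += v[i] * v[i];
    const float inv = 1.0f / std::sqrt(sum);
    for (int i = 0; i < 4; ++i) v[i] *= inv;
#endif
}
```

Calling normalize4 on an array holding the four quaternion coefficients should leave it with unit norm, up to rounding. Like the listings above, it does not guard against a zero-length input.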
About the cross product, the perf boost is pretty low (x1.25), so I
don't know if that's really worth it. It is much better to pack your
data such that 4 cross products can be performed at once.
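To illustrate what "packing" could look like, here is a minimal sketch of a structure-of-arrays layout in which each SIMD lane holds one independent 3D vector. The names Vec3x4 and cross4 are hypothetical and not part of Eigen; the loop is plain C++ written so that a compiler can map each line onto packed multiplies and subtracts, with no shuffles needed.

```cpp
#include <array>
#include <cstddef>

// Hypothetical structure-of-arrays packet: four 3D vectors stored as
// four x components, four y components and four z components.
struct Vec3x4 {
    std::array<float, 4> x, y, z;
};

// Four independent cross products r[i] = a[i] x b[i], one per lane.
inline Vec3x4 cross4(const Vec3x4& a, const Vec3x4& b) {
    Vec3x4 r;
    for (std::size_t i = 0; i < 4; ++i) {
        r.x[i] = a.y[i] * b.z[i] - a.z[i] * b.y[i];
        r.y[i] = a.z[i] * b.x[i] - a.x[i] * b.z[i];
        r.z[i] = a.x[i] * b.y[i] - a.y[i] * b.x[i];
    }
    return r;
}
```

With this layout every lane does useful work, whereas a single cross product on a Vector4f wastes the w lane and has to shuffle components around.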
Beyond that, yes "svn diff" is the right way to generate a patch.
cheers,
Gael.
On Tue, Mar 10, 2009 at 6:18 AM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
> Hi,
>
> Please find attached a patch which implements quaternion normalization
> and cross products for Vector4f. Obviously, the cross product isn't
> defined for 4D vectors, but this one treats its operands as 3-component
> vectors. The w component can be anything on input but will be set to
> zero in the result. As the patch mentions, it is mainly meant for
> people who use Vector4f instead of Vector3f in order to benefit from
> vectorization. I think this can be helpful to them, as previously the
> cross product had to be done using Vector3f.
>
> Quaternion normalization is obviously useful, and it is now vectorized.
>
> The patches are somewhat incomplete in the sense that I am not
> familiar with the C++ and the intrinsics bridge so much. I am just
> beginning to understand the guts of eigen and templates are even more
> confusing for me. You may have to write (small hopefully) pieces of
> glue code. On the intrinsics side, it should be complete, however. So,
> please indulge me for a while. I promise to send more complete patches
> in the future.
>
> Here's how it was generated.
>
> ~/eigen2@rpg-lab> svn update
> At revision 937616.
> ~/eigen2@rpg-lab> svn diff > rpg_patch
> ~/eigen2@rpg-lab> ls
> bench cmake CMakeLists.txt COPYING COPYING.LESSER
> CTestConfig.cmake demos disabled doc Doxyfile Eigen Mainpage.dox
> rpg_patch test unsupported
> ~/eigen2@rpg-lab>
>
> I hope that is the correct way to generate patches. If it isn't, please
> tell me what is.
>
> Regards,
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>