Re: [eigen] Patch for quaternion normalization and cross product for Vector4f
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Patch for quaternion normalization and cross product for Vector4f
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Wed, 11 Mar 2009 13:28:51 +0100
On Wed, Mar 11, 2009 at 12:58 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2009/3/11 Rohit Garg <rpg.314@xxxxxxxxx>:
>>> yes that's true that AoS is more convenient, and I agree it might be
>>> handy to have an optimized cross3. Honestly, when dealing with such
>>> small vectors, if your code includes many dot/cross products,
>>> normalizations, etc... then I doubt the speed up could be above x1.5.
>>> Significant speedup can only come from a more high level
>>> reorganization of your algorithm, and using complex memory layouts...
>>
>> Yes, but the problem is that many APIs (like OpenGL for instance)
>> want their input in AoS form. And 1.5x is any day better than 1x. :) And we
>> _do_ have vectorized dot's, add, mul, normalize, subtract etc. for
>> small vectors in eigen, don't we?
>
> For fixed-size vectors, we have vectorization only if we can have it
> without runtime overhead, which is only if we can align the array
> without wasting memory, which is only if the size is a multiple of 16
> bytes. So Vector4f yes, but Vector3f no.
>
> For dynamic sizes (typically larger), we have vectorization regardless
> of the size.
>
>>
>> Bottom line, what is the reason for such a function not being part of
>> the eigen's api when almost all the family of vector ops is? After
>> all, it only improves speed. :/ And it's an opt-in feature, not an
>> opt-out one.
>
> About the cross product:
> I can't form a strong opinion either way. I was concerned about
> maintenance and porting to other archs, but it seems to factor well
> through our wrappers. My main concern now is that it's going to be
> little used. On the other hand, it could be the first of a collection
> of functions working on 3d vectors stored as 4d vectors for
> vectorization; I would be OK to start a new subdirectory of
> unsupported/ containing such functions. After all there is a demand
> for such functions, but they fit poorly with the rest of Eigen (until
> perhaps we find a better API -- one possibility might be to have a
> variant of the Matrix class itself doing storage differently and
> reimplementing these methods) and put a much bigger burden on the user
> to take care of low-level aspects, but in unsupported/ that's ok.
>
> Your patch also contains a vectorized quaternion product. That would
> be extremely useful, I wonder why you don't mention it in your email?
> Do you consider it unfinished? I may also be out of date if Gael
> vectorized it already since the last time I checked.....
Actually, I've already applied the quaternion product patch.
About Vector3 ops using an aligned Vector4: currently I would say it's
up to the user to use Vector4 correctly, i.e. to ensure the last coeff is
zero when it really has to be 0 (dot/norm) and to use .start<3>() when
there is no alternative (minCoeff). Am I right in saying that, in that
context, the only missing feature is a vectorized cross product? If
so, then the easiest way is to add a cross3() function. Another option
would be to check the MaxSizeAtCompileTime value to automatically
vectorize:
a.start<3>().cross(b.start<3>())
The problem there is the return type, which has to be a Vector4.
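To make the cross3() idea concrete, here is a minimal sketch in plain C++.
It is neither Eigen code nor the patch under discussion: Vec4, cross3 and
dot4 are hypothetical names, and a real implementation would presumably use
SSE shuffles through the packet wrappers instead of scalar arithmetic. It
only illustrates the contract: inputs and output are 4-float vectors, the
fourth coefficient of the result is 0, and a dot product taken over all
four lanes is only the 3D dot product if the user keeps that coefficient
at zero.

```cpp
// Hypothetical stand-in for an aligned Vector4f: x, y, z plus one padding lane.
struct alignas(16) Vec4 {
    float v[4];
};

// Sketch of a cross3(): reads only the first three coefficients and writes
// 0 into the fourth, so the result is again a valid padded 3D vector.
inline Vec4 cross3(const Vec4& a, const Vec4& b) {
    return Vec4{{
        a.v[1] * b.v[2] - a.v[2] * b.v[1],
        a.v[2] * b.v[0] - a.v[0] * b.v[2],
        a.v[0] * b.v[1] - a.v[1] * b.v[0],
        0.0f
    }};
}

// Dot product over all four lanes, as a full-packet kernel would compute it.
// It equals the 3D dot product only when at least one fourth coefficient is 0.
inline float dot4(const Vec4& a, const Vec4& b) {
    float s = 0.0f;
    for (int i = 0; i < 4; ++i) s += a.v[i] * b.v[i];
    return s;
}
```

With a = (1,2,3,0) and b = (4,5,6,0), dot4 gives the expected 3D result,
whereas a stray nonzero fourth coefficient in both operands would leak into
the sum. That is the "up to the user" part mentioned above.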
Another related thing: we can already do:
typedef Matrix<float,3,1,0,4,1> AlignedVector3;
so if we improve the vectorization to take the
Max{Rows|Cols|Size}AtCompileTime values into account, then all
coefficient-wise ops could easily be vectorized. Likewise, the cross product
could automatically return an "AlignedVector3". The only problems I see are:
1 - the dot/norm stuff (we would need specialized versions of ei_predux)
2 - the evaluation to a temporary if the Max*AtCompileTime values are
not taken into account (I don't remember how it behaves)
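A rough sketch of point 1, again in plain C++ rather than Eigen: PaddedVec3,
add and dot are hypothetical names, and the struct only mimics the idea of
a logical size of 3 stored in 4 slots. Coefficient-wise ops can run over
the full storage (one packet), but a reduction has to stop at the logical
size, which is what a specialized ei_predux would have to do.

```cpp
#include <cstddef>

// Hypothetical illustration type: 3 logical coefficients in 16 bytes.
struct alignas(16) PaddedVec3 {
    static constexpr std::size_t Size = 3;     // logical size
    static constexpr std::size_t MaxSize = 4;  // storage size, one packet
    float data[MaxSize];
};

// Coefficient-wise op: safe to process all MaxSize lanes, padding included,
// because whatever lands in the padding lane is never read as a coefficient.
inline PaddedVec3 add(const PaddedVec3& a, const PaddedVec3& b) {
    PaddedVec3 r{};
    for (std::size_t i = 0; i < PaddedVec3::MaxSize; ++i)
        r.data[i] = a.data[i] + b.data[i];
    return r;
}

// Reduction: must only sum the first Size lanes, so that the content of
// the padding lane cannot change the result.
inline float dot(const PaddedVec3& a, const PaddedVec3& b) {
    float s = 0.0f;
    for (std::size_t i = 0; i < PaddedVec3::Size; ++i)
        s += a.data[i] * b.data[i];
    return s;
}
```

Here dot() stays correct even if the padding lane holds garbage, at the
price of a reduction that cannot simply sum a whole packet.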
gael
> Cheers,
> Benoit
>
>>
>>> Regarding Bullet SDK, actually only some specific parts of the SDK are
>>> vectorized by hand (using directly inline assembly and/or using
>>> directly intrinsics) and there the base classes (matrix, vector) are
>>> not vectorized at all.
>>>
>>> cheers,
>>> Gael.
>>>
>>>
>>>
>>
>>
>>
>> --
>> Rohit Garg
>>
>> http://rpg-314.blogspot.com/
>>
>> Senior Undergraduate
>> Department of Physics
>> Indian Institute of Technology
>> Bombay
>>
>>
>>
>
>
>