Re: [eigen] Vectorized quaternion multiplication.
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Vectorized quaternion multiplication.
- From: Rohit Garg <rpg.314@xxxxxxxxx>
- Date: Sat, 7 Mar 2009 19:52:03 +0530
Nope, it doesn't. But I'll work through these issues soon. I hope to
contribute regularly to the Eigen project.
On Sat, Mar 7, 2009 at 7:43 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> perhaps this will work for you:
>
> svn co svn://websvn.kde.org:443/home/kde/trunk/kdesupport/eigen2
>
> gael
>
> On Sat, Mar 7, 2009 at 3:01 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>> On Sat, Mar 7, 2009 at 7:23 PM, Gael Guennebaud
>> <gael.guennebaud@xxxxxxxxx> wrote:
>>> committed :)
>> Thanks a lot for that. :). The quat add and sub are vectorized now as
>> well, I presume. {Damn the svn block. :( }
>>
>>>
>>> On Sat, Mar 7, 2009 at 1:19 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>>>> I think that the scalar version loses out because the multiplies are
>>>> not pipelined there.
>>>>
>>>> On Sat, Mar 7, 2009 at 5:30 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>>>>> When you put it in, please tell me. :)
>>>>>
>>>>> I posted this for feedback here and you may be interested.
>>>>>
>>>>> http://forum.beyond3d.com/showthread.php?t=52840
>>>>>
>>>>> On Sat, Mar 7, 2009 at 5:16 PM, Gael Guennebaud
>>>>> <gael.guennebaud@xxxxxxxxx> wrote:
>>>>>> hi,
>>>>>>
>>>>>> thanks a lot,
>>>>>>
>>>>>> at first glance I was not sure about the perf, because it needs a
>>>>>> lot of shuffle instructions, which are quite costly, so I benchmarked
>>>>>> it, and on my core2 your version is 1.5 times faster :) Then I changed
>>>>>> the shuffle_ps for the simpler PSHUFD instr. and now it is almost 2x
>>>>>> faster, so really worth it :)
>>>>>>
>>>>>> FYI I only changed vec4f_swizzle like this:
>>>>>>
>>>>>> #define vec4f_swizzle(v,p,q,r,s) (_mm_castsi128_ps(_mm_shuffle_epi32( \
>>>>>>     _mm_castps_si128(v), ((s)<<6|(r)<<4|(q)<<2|(p)))))
>>>>>
>>>>> Now I see. This instruction takes a single source operand, so it is
>>>>> perhaps faster (i.e. higher throughput).
>>>>>
>>>>> --
>>>>> Rohit Garg
>>>>>
>>>>> http://rpg-314.blogspot.com/
>>>>>
>>>>> Senior Undergraduate
>>>>> Department of Physics
>>>>> Indian Institute of Technology
>>>>> Bombay
>>>>>
>>
--
Rohit Garg
http://rpg-314.blogspot.com/
Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay
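
For readers following the thread, here is a minimal sketch of how the quoted vec4f_swizzle macro could be used in an SSE quaternion product. It is not the code that was committed to Eigen. The (x, y, z, w) storage order, the function names quat_mul_scalar and quat_mul_sse, and the vec4f_swizzle_shufps variant (a guess at what the earlier shuffle_ps-based macro looked like) are all assumptions made for illustration.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also provides the SSE float ops)

// Swizzle as quoted in the thread: PSHUFD applied to the float bits.
// Lane 0 of the result is v[p], lane 1 is v[q], lane 2 is v[r], lane 3 is v[s].
#define vec4f_swizzle(v,p,q,r,s) (_mm_castsi128_ps(_mm_shuffle_epi32( \
    _mm_castps_si128(v), ((s)<<6|(r)<<4|(q)<<2|(p)))))

// Assumption: a SHUFPS-based variant with the same lane semantics, shown
// only for comparison. It names the same register as both source operands.
#define vec4f_swizzle_shufps(v,p,q,r,s) \
    (_mm_shuffle_ps((v), (v), ((s)<<6|(r)<<4|(q)<<2|(p))))

// Scalar reference product; quaternions are stored as {x, y, z, w}.
inline void quat_mul_scalar(const float a[4], const float b[4], float out[4]) {
  out[0] = a[3]*b[0] + a[0]*b[3] + a[1]*b[2] - a[2]*b[1];
  out[1] = a[3]*b[1] + a[1]*b[3] + a[2]*b[0] - a[0]*b[2];
  out[2] = a[3]*b[2] + a[2]*b[3] + a[0]*b[1] - a[1]*b[0];
  out[3] = a[3]*b[3] - a[0]*b[0] - a[1]*b[1] - a[2]*b[2];
}

// Hypothetical SSE version: four lane-wise products combined with one
// sign flip on the w lane.
inline void quat_mul_sse(const float a[4], const float b[4], float out[4]) {
  const __m128 qa = _mm_loadu_ps(a);
  const __m128 qb = _mm_loadu_ps(b);
  // Sign-bit mask that negates lane 3 (w) only.
  const __m128 flip_w = _mm_set_ps(-0.0f, 0.0f, 0.0f, 0.0f);

  // [aw*bx, aw*by, aw*bz, aw*bw]
  const __m128 t1 = _mm_mul_ps(vec4f_swizzle(qa, 3, 3, 3, 3), qb);
  // [ax*bw, ay*bw, az*bw, ax*bx]
  const __m128 t2 = _mm_mul_ps(vec4f_swizzle(qa, 0, 1, 2, 0),
                               vec4f_swizzle(qb, 3, 3, 3, 0));
  // [ay*bz, az*bx, ax*by, ay*by]
  const __m128 t3 = _mm_mul_ps(vec4f_swizzle(qa, 1, 2, 0, 1),
                               vec4f_swizzle(qb, 2, 0, 1, 1));
  // [az*by, ax*bz, ay*bx, az*bz]
  const __m128 t4 = _mm_mul_ps(vec4f_swizzle(qa, 2, 0, 1, 2),
                               vec4f_swizzle(qb, 1, 2, 0, 2));

  // t2 + t3 enters with a plus sign in x, y, z and a minus sign in w;
  // t4 is subtracted in every lane.
  const __m128 mid = _mm_xor_ps(_mm_add_ps(t2, t3), flip_w);
  _mm_storeu_ps(out, _mm_sub_ps(_mm_add_ps(t1, mid), t4));
}
```

Both functions take plain float[4] arrays, so the SSE result can be checked against the scalar one directly. Swapping vec4f_swizzle for vec4f_swizzle_shufps inside quat_mul_sse should give the same values, which makes it easy to benchmark the two shuffle instructions against each other on a given CPU.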