Re: [eigen] Vectorized quaternion multiplication.


*To*: eigen@xxxxxxxxxxxxxxxxxxx
*Subject*: Re: [eigen] Vectorized quaternion multiplication.
*From*: Rohit Garg <rpg.314@xxxxxxxxx>
*Date*: Sat, 7 Mar 2009 17:49:59 +0530

I think that the scalar version loses out because the multiplies are not pipelined there.

On Sat, Mar 7, 2009 at 5:30 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
> When you put it in, please tell me. :)
>
> I posted this for feedback here and you may be interested.
>
> http://forum.beyond3d.com/showthread.php?t=52840
>
> On Sat, Mar 7, 2009 at 5:16 PM, Gael Guennebaud
> <gael.guennebaud@xxxxxxxxx> wrote:
>> hi,
>>
>> thanks a lot,
>>
>> at a first glance I was not sure about the perf, because it needs a
>> lot of shuffle instructions which are quite costly, so benched, and on
>> my core2 your version is 1.5 times faster :) Then I changed the
>> shuffle_ps for the simpler PSHUFD instr. and now, it is almost 2x
>> faster, so really worth it :)
>>
>> FYI I only changed vec4f_swizzle like this:
>>
>> #define vec4f_swizzle(v,p,q,r,s) (_mm_castsi128_ps(_mm_shuffle_epi32(
>> _mm_castps_si128(v), \
>> ((s)<<6|(r)<<4|(q)<<2|(p)))))
>
> Now I see. This instruction takes one operand alone so is perhaps
> faster (aka higher throughput).
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>

--
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay
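For readers following the vec4f_swizzle change quoted above, here is a minimal sketch that puts the two shuffle forms side by side. The PSHUFD macro is the one quoted in the thread. The SHUFPS variant is only an assumption about what the earlier shuffle_ps version looked like, and the names vec4f_swizzle_shufps and swizzle_demo are hypothetical, introduced here for illustration. The sketch assumes an x86 target with SSE2 and checks only that both forms pick the same lanes; it says nothing about their relative speed.

```c
#include <emmintrin.h>  /* SSE2: _mm_shuffle_epi32 and the ps<->si128 casts */

/* Assumed SHUFPS form (not shown in the thread): the same register is
 * passed as both source operands, so all four lanes come from v. */
#define vec4f_swizzle_shufps(v, p, q, r, s) \
    (_mm_shuffle_ps((v), (v), ((s) << 6 | (r) << 4 | (q) << 2 | (p))))

/* PSHUFD form as quoted in the thread: a single source operand,
 * reached by casting the float vector to an integer vector and back.
 * The casts generate no instructions. */
#define vec4f_swizzle(v, p, q, r, s) \
    (_mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v), \
                                        ((s) << 6 | (r) << 4 | (q) << 2 | (p)))))

/* Hypothetical helper: reverse the four lanes with each macro.
 * Lane i of the result is the input lane named by the i-th index
 * argument, so (3, 2, 1, 0) turns {a, b, c, d} into {d, c, b, a}. */
static void swizzle_demo(const float in[4],
                         float out_pshufd[4],
                         float out_shufps[4])
{
    __m128 v = _mm_loadu_ps(in);  /* unaligned load, no alignment assumed */
    _mm_storeu_ps(out_pshufd, vec4f_swizzle(v, 3, 2, 1, 0));
    _mm_storeu_ps(out_shufps, vec4f_swizzle_shufps(v, 3, 2, 1, 0));
}
```

Calling swizzle_demo on {1, 2, 3, 4} should fill both output arrays with {4, 3, 2, 1}. The index arguments must be compile-time constants in the range 0 to 3, because both intrinsics encode the selector as an 8-bit immediate.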

