Re: [eigen] Vectorized(SSE) integer multiplication



On Sun, Mar 8, 2009 at 7:02 AM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
> Hi,
>
> This file has my vectorized (SSE2) implementation of multiplication of
> 4 integers. The eigen routine was taken from the packetMath.h file. The
> benchmarks show a small but noticeable difference.
>
> ~/Documents/numerical@rpg> g++ vec4i_mul.cpp -msse3 -O3 -march=native
> ~/Documents/numerical@rpg> ./a.out > /dev/null
> 1236491601 ei mul begins
> 1236491618 ei mul ends
> 1236491618my mul
> 1236491633 end
>
> The macros could be defined better, I admit. They were taken from my
> implementation of vec4i multiplication, which I wrote for my own needs
> earlier. They are the same as for the quaternion routine I sent earlier,
> so please consider unifying them.

Thanks. I actually expected the bitwise ops to be faster than the
shuffles, but that's not the case. I committed your change.
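For readers following along, here is a minimal sketch of the two approaches
being compared. It is not the code from the attached file or from Eigen: the
function names mul4_shuffle, mul4_mask and run_checks are hypothetical, and it
assumes an x86 target with SSE2. Both variants rely on _mm_mul_epu32, which
multiplies only lanes 0 and 2 and yields two 64-bit products, so the odd lanes
need a second multiply and the low halves have to be recombined.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Shuffle-based variant (hypothetical name).
static inline __m128i mul4_shuffle(__m128i a, __m128i b) {
    // Products of lanes 0 and 2, each 64 bits wide.
    __m128i even = _mm_mul_epu32(a, b);
    // Swap neighbouring lanes so lanes 1 and 3 sit in positions 0 and 2.
    __m128i a_odd = _mm_shuffle_epi32(a, _MM_SHUFFLE(2, 3, 0, 1));
    __m128i b_odd = _mm_shuffle_epi32(b, _MM_SHUFFLE(2, 3, 0, 1));
    __m128i odd = _mm_mul_epu32(a_odd, b_odd);
    // Move the low 32 bits of each product into the two lowest lanes.
    __m128i even_lo = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    __m128i odd_lo = _mm_shuffle_epi32(odd, _MM_SHUFFLE(0, 0, 2, 0));
    // Interleave to get [p0, p1, p2, p3].
    return _mm_unpacklo_epi32(even_lo, odd_lo);
}

// Bitwise (mask, shift, or) variant (hypothetical name).
static inline __m128i mul4_mask(__m128i a, __m128i b) {
    const __m128i low_mask = _mm_setr_epi32(-1, 0, -1, 0);
    // [p0, 0, p2, 0]
    __m128i even = _mm_and_si128(_mm_mul_epu32(a, b), low_mask);
    // Byte-shift right by 4 so lanes 1 and 3 are multiplied: [p1, 0, p3, 0]
    __m128i odd = _mm_and_si128(
        _mm_mul_epu32(_mm_srli_si128(a, 4), _mm_srli_si128(b, 4)), low_mask);
    // Shift the odd products back into lanes 1 and 3 and merge.
    return _mm_or_si128(even, _mm_slli_si128(odd, 4));
}

// Returns true if both variants agree with a scalar wrap-around product.
static bool matches_scalar(const int32_t* x, const int32_t* y) {
    __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(x));
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(y));
    int32_t s[4], m[4];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(s), mul4_shuffle(a, b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(m), mul4_mask(a, b));
    for (int i = 0; i < 4; ++i) {
        // Reference computed in unsigned arithmetic to avoid signed overflow.
        uint32_t ref = static_cast<uint32_t>(x[i]) * static_cast<uint32_t>(y[i]);
        if (static_cast<uint32_t>(s[i]) != ref) return false;
        if (static_cast<uint32_t>(m[i]) != ref) return false;
    }
    return true;
}

// Small self-check on one sample input.
static void run_checks() {
    const int32_t x[4] = {1, 2, 3, 4};
    const int32_t y[4] = {5, 6, 7, 8};
    assert(matches_scalar(x, y));
}
```

Which variant is faster depends on the CPU and compiler, so it is worth
timing both on the target machine rather than assuming either result.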

> BTW, the multiplication instruction that you (and I) are using performs
> only unsigned multiplication. Signed multiplication is available as a
> single instruction in SSE4.1, so a small patch could be added for that
> too. The exact intrinsic is _mm_mul_epi32. My CPU doesn't support it,
> so I can't test it.

Actually, here it works for negative integers too. And, like you, my
CPU does not support SSE4.1, so I cannot try it.
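A likely reason it works for negative integers: in two's complement, the low
32 bits of a 32x32 product are the same whether the operands are read as
signed or unsigned. Only the high 32 bits differ, and those are discarded when
the result is packed back into 32-bit lanes. The scalar sketch below
illustrates this; the helper names are hypothetical and it does not use any
SSE intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit product with both operands read as unsigned 32-bit values.
static inline uint64_t wide_unsigned(int32_t a, int32_t b) {
    return static_cast<uint64_t>(static_cast<uint32_t>(a)) *
           static_cast<uint64_t>(static_cast<uint32_t>(b));
}

// 64-bit product with both operands read as signed 32-bit values,
// returned as raw bits.
static inline uint64_t wide_signed(int32_t a, int32_t b) {
    return static_cast<uint64_t>(static_cast<int64_t>(a) *
                                 static_cast<int64_t>(b));
}

// Low and high 32-bit halves of a 64-bit value.
static inline uint32_t low32(uint64_t v) { return static_cast<uint32_t>(v); }
static inline uint32_t high32(uint64_t v) { return static_cast<uint32_t>(v >> 32); }

// Checks one negative example: low halves agree, high halves do not.
static void run_checks() {
    const int32_t a = -3, b = 7;
    assert(low32(wide_unsigned(a, b)) == low32(wide_signed(a, b)));
    assert(low32(wide_signed(a, b)) == static_cast<uint32_t>(-21));
    assert(high32(wide_unsigned(a, b)) != high32(wide_signed(a, b)));
}
```

So a signed variant only matters when the full 64-bit products are needed,
not when each result is truncated to 32 bits.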

> Regards,
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>


