Re: [eigen] Re: sse4 and integer multiplication
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Re: sse4 and integer multiplication
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Tue, 24 Nov 2009 16:12:17 -0500
OK, last email, I promise.
My last benchmark was stupid: by constantly reallocating v, it
prevented v from staying in cache, which made the whole thing memory-bound.
On top of that, it also measured the time spent waiting for malloc.
New benchmark:
#include <Eigen/Dense>

using namespace Eigen;
using namespace std;

EIGEN_DONT_INLINE int foo(VectorXi& v, VectorXi& w)
{
  EIGEN_ASM_COMMENT("begin");
  v += (v.cwise()*v).cwise()*w;
  EIGEN_ASM_COMMENT("end");
  return v(ei_random<int>(0,999));
}

int main()
{
  VectorXi v = VectorXi::Random(1000);
  VectorXi w = VectorXi::Random(1000);
  for(int i = 0; i<1000000; i++) foo(v,w);
}
No vec: 6.797s
SSE4.1: 1.819s
SSE2:   2.024s
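To put the three timings in relative terms, here is a minimal sketch that turns them into speedup ratios. The helper names (speedup, print_speedups) and the constants are made up for illustration; only the three numbers come from the results above.

```cpp
#include <cstdio>

// Speedup of a run relative to a baseline: baseline time / run time.
double speedup(double baseline_seconds, double seconds) {
    return baseline_seconds / seconds;
}

// Timings copied from the results above, in seconds.
const double kNoVec = 6.797;
const double kSse41 = 1.819;
const double kSse2  = 2.024;

// Hypothetical helper: prints the three ratios of interest.
void print_speedups() {
    std::printf("SSE4.1 vs no vec: %.2fx\n", speedup(kNoVec, kSse41));
    std::printf("SSE2   vs no vec: %.2fx\n", speedup(kNoVec, kSse2));
    std::printf("SSE4.1 vs SSE2:   %.2fx\n", speedup(kSse2, kSse41));
}
```

Calling print_speedups() from main gives 3.74x, 3.36x and 1.11x, so on this benchmark the gap between SSE4.1 and SSE2 is small compared to the gain from vectorizing at all.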
So SSE4.1 is faster... but when one has clever guys like Rohit on the
team, SSE2 is good enough ;)
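For context: SSE4.1 has a packed 32-bit integer multiply (_mm_mullo_epi32), while SSE2 does not, so it has to be emulated. This email does not show the SSE2 code being benchmarked; the sketch below is one widely used emulation built on _mm_mul_epu32, and it is an assumption that the benchmarked code looks anything like it. The function names mullo_epi32_sse2 and mul4 are hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: low 32 bits of four 32-bit products, SSE2 only.
// The low 32 bits are the same for signed and unsigned multiplication,
// so the unsigned _mm_mul_epu32 can be used for int32 lanes.
inline __m128i mullo_epi32_sse2(__m128i a, __m128i b) {
    // Multiplies lanes 0 and 2, producing two 64-bit products.
    __m128i even = _mm_mul_epu32(a, b);
    // Shift by 4 bytes so lanes 1 and 3 land in positions 0 and 2.
    __m128i odd = _mm_mul_epu32(_mm_srli_si128(a, 4), _mm_srli_si128(b, 4));
    // Move the low 32 bits of each product into the two low lanes.
    even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));
    // Interleave back into the original lane order: p0, p1, p2, p3.
    return _mm_unpacklo_epi32(even, odd);
}

// Hypothetical wrapper: out[i] = a[i] * b[i] for i in 0..3.
void mul4(const int32_t* a, const int32_t* b, int32_t* out) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), mullo_epi32_sse2(va, vb));
}
```

The emulation costs two multiplies plus several shuffles per four products, versus a single instruction on SSE4.1, which would be consistent with SSE2 coming out somewhat slower but still far ahead of the non-vectorized build.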
Benoit
2009/11/24 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> 2009/11/24 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> Non-vectorized: 1.91 s
>> SSE4.1: 2.41 s
>
> oops, i meant the reverse:
>
> Non-vectorized: 2.41 s
> SSE4.1: 1.91 s
>
>>
>> so this time it's 26% faster...
>
> so yes this time sse4.1 is faster than nothing....
>
>> Cheers to Intel's marketing dept.
>
> ... but not as fast as Intel would have you believe, that's what i meant.
>
>
>> Benoit
>>
>