Re: [eigen] Re: sse4 and integer multiplication
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Re: sse4 and integer multiplication
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Tue, 24 Nov 2009 16:01:04 -0500
2009/11/24 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> in the SSE4 version you have 2 unnecessary moves, one useless load, and one
> useless store. That's the main reason. Now why GCC does not optimize them
> away, well I've no clue...
OK, I tried a different benchmark. This time there's an addition, and
it can't keep everything in registers:
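#include <Eigen/Dense>
using namespace Eigen;
using namespace std;

// Not inlined, so the marked region is easy to find in the assembly.
EIGEN_DONT_INLINE int foo(VectorXi& w)
{
  VectorXi v = VectorXi::Random(1000);
  EIGEN_ASM_COMMENT("begin");
  // Coefficient-wise: v[i] += v[i] * v[i] * w[i]
  v += (v.cwise()*v).cwise()*w;
  EIGEN_ASM_COMMENT("end");
  // Return a random coefficient so the computation can't be discarded.
  return v(ei_random<int>(0,999));
}

int main()
{
  VectorXi w = VectorXi::Random(1000);
  for(int i = 0; i<100000; i++) foo(w);
}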
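For anyone reading along without Eigen at hand, the statement between the two asm comments amounts to roughly the scalar loop below. This is only a sketch: the name scalar_update and the use of std::vector are assumptions for illustration, not Eigen code, and the code Eigen actually generates differs. The arithmetic is done in unsigned 32-bit integers so that overflow wraps around, keeping the low 32 bits of each product as pmulld does per lane.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for `v += (v.cwise()*v).cwise()*w;`.
// Computes v[i] += v[i] * v[i] * w[i] for each index present in both vectors.
// Unsigned arithmetic makes overflow wrap modulo 2^32 instead of being
// undefined behaviour.
void scalar_update(std::vector<std::int32_t>& v,
                   const std::vector<std::int32_t>& w)
{
    for (std::size_t i = 0; i < v.size() && i < w.size(); ++i) {
        const std::uint32_t vi = static_cast<std::uint32_t>(v[i]);
        const std::uint32_t wi = static_cast<std::uint32_t>(w[i]);
        v[i] = static_cast<std::int32_t>(vi + vi * vi * wi);
    }
}
```

Each iteration needs two integer multiplies and one add per coefficient, plus a load of v[i] and w[i] and a store back to v[i], which is why this version of the benchmark cannot stay entirely in registers.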
Non-vectorized: 1.91 s
SSE4.1: 2.41 s
so this time the non-vectorized build is about 26% faster (2.41 / 1.91 ≈ 1.26)...
Cheers to Intel's marketing dept.
Benoit