- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] two things
- From: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
- Date: Thu, 26 Jun 2008 18:55:22 +0200
On Thu, Jun 26, 2008 at 6:12 PM, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
> Here is the result after revision 824739 introducing packet(int):
>
> 2.30471s 0.646553 GFlops
> 400 x 400 2.51917s 0.59151 GFlops
> 320 x 500 3.00975s 0.495097 GFlops
> 256 x 625 2.92007s 0.510302 GFlops
> 250 x 640 2.9007s 0.513709 GFlops
> 200 x 800 2.9415s 0.506583 GFlops
> 160 x 1000 2.90815s 0.512393 GFlops
> 128 x 1250 2.73835s 0.544166 GFlops
> 125 x 1280 2.889s 0.515789 GFlops
> 100 x 1600 2.92752s 0.509003 GFlops
> 80 x 2000 2.86383s 0.520322 GFlops
> 64 x 2500 2.9053s 0.512896 GFlops
> 50 x 3200 2.90992s 0.512081 GFlops
> 40 x 4000 2.90439s 0.513056 GFlops
> 32 x 5000 2.80648s 0.530955 GFlops
> 25 x 6400 2.85519s 0.521896 GFlops
> 20 x 8000 2.79833s 0.532503 GFlops
> 16 x 10000 2.8511s 0.522646 GFlops
> 10 x 16000 2.81542s 0.52927 GFlops
> 8 x 20000 2.80733s 0.530795 GFlops
> 5 x 32000 2.76623s 0.53868 GFlops
> 4 x 40000 2.85234s 0.522418 GFlops
> 2.80785s 0.530697 GFlops
>
> So, as expected, this problem is solved.
Great!
>> hand coded vector with loop peeling:
>> 1.0101 sec 1.47521 GFlops
>>
>> VectorXf(400*400):
>> 1.50368 0.990978 GFlops
>
> So peeling loops would be well worth it. Now that we have a real linear path
> (and could also write a linear path in the non-vectorized case) this will be
> much easier and more efficient.
Yes, exactly. But I'm still puzzled by these results: on a 2 GHz
Core 2 we could expect a peak of 8 GFlops (4 floats per SSE add, one
add per cycle), and we are far, far away. I've also tried c = a + b;
which is even slower. On the other hand, with a += a; I could reach
~4.5 GFlops. For comparison, our optimized matrix product on 1024x1024
matrices achieves ~9 GFlops! Yes, 9! This is because the CPU can do an
"add" and a "mul" at the same time... I guess the trick would be to do
some prefetching, but I have not managed to get any improvement so
far...
Gael.