Re: [eigen] a record for Eigen: 250 GFLOPS !!


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] a record for Eigen: 250 GFLOPS !!
From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
Date: Wed, 23 Jun 2010 07:29:03 -0400

2010/6/23 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Hi,
>
> this morning I played with a 48-core AMD SMP server (8 processors
> AMD-Opteron-8439-SE, 6 cores each @ 2.8 GHz) and a dual-processor
> Intel X5570 machine @ 2.93 GHz (4 multithreaded cores each => a total
> of 8 cores, 16 threads), and here are the results for a product of
> 2048x2048 matrices of floats:

Very interesting!

>
> ** Intel **
>
> 16 threads (multi-threading)
> eigen real        0.158446s     108.427 GFLOPS  (2.22212s)
> mt speed up x5.55349 => 34.7093%
>
> 8 threads
> eigen real        0.125598s     136.785 GFLOPS  (1.2581s)
> mt speed up x7.0835 => 88.5438%
>
> 4 threads
> eigen real        0.228977s     75.0287 GFLOPS  (2.37034s)
> mt speed up x3.88544 => 97.136%
>
> 2 threads
> eigen real        0.449604s     38.2111 GFLOPS  (4.72754s)
> mt speed up x1.98317 => 99.1583%
>
> 1 thread
> eigen mono cpu    0.891639s     19.2677 GFLOPS  (8.9178s)
>
>
> a speed up factor of ~7 for 8 cores is a very nice scaling IMO.
>
>
> ** AMD **
>
>
> 1 thread
> eigen mono cpu    1.54084s      11.1496 GFLOPS  (15.4136s)
>
> 2 threads
> eigen real        0.817967s     21.0031 GFLOPS  (8.18607s)
> mt speed up x1.88375 => 94.1874%
>
> 4 threads
> eigen real        0.41879s      41.0226 GFLOPS  (4.1911s)
> mt speed up x3.73174 => 93.2936%
>
> 8 threads
> eigen real        0.214083s     80.2485 GFLOPS  (2.15697s)
> mt speed up x7.49282 => 93.6602%
>
> 16 threads
> eigen real        0.115521s     148.716 GFLOPS  (1.26385s)
> mt speed up x13.4568 => 84.1048%
>
> 24 threads
> eigen real        0.168208s     102.135 GFLOPS  (1.75357s)
> mt speed up x9.55177 => 39.7991%
>
> 32 threads
> eigen real        0.0686023s    250.427 GFLOPS  (1.19708s)
> mt speed up x23.001 => 71.8781%
>
> 42 threads
> eigen real        0.0799503s    214.882 GFLOPS  (0.938163s)
> mt speed up x19.9015 => 47.3844%
>
> 48 threads
> eigen real        0.143299s     119.888 GFLOPS  (1.62653s)
> mt speed up x11.2097 => 23.3536%
>
>
> We can see that AMD's SSE implementation runs at roughly half the
> speed of Intel's.

Is it because MULPS and ADDPS don't pipeline as they do with Intel CPUs?
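
One way to frame that question is to convert the single-thread figures into flops per clock cycle, which takes the clock frequency out of the comparison. Below is a rough Python sketch using only the numbers quoted above. The helper name is made up for illustration, and it assumes each core ran at its nominal clock during the benchmark, which may not be true (e.g. if frequency scaling or a turbo mode was active).

```python
def flops_per_cycle(gflops, clock_ghz):
    """Sustained floating-point operations per clock cycle on one core.

    Assumes the core ran at its nominal clock (an assumption, not
    something the benchmark output confirms).
    """
    return gflops / clock_ghz


# Single-thread results quoted in this thread.
intel = flops_per_cycle(19.2677, 2.93)  # Intel X5570 @ 2.93 GHz
amd = flops_per_cycle(11.1496, 2.8)     # AMD Opteron 8439 SE @ 2.8 GHz

print(f"Intel: {intel:.2f} flops/cycle")
print(f"AMD:   {amd:.2f} flops/cycle")
print(f"Intel/AMD ratio per cycle: {intel / amd:.2f}")
```

Under that assumption the Intel core sustains roughly 6.6 flops per cycle against roughly 4.0 for the AMD core, so most of the gap is per-cycle throughput rather than clock speed.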

> This architecture seems tricky to control: peak performance is
> reached with 32 threads, with a speed-up factor of x23, which is not
> bad. With more threads, performance drops significantly. There is
> also a slowdown with 24 threads.

Also, this is "just" 2048x2048, so there is only enough work to keep a
limited number of threads busy... as is also seen in the fact that the
fastest runs completed in less than a tenth of a second. I wonder if a
larger matrix product would scale better to large numbers of threads.
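
To put numbers on the amount of work: the GFLOPS column in the quoted results is consistent with counting 2*n^3 floating-point operations for an n x n product (one multiply and one add per inner step) and dividing by the "real" time. The percentage after each speed-up is the speed-up divided by the thread count. Here is a small Python sketch of that arithmetic. The function names are hypothetical, and the 2*n^3 count is inferred from the reported figures rather than taken from the benchmark source.

```python
def gemm_flops(n):
    """Operation count for an n x n matrix product: one multiply and
    one add per inner-loop step, i.e. 2 * n^3 (inferred convention)."""
    return 2 * n**3


def gflops(n, seconds):
    """Throughput in GFLOPS for an n x n product done in `seconds`."""
    return gemm_flops(n) / seconds / 1e9


def efficiency(speedup, threads):
    """Parallel efficiency: measured speed-up over thread count."""
    return speedup / threads


# AMD, 32 threads: reported as 250.427 GFLOPS, x23.001 => 71.8781%
print(f"{gflops(2048, 0.0686023):.3f} GFLOPS")
print(f"{efficiency(23.001, 32) * 100:.4f} % efficiency")

# Work grows as n^3, so doubling n gives each thread 8x more to do.
for n in (2048, 4096, 8192):
    print(n, gemm_flops(n) / 1e9, "Gflop")
```

At n = 2048 the whole product is about 17 Gflop, i.e. only around 0.36 Gflop per thread at 48 threads, so fixed threading overheads take up a larger share of the run time than they would for a bigger product.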

Benoit

>
> that's all folks.
>
> gael
>
>
>

