Re: [eigen] Eigen benchmark using atlas (was: Cannot compile benchmark)
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Eigen benchmark using atlas (was: Cannot compile benchmark)
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Sun, 8 Jul 2012 13:38:59 +0200
- Cc: "Dr. Michael Lehn" <michael.lehn@xxxxxxxxxx>
Hi,
I see you have a 2.88 GHz CPU, so the peak performance for sgemm is
about 23 GFLOPS, and Eigen should really be close to this limit. Do you
have a 64-bit system? Make sure to compile Eigen in release mode:
cmake -DCMAKE_BUILD_TYPE=Release .
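For reference, the 23 GFLOPS figure follows from clock rate times flops per cycle. The sketch below is only an illustration, not part of Eigen: the helper names are hypothetical, and the value of 8 single-precision flops per cycle (one 4-wide SSE add plus one 4-wide SSE multiply per cycle, on a single core) is an assumption inferred from the numbers in this message.

```cpp
// Theoretical single-core peak in GFLOPS:
//   clock rate (GHz) * floating-point operations completed per cycle.
// ASSUMPTION: flops_per_cycle = 8 for single precision, i.e. one 4-wide
// SSE add and one 4-wide SSE multiply issued every cycle.
constexpr double peak_gflops(double clock_ghz, double flops_per_cycle) {
    return clock_ghz * flops_per_cycle;
}

// Fraction of the theoretical peak reached by a measured rate
// (both arguments in GFLOPS).
constexpr double fraction_of_peak(double measured_gflops, double peak) {
    return measured_gflops / peak;
}
```

With the 1000x1000 rates from the report quoted below (16068.2 and 22301.7 MFLOPS), fraction_of_peak gives roughly 0.70 for Eigen and 0.97 for ATLAS under that assumption.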
If performance is still this slow, it may be because Eigen requires
specific tuning for AMD processors. Unfortunately, I don't have any
such CPU at hand.
Also, the ATLAS benchmark only tests matrix sizes that are multiples of 4.
Actually, I found that ATLAS performance drops with odd sizes.
For instance, I just compiled the latest 3.9.84 version of ATLAS (the
benchmark was made with 3.8.3), and on my Intel CPU, I get:
m=n=k=1116
Eigen: 23.13 GFLOPS
Atlas: 23.167 GFLOPS
m=n=k=1115
Eigen: 23.5 GFLOPS
Atlas: 19.15 GFLOPS
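To make such numbers comparable, the GFLOPS rate of a GEMM is usually derived from the conventional count of 2*m*n*k floating-point operations divided by the wall-clock time. A minimal sketch under that convention follows; the function names are hypothetical and not part of Eigen or ATLAS:

```cpp
#include <cstdint>

// Conventional operation count for a GEMM with A (m x k) and B (k x n):
// one multiply and one add per inner-product term. Lower-order terms
// (scaling by alpha and beta) are ignored.
constexpr std::uint64_t gemm_flops(std::uint64_t m, std::uint64_t n, std::uint64_t k) {
    return 2 * m * n * k;
}

// Rate in GFLOPS given the elapsed wall-clock time in seconds.
constexpr double gemm_gflops(std::uint64_t m, std::uint64_t n, std::uint64_t k,
                             double seconds) {
    return static_cast<double>(gemm_flops(m, n, k)) / (seconds * 1e9);
}

// Relative loss when going from rate `fast` to rate `slow`;
// 0.25 means the slower case loses 25% of the faster rate.
constexpr double relative_drop(double fast, double slow) {
    return (fast - slow) / fast;
}
```

Applied to the ATLAS figures above, relative_drop(23.167, 19.15) comes out at about 0.17, i.e. a loss of roughly 17% when going from size 1116 to 1115.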
gael.
On Sat, Jul 7, 2012 at 7:15 PM, Ilja Honkonen <ilja.honkonen@xxxxxxxxxxx> wrote:
> Hello
>
> 10.06.2012 16:31, Dr. Michael Lehn wrote:
>>
>> Some time ago I sent an email
>> http://sourceforge.net/mailarchive/message.php?msg_id=28711667
>> to the ATLAS and GotoBLAS/OpenBLAS lists regarding the Eigen benchmark
>> results. I am kind of surprised that this topic did not lead to a longer
>> discussion. The main issue seems to be that the benchmarks are not taken
>> seriously. But what Clint replied sounds like a really good idea to me:
>> "... then I would say the thing to do is to install them and ATLAS, and
>> use ATLAS's benchmarking tools to compare the two on as many platforms as
>> interest you. As long as they provide a standard BLAS and LAPACK API, you
>> can tell ATLAS to build a full timing report comparing the two
>> implementations directly."
>
>
> I just finished a comparison like that:
> https://sourceforge.net/tracker/index.php?func=detail&aid=3540928&group_id=23725&atid=379483
>
> The attached (to that support request) installation instructions install
> gcc-4.7.1 into $HOME, use that to compile eigen-3.1.0 blas and test it with
> atlas-3.9.84 testers.
>
> The results are quite different from what the Eigen benchmark page shows
> (http://eigen.tuxfamily.org/index.php?title=Benchmark). For example, the GEMM
> test in xsl3blastst.out shows that with matrices over 300x300 ATLAS is 30 to
> 40% faster:
>
> -------------------------------------- GEMM ---------------------------------------
> TST#     M     N     K  ALPHA   LDA   LDB   BETA   LDC   TIME    MFLOP  SpUp   TEST
> ====  ====  ====  ====  =====  ====  ====  =====  ====  =====  =======  ====  =====
>    0   100   100   100    1.0  1000  1000    1.0  1000   0.00  14821.6  1.00  -----
>    0   100   100   100    1.0  1000  1000    1.0  1000   0.00  11683.4  0.79   PASS
> ...
>    9  1000  1000  1000    1.0  1000  1000    1.0  1000   0.12  16068.2  1.00  -----
>    9  1000  1000  1000    1.0  1000  1000    1.0  1000   0.09  22301.7  1.39   PASS
>
> The first row of each test number is Eigen, the second is ATLAS. In the
> 1000x1000 case I get about the same flops for Eigen as on the benchmark page
> (16k, assuming matrix-matrix product is the same thing), but ATLAS is much
> faster with 22k.
>
> Section 6 of the following file has more details on the output format:
> http://projects.scipy.org/numpy/browser/vendor/src/atlas-3.8.3/doc/TestTime.txt?rev=7799
>
> Ilja
>
>