[eigen] Eigen benchmark using atlas (was: Cannot compile benchmark)
Hello
On 10.06.2012 16:31, Dr. Michael Lehn wrote:
Some time ago I sent an email
http://sourceforge.net/mailarchive/message.php?msg_id=28711667
to the ATLAS and GotoBLAS/OpenBLAS lists regarding the Eigen benchmark
results. I am kind of surprised that this topic did not lead to a longer
discussion. The main issue seems to be that the benchmarks are not taken
seriously. But what Clint replied sounds like a really good idea to me:
"... then I would say the thing to do is to install them and ATLAS, and
use ATLAS's benchmarking tools to compare the two on as many platforms as
interest you. As long as they provide a standard BLAS and LAPACK API, you
can tell ATLAS to build a full timing report comparing the two
implementations directly."
I just finished a comparison like that:
https://sourceforge.net/tracker/index.php?func=detail&aid=3540928&group_id=23725&atid=379483
The installation instructions attached to that support request install
gcc-4.7.1 into $HOME, use it to compile the eigen-3.1.0 BLAS, and test it
with the atlas-3.9.84 testers.
The results are quite different from what the Eigen benchmark page shows
(http://eigen.tuxfamily.org/index.php?title=Benchmark). For example, the
GEMM test in xsl3blastst.out shows that for matrices over 300x300 ATLAS
is 30 to 40% faster:
----------------------------- GEMM ----------------------------------
TST#    M    N    K ALPHA  LDA  LDB  BETA  LDC  TIME   MFLOP SpUp  TEST
==== ==== ==== ==== ===== ==== ==== ===== ==== ===== ======= ==== =====
   0  100  100  100   1.0 1000 1000   1.0 1000  0.00 14821.6 1.00 -----
   0  100  100  100   1.0 1000 1000   1.0 1000  0.00 11683.4 0.79  PASS
....
   9 1000 1000 1000   1.0 1000 1000   1.0 1000  0.12 16068.2 1.00 -----
   9 1000 1000 1000   1.0 1000 1000   1.0 1000  0.09 22301.7 1.39  PASS
For each test number, the first row is Eigen and the second row is ATLAS.
In the 1000x1000 case I get about the same MFLOPS for Eigen as on the
benchmark page (16k, assuming the matrix-matrix product there is the same
thing), but ATLAS is much faster at 22k.
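
To make the reading of the table concrete, here is a minimal Python sketch that recomputes the SpUp column from the MFLOP figures quoted above. The helper names are made up for illustration. The 2*M*N*K flop count is the usual convention for a real GEMM and is an assumption here, not something taken from the ATLAS tester sources. Since TIME is printed with only two decimals, the recomputed MFLOPS can only roughly match the reported value.

```python
def gemm_mflops(m, n, k, seconds):
    """MFLOPS for an (m x k) * (k x n) product, assuming 2*m*n*k flops."""
    return 2.0 * m * n * k / seconds / 1e6


def speedup(reference_mflops, test_mflops):
    """Ratio of the second row's MFLOPS to the first row's MFLOPS."""
    return test_mflops / reference_mflops


# MFLOP figures copied from the GEMM excerpt (first row Eigen, second ATLAS).
eigen_100, atlas_100 = 14821.6, 11683.4
eigen_1000, atlas_1000 = 16068.2, 22301.7

print(round(speedup(eigen_100, atlas_100), 2))    # 0.79, Eigen ahead at 100x100
print(round(speedup(eigen_1000, atlas_1000), 2))  # 1.39, ATLAS ahead at 1000x1000

# Rough cross-check of the ATLAS 1000x1000 row from its rounded TIME of 0.09 s.
print(round(gemm_mflops(1000, 1000, 1000, 0.09), 1))  # 22222.2
```

Both ratios agree with the SpUp values in the excerpt, which suggests that SpUp is simply the second row's MFLOP divided by the first row's.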
Section 6 of the following file has more details on the output format:
http://projects.scipy.org/numpy/browser/vendor/src/atlas-3.8.3/doc/TestTime.txt?rev=7799
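
For anyone who wants to post-process such a report, the following Python sketch pairs up the rows by test number. It assumes the 13-column layout shown in the excerpt above; real xsl3blastst output may contain additional columns, so the field positions are an assumption and the function names are hypothetical.

```python
SAMPLE = """\
TST# M N K ALPHA LDA LDB BETA LDC TIME MFLOP SpUp TEST
0 100 100 100 1.0 1000 1000 1.0 1000 0.00 14821.6 1.00 -----
0 100 100 100 1.0 1000 1000 1.0 1000 0.00 11683.4 0.79 PASS
....
9 1000 1000 1000 1.0 1000 1000 1.0 1000 0.12 16068.2 1.00 -----
9 1000 1000 1000 1.0 1000 1000 1.0 1000 0.09 22301.7 1.39 PASS
"""


def parse_gemm_rows(text):
    """Return one dict per data row; header, separator and '....' are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: 13 whitespace-separated fields, test number first.
        if len(fields) != 13 or not fields[0].isdigit():
            continue
        rows.append({
            "tst": int(fields[0]),
            "m": int(fields[1]),
            "n": int(fields[2]),
            "k": int(fields[3]),
            "time": float(fields[9]),
            "mflop": float(fields[10]),
            "spup": float(fields[11]),
            "status": fields[12],
        })
    return rows


def pair_rows(rows):
    """Group rows by test number into (first, second) tuples."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["tst"], []).append(row)
    return {tst: (g[0], g[1]) for tst, g in grouped.items() if len(g) == 2}


for tst, (first, second) in sorted(pair_rows(parse_gemm_rows(SAMPLE)).items()):
    print(tst, first["m"], first["mflop"], second["mflop"], second["spup"])
```

In the comparison described in this mail, the first element of each pair would be the Eigen row and the second the ATLAS row.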
Ilja