Re: [eigen] Eigen benchmark using atlas



I see you have a 2.88 GHz CPU, so the peak performance for sgemm is
about 23 GFLOPS, and Eigen should really be close to this limit. Do you
have a 64-bit system? Make sure to compile Eigen in release mode:
cmake -DCMAKE_BUILD_TYPE=Release .

Linux reports the maximum as 3000 MHz, and the system is 64-bit (AMD Phenom II X6 1075T).

If you still get such slow performance, then maybe it is because Eigen
requires specific tuning for AMD processors. Unfortunately, I don't
have any such CPU around me.
m=n=k=1116
Eigen: 23.13 GFLOPS
Atlas: 23.167 GFLOPS
m=n=k=1115
Eigen: 23.5 GFLOPS
Atlas: 19.15 GFLOPS

I rebuilt Eigen with ...=Release, and for sizes 1111-1120 I get:
----------------------------- GEMM -------------------------------
   M    N    K ALPHA  LDA  LDB  BETA  LDC  TIME MFLOP SpUp  TEST
==== ==== ==== ===== ==== ==== ===== ==== ===== ===== ==== =====
1111 1111 1111   1.0 1111 1111   1.0 1111  0.18 15273.2 1.00 -----
1111 1111 1111   1.0 1111 1111   1.0 1111  0.17 16535.9 1.08 PASS
1112 1112 1112   1.0 1112 1112   1.0 1112  0.18 15690.5 1.00 -----
1112 1112 1112   1.0 1112 1112   1.0 1112  0.13 21819.7 1.39 PASS
1113 1113 1113   1.0 1113 1113   1.0 1113  0.18 15529.9 1.00 -----
1113 1113 1113   1.0 1113 1113   1.0 1113  0.16 16925.3 1.09 PASS
1114 1114 1114   1.0 1114 1114   1.0 1114  0.18 15416.4 1.00 -----
1114 1114 1114   1.0 1114 1114   1.0 1114  0.15 18921.6 1.23 PASS
1115 1115 1115   1.0 1115 1115   1.0 1115  0.18 15407.0 1.00 -----
1115 1115 1115   1.0 1115 1115   1.0 1115  0.17 16626.9 1.08 PASS
1116 1116 1116   1.0 1116 1116   1.0 1116  0.18 15645.1 1.00 -----
1116 1116 1116   1.0 1116 1116   1.0 1116  0.13 21574.5 1.38 PASS
1117 1117 1117   1.0 1117 1117   1.0 1117  0.18 15643.9 1.00 -----
1117 1117 1117   1.0 1117 1117   1.0 1117  0.17 16471.1 1.05 PASS
1118 1118 1118   1.0 1118 1118   1.0 1118  0.18 15601.0 1.00 -----
1118 1118 1118   1.0 1118 1118   1.0 1118  0.15 18806.1 1.21 PASS
1119 1119 1119   1.0 1119 1119   1.0 1119  0.18 15484.5 1.00 -----
1119 1119 1119   1.0 1119 1119   1.0 1119  0.17 16207.1 1.05 PASS
1120 1120 1120   1.0 1120 1120   1.0 1120  0.18 15941.8 1.00 -----
1120 1120 1120   1.0 1120 1120   1.0 1120  0.12 22770.7 1.43 PASS

Eigen (the first row of each pair) stays close to 15.5k MFLOPS, while ATLAS (the second row) ranges from about 16.2k to 22.8k MFLOPS. This doesn't change much for large sizes, where ATLAS is usually about 46% faster:

------------------------ GEMM ----------------------------------
   M    N    K ALPHA  LDA  LDB  BETA  LDC  TIME MFLOP SpUp  TEST
==== ==== ==== ===== ==== ==== ===== ==== ===== ===== ==== =====
1000 1000 1000   1.0 1000 1000   1.0 1000  0.12 16370.2 1.00 -----
1000 1000 1000   1.0 1000 1000   1.0 1000  0.09 22272.8 1.36 PASS
2000 2000 2000   1.0 2000 2000   1.0 2000  1.01 15918.3 1.00 -----
2000 2000 2000   1.0 2000 2000   1.0 2000  0.69 23089.3 1.45 PASS
3000 3000 3000   1.0 3000 3000   1.0 3000  3.42 15788.6 1.00 -----
3000 3000 3000   1.0 3000 3000   1.0 3000  2.33 23128.0 1.46 PASS
4000 4000 4000   1.0 4000 4000   1.0 4000  8.07 15861.7 1.00 -----
4000 4000 4000   1.0 4000 4000   1.0 4000  5.50 23266.5 1.47 PASS
5000 5000 5000   1.0 5000 5000   1.0 5000 15.70 15928.1 1.00 -----
5000 5000 5000   1.0 5000 5000   1.0 5000 10.74 23269.0 1.46 PASS
6000 6000 6000   1.0 6000 6000   1.0 6000 27.06 15963.7 1.00 -----
6000 6000 6000   1.0 6000 6000   1.0 6000 18.51 23336.2 1.46 PASS
7000 7000 7000   1.0 7000 7000   1.0 7000 43.06 15930.2 1.00 -----
7000 7000 7000   1.0 7000 7000   1.0 7000 29.31 23401.6 1.47 PASS
8000 8000 8000   1.0 8000 8000   1.0 8000 63.97 16007.7 1.00 -----
8000 8000 8000   1.0 8000 8000   1.0 8000 43.79 23381.7 1.46 PASS
9000 9000 9000   1.0 9000 9000   1.0 9000 91.77 15887.7 1.00 -----
9000 9000 9000   1.0 9000 9000   1.0 9000 62.44 23351.7 1.47 PASS

Ilja


