Re: [eigen] Eigen 2 to Eigen 3 performance regressions with mapped matrices



Hi Gael,
The numbers I sent out in my follow-up email do indeed use 1 instead of Dynamic.

You are right about the lazyProduct. It completely changes the results.

GEMV
       1     2     3     4     5     6     7     8     9    10
    ------------------------------------------------------------
  1|  0.80  1.00  0.86  0.86  1.00  1.00  0.90  0.90  1.00  1.00
  2|  0.83  1.00  0.89  0.91  0.92  0.93  0.94  0.94  0.95  1.00
  3|  0.86  1.00  1.00  0.92  1.00  0.94  1.06  1.00  1.00  1.00
  4|  0.88  0.91  0.92  0.94  0.95  0.95  0.92  0.96  1.00  0.97
  5|  0.88  1.00  1.00  0.94  0.95  0.95  1.00  1.00  1.03  1.03
  6|  0.89  0.93  0.94  0.95  0.92  0.96  0.97  1.00  1.00  1.05
  7|  1.00  0.93  0.94  0.95  1.00  1.00  1.00  0.97  0.98  1.00
  8|  0.91  0.94  0.95  0.96  0.97  0.94  1.00  0.98  0.92  1.00
  9|  0.91  1.00  1.00  1.00  0.97  0.97  0.93  0.92  0.98  1.08
 10|  1.00  0.95  1.00  1.00  0.97  0.98  0.85  1.06  1.04  1.07


GEMM
       1     2     3     4     5     6     7     8     9    10
    ------------------------------------------------------------
  1|  0.60  0.64  0.73  0.75  0.83  0.77  0.79  0.86  0.87  0.88
  2|  0.58  0.69  0.75  0.81  0.82  0.86  0.87  0.88  0.89  0.90
  3|  0.62  0.72  0.81  0.85  0.88  0.89  0.95  0.97  0.96  0.96
  4|  0.60  0.75  0.83  0.87  0.90  0.93  0.95  0.97  0.97  0.97
  5|  0.68  0.82  0.89  0.93  0.95  0.98  0.99  1.01  0.93  1.12
  6|  0.67  0.81  0.88  0.91  0.95  0.97  0.99  1.02  1.01  1.00
  7|  0.73  0.88  0.93  0.96  0.97  1.01  1.03  1.03  1.03  1.17
  8|  0.72  0.87  0.93  0.95  0.99  1.00  1.02  1.05  1.04  1.04
  9|  0.78  0.91  0.94  0.98  1.00  0.96  1.06  0.98  1.05  1.16
 10|  0.78  0.89  0.94  0.98  0.99  0.98  1.05  1.00  1.05  1.04

In our other benchmarks, we are now within a few percent of Eigen 2's performance, which is good enough for us. We work with block sparse matrices whose block dimensions are typically in the 2-10 range, so Eigen's performance on small matrices is of great interest to us.

Thanks,
Sameer


On Wed, Jan 11, 2012 at 11:42 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
Hi,

Well, first, you should really use 1 instead of Dynamic for the vectors
so that gemv-like operations are called (instead of gemm-like ones).

Then, the main difference from Eigen 2 is that we no longer check
the sizes at runtime to fall back to a naive product implementation if
the objects are too small. Again, you can still enforce the naive
product with .lazyProduct if you know that's best for you.

That said, I still plan to add such runtime tests to pick the right
algorithm. I think there is still room for designing even better
product algorithms for such small matrices and vectors. However, I
observed that the performance of a "naive" product algorithm depends a
lot on the architecture and compiler for small objects, so choosing
the thresholds is rather difficult.

I'll add an entry in our bug tracker.

gael

On Wed, Jan 11, 2012 at 4:58 AM, Keir Mierle <mierle@xxxxxxxxx> wrote:
> I've attached a microbenchmark, similar in spirit to what we are doing
> with Eigen, that illustrates the slowdown from Eigen 2 to Eigen 3. In
> particular, the benchmark does y += A*x, for A, x, y mapped, unaligned,
> dynamically sized but small matrices. It could be that I have not chosen
> appropriate compiler flags. I am seeing performance 2x to 3x worse. Take a
> look at the header comments in the attached benchmark for more numbers.
>
> Keir


