[eigen] Blas performance on mapped matrices
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: [eigen] Blas performance on mapped matrices
- From: Sameer Agarwal <sameeragarwal@xxxxxxxxxx>
- Date: Sun, 8 Jan 2012 23:47:14 -0800
- Cc: Keir Mierle <keir@xxxxxxxxxx>
We are in the process of a significant code migration from eigen2 to
eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
matrices and operates on them. The primary operation is of the form
A.block(r, c, size1, size2) -= B * C;
A is a mapped matrix.
C is a mapped matrix.
B is an actual Eigen matrix.
All matrices are RowMajor. For the example being considered, size1 =
size2 = 9. B is 9x3, and C is 3x9.
C and B are statically sized.
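For reference, here is a minimal sketch of what that expression computes, written as plain loops over row-major memory. It is not how Eigen 2 or Eigen 3 implement the product. The helper name block_minus_product and its parameters are hypothetical and exist only for illustration. The dimensions of B and C are template parameters, mirroring the fact that they are statically sized, while A's dimensions and the block offset are runtime values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of Eigen).
// Computes A.block(r, c, M, N) -= B * C on raw row-major storage, where
//   a  points to an a_rows x a_cols row-major matrix (the mapped memory),
//   b  points to an M x K row-major matrix,
//   cm points to a K x N row-major matrix.
template <std::size_t M, std::size_t K, std::size_t N>
void block_minus_product(double* a, std::size_t a_rows, std::size_t a_cols,
                         std::size_t r, std::size_t c,
                         const double* b, const double* cm) {
  // The M x N block must lie entirely inside A.
  assert(r + M <= a_rows && c + N <= a_cols);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t k = 0; k < K; ++k) {
      const double bik = b[i * K + k];
      // The innermost loop walks one row of C and one row of the block,
      // both contiguous in row-major storage.
      for (std::size_t j = 0; j < N; ++j) {
        a[(r + i) * a_cols + (c + j)] -= bik * cm[k * N + j];
      }
    }
  }
}
```

With the sizes above, the call would be block_minus_product<9, 3, 9>(a, rows, cols, r, c, b, cm), which performs 9 * 3 * 9 = 243 multiply-adds per block update.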
Moving from eigen2 to eigen3 has resulted in a 30% performance
regression. Has something changed significantly in the way Eigen3
handles mapped matrices, or about the structure of matrix-matrix
multiplication in Eigen3 that would cause this?
The compiler flags are all the same between our use of eigen2 and
eigen3. Profiling indicates that much of the time is being spent
I understand that this is not sufficient information to reproduce this
problem, so I am going to try and create a minimal case which can
reproduce this performance regression. In the meantime, any insight
into this would be useful. Also, is it possible to give blocks a
static (compile-time) size, as one can with matrices?