Re: [eigen] Blas performance on mapped matrices
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Blas performance on mapped matrices
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Mon, 9 Jan 2012 08:40:58 -0500
- Cc: Keir Mierle <keir@xxxxxxxxxx>
2012/1/9 Sameer Agarwal <sameeragarwal@xxxxxxxxxx>:
> Hi Guys,
> We are in the process of a significant code migration from eigen2 to
> eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
> matrices and operates on them. The primary operation is of the form
> A.block(r, c, size1, size2) -= B * C;
> A is a mapped matrix.
> C is a mapped matrix.
> B is an actual Eigen matrix.
> All matrices are RowMajor. For the example being considered, size1 =
> size2 = 9. B is 9x3, and C is 3x9.
> C and B are statically sized.
> Moving from eigen2 to eigen3 has resulted in a 30% performance
> regression. Has something changed significantly in the way Eigen3
> handles mapped matrices, or about the structure of matrix-matrix
> multiplication in Eigen3 that would cause this?
> The compiler flags are all the same between our use of eigen2 and
> eigen3. Profiling indicates that much of the time is being spent
> inside Eigen::internal::gebp_kernel::operator.
> I understand that this is not sufficient information to reproduce this
> problem, so I am going to try and create a minimal case which can
> reproduce this performance regression. In the meanwhile any insight
> into this would be useful. Also is it possible to statically size
> blocks like matrices?
Yes, as explained on http://eigen.tuxfamily.org/dox/TutorialBlockOperations.html
(also see Jitse's email, using that syntax).
I agree with Jitse's suggestion of playing with .noalias() and with
EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD. Given your very special size,
where only one of the two dimensions is greater than the default
threshold, it's very tempting to suspect that's the cause of the
regression.
Regarding noalias(), see this page:
> Thank you,