Re: [eigen] Blas performance on mapped matrices


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Blas performance on mapped matrices
From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
Date: Mon, 9 Jan 2012 17:09:42 +0100
Cc: Keir Mierle <keir@xxxxxxxxxx>

I really cannot reproduce this. On my system, all the variants using
Eigen3 are faster than the best I can get out of Eigen2 (I used double).

My quick experiments also showed that:

A.block<size1, size2>(r, c).noalias() -= B * C;

is indeed the best you can do: both the "noalias" and the statically
sized block are useful. In particular, it is faster than:

A.block<size1, size2>(r, c) -= B.lazyProduct(C);

which uses an expression-based product algorithm (tailored for very
small products).

gael.

-----------------------------------------

#include <iostream>
#include <Eigen/Dense>
#include <bench/BenchTimer.h>
using namespace Eigen;

typedef double Scalar;
typedef Matrix<Scalar,Dynamic,Dynamic, RowMajor> Mat;

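// Subtract the product of two mapped row-major buffers (9x3 times 3x9)
// from the fixed-size 9x9 block of A whose top-left corner is (i,j).
EIGEN_DONT_INLINE void foo1(Scalar* dat1, Scalar* dat2, Mat& A, int i, int j)
{
  Block<Mat,9,9>(A,i,j).noalias() -=
      Map< Matrix<Scalar,9,3,RowMajor> >(dat1) * Map< Matrix<Scalar,3,9,RowMajor> >(dat2);
}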

int main (int argc, char** argv)
{
  Matrix<Scalar,27,1> data1, data2;
  data1.setRandom();
  data2.setRandom();

  Mat A(100,100);
  A.setZero();  // give A defined contents before timing

  BenchTimer t1;
  int tries = 10;
  int rep = 10000;


  BENCH(t1, tries, rep, foo1(data1.data(), data2.data(), A, 2,3););
  std::cerr << t1.best() << "s\n";


  return (0);
}


On Mon, Jan 9, 2012 at 2:40 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2012/1/9 Sameer Agarwal <sameeragarwal@xxxxxxxxxx>:
>> Hi Guys,
>> We are in the process of a significant code migration from eigen2 to
>> eigen3. The code uses Eigen::Map to map chunks of memory into RowMajor
>> matrices and operates on them. The primary operation is of the form
>>
>> A.block(r, c, size1, size2) -= B * C;
>>
>> A is a mapped matrix.
>> C is a mapped matrix.
>> B is an actual Eigen matrix.
>>
>> All matrices are RowMajor. For the example being considered, size1 =
>> size2 = 9. B is 9x3, and C is 3x9.
>> C and B are statically sized.
>>
>> Moving from eigen2 to eigen3 has resulted in a 30% performance
>> regression. Has something changed significantly in the way Eigen3
>> handles mapped matrices, or about the structure of matrix-matrix
>> multiplication in Eigen3 that would cause this?
>>
>> The compiler flags are all the same between our use of eigen2 and
>> eigen3. Profiling indicates that much of the time is being spent
>> inside Eigen::internal::gebp_kernel::operator.
>>
>> I understand that this is not sufficient information to reproduce this
>> problem, so I am going to try and create a minimal case which can
>> reproduce this performance regression. In the meantime, any insight
>> into this would be useful. Also, is it possible to statically size
>> blocks, like matrices?
>
> Yes, as explained on http://eigen.tuxfamily.org/dox/TutorialBlockOperations.html
> (also see Jitse's email, using that syntax).
>
> I agree with Jitse's suggestion of playing with .noalias() and with
> EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD. Given your very special size,
> where only one of the two dimensions is greater than the default
> threshold, it's very tempting to suspect that's the cause of your
> regression.
> Regarding noalias(), see this page:
> http://eigen.tuxfamily.org/dox/TopicWritingEfficientProductExpression.html
> Cheers,
> Benoit
>
>>
>> Thank you,
>> Sameer
>>
>>
>
>
