Re: [eigen] benchmarks for large matrices?
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] benchmarks for large matrices?
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Wed, 18 Feb 2009 16:56:17 +0100
yep, actually I've just tried to compile the latest ATLAS myself. Even
though it seems to be a bit faster than the older one I used for the
benchmark, Eigen is still faster, especially for matrix sizes that are
not a multiple of 4 (the attached benchmark uses 1257x1257, which is
such a size).
I attached a small benchmark that you can easily try:
compilation:
g++ -O2 -ffast-math -DNDEBUG gemm.cpp -latlas -lcblas -o gemm
then:
time ./gemm
and I get:
eigen: 0.79 s
ATLAS: 1.28 s
MKL:   0.44 s
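(To reproduce the Eigen number, comment out blasprod and uncomment
eigenprod in main() below, then rebuild and run with the same commands:

g++ -O2 -ffast-math -DNDEBUG gemm.cpp -latlas -lcblas -o gemm
time ./gemm

For MKL, the same blasprod path can be linked against MKL's cblas
interface instead of ATLAS; the exact link line depends on the install.)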
In each case I used a single thread. My CPU is:
Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
so the single-precision peak of one core is about 21 GFLOPS
(2.66 GHz x 8 flops/cycle); MKL reaches ~18.2 GFLOPS, Eigen ~10.2, and
ATLAS ~6.25.
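For reference, here is how those figures follow from the timings. A
quick standalone sketch (not part of the attached file): the loop in
main() does two products of 1257x1257 single-precision matrices, i.e.
2 * 2*N^3 flops in total, divided by the wall-clock time:

// gflops.cpp: effective GFLOPS for two 1257x1257 sgemm calls,
// using the wall-clock timings quoted above.
#include <cstdio>

int main()
{
  const double N = 1257.0;
  const double flops = 2.0 * (2.0 * N * N * N);   // two products, 2*N^3 flops each
  const double seconds[] = { 0.79, 1.28, 0.44 };  // eigen, ATLAS, MKL
  const char*  names[]   = { "eigen", "ATLAS", "MKL" };
  for (int i = 0; i < 3; ++i)
    std::printf("%-6s %.1f GFLOPS\n", names[i], flops / seconds[i] * 1e-9);
  return 0;
}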
On Wed, Feb 18, 2009 at 4:37 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2009/2/18 David Roundy <daveroundy@xxxxxxxxx>:
>> If you're using an ATLAS tuned for a
>> machine with a larger cache, it'd be no surprise that you'd get poor
>> numbers...
>
> I wouldn't expect that, because Gael's CPU is a Core 2 duo T7200, and
> those have 4 MB of cache.
>
> Benoit
#include "Eigen/Array"
using namespace Eigen;
extern "C" {
#include <cblas.h>
void sgemm_(const char *transa, const char *transb, const int *m, const int *n, const int *k,
const float *alpha, const float *a, const int *lda, const float *b, const int *ldb,
const float *beta, float *c, const int *ldc);
}
EIGEN_DONT_INLINE void eigenprod(const MatrixXf& a, const MatrixXf& b, MatrixXf& c)
{
c += a * b;
}
EIGEN_DONT_INLINE void blasprod(const MatrixXf& a, const MatrixXf& b, MatrixXf& c)
{
static const float fone = 1;
static const float fzero = 0;
static const char notrans = 'N';
static const char trans = 'T';
static const char nonunit = 'N';
static const char lower = 'L';
static const int intone = 1;
int N = a.rows();
cblas_sgemm(CblasColMajor,CblasNoTrans,CblasNoTrans,N,N,N,1.0,a.data(),N,b.data(),N,0.0,c.data(),N);
//sgemm_(¬rans,¬rans,&N,&N,&N,&fone,a.data(),&N,b.data(),&N,&fzero,c.data(),&N);
}
int main(int argc, char ** argv)
{
MatrixXf a = MatrixXf::Ones(1257,1257);
MatrixXf b = MatrixXf::Ones(1257,1257);
MatrixXf c = MatrixXf::Ones(1257,1257);
for (int k=0; k<2; ++k)
{
blasprod(a,b,c);
//eigenprod(a,b,c);
}
return 0;
}