Re: [eigen] Re: SGEMM benchmark result against ATLAS


Here are my results from today (efficiency relative to the theoretical
peak performance):

Intel(R) Xeon(R) CPU E5540  @ 2.53GHz (Core i7)

float: 85%
double: 85%


Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz (second version of core2)

float: 88%
double: 78%


I used the exact same executables on both computers (compiled with gcc
4.5). I don't know why doubles are so slow on the latter, since I don't
remember seeing such behavior before...

GCC 4.3 produces slightly slower code (~83% of the peak perf).

gael.

On Thu, Sep 2, 2010 at 2:34 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> (Francesco -- I forgot to CC you in the email I just sent about DGEMM.
> Just mentioning as you probably don't read every email in this
> list...)
>
> I just checked in a debugger: only 1 thread is used by ATLAS too (we
> already knew that for Eigen).
>
> I am linking with -lf77blas.
>
> Benoit
>
> 2010/8/24 Francesco Callari <fgcallari@xxxxxxxxx>:
>> Hi Benoit,
>> a few questions:
>> 1. Are you building your own ATLAS, or running a prebuilt one?
>> 2. If building, could you please post the output of 'make time'? It's the
>> last step in the usual build sequence and compares the speed ATLAS achieves
>> on your machine with the comparable one it was configured with.
>> 3. Are you running ATLAS single- or multi-threaded? Easy to see: if you
>> linked with libatlas.a it is single, if libptatlas.a it's multi.
>> 4. Could you also please time dgemm?
>> Thanks
>> Franco
>>
>> On Tue, Aug 24, 2010 at 11:07 AM, Keir Mierle <mierle@xxxxxxxxx> wrote:
>>>
>>> A question for Benoit: Is this running the threaded versions of Eigen and ATLAS?
>>> Keir
>>>
>>> On Tue, Aug 24, 2010 at 10:52 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>>> wrote:
>>>>
>>>> I too have atlas 3.8.3, and am using gcc 4.4 on linux x86-64. So I
>>>> can't really conclude anything, sorry.
>>>> Benoit
>>>>
>>>> 2010/8/24 Francesco Callari <fgcallari@xxxxxxxxx>:
>>>> > Hmmm, I think this is the info I can share:
>>>> > ATLAS build configuration.
>>>> > ====================
>>>> > ATLAS v3.8.3
>>>> > GCC 4.<redacted>
>>>> > GLIBC 2.<redacted>
>>>> > Configuration flags: 64-bit build using the chosen gcc for every
>>>> > compiler.
>>>> > cc=${TOP}/bin/gcc
>>>> > f77=${TOP}/bin/gfortran
>>>> > mhz=<redacted>
>>>> >
>>>> > ./configure \
>>>> >     -C xc ${cc} -C gc ${cc} -C ic ${cc} -C dm ${cc} -C sm ${cc} \
>>>> >     -C dk ${cc} -C sk ${cc} \
>>>> >     -C if ${f77} \
>>>> >     -b 64 \
>>>> >     -D c -DPentiumCPS=${mhz}
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Aug 24, 2010 at 10:39 AM, Franco Callari <fgc@xxxxxxxxxx>
>>>> > wrote:
>>>> >>
>>>> >>
>>>> >> ---------- Forwarded message ----------
>>>> >> From: Keir Mierle <mierle@xxxxxxxxx>
>>>> >> Date: Tue, Aug 24, 2010 at 1:19 AM
>>>> >> Subject: Fwd: SGEMM benchmark result against ATLAS
>>>> >>
>>>> >>
>>>> >> Hey, care to forward any info about how you configured ATLAS?
>>>> >>
>>>> >> ---------- Forwarded message ----------
>>>> >> From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>>>> >> Date: Mon, Aug 23, 2010 at 8:45 PM
>>>> >> Subject: SGEMM benchmark result against ATLAS
>>>> >> To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
>>>> >> Cc: Keir Mierle <mierle@xxxxxxxxx>, Gael Guennebaud
>>>> >> <gael.guennebaud@xxxxxxxxx>
>>>> >>
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> Hearing from Keir that he saw untuned ATLAS outperform us by a 30% margin,
>>>> >> which would be very unusual, I ran our benchBlasGemm a bit. By the way, I
>>>> >> updated it to make it compile, which involved removing the eigen_..._normal
>>>> >> path, which didn't look useful (?); I hope that's OK. It was also missing an
>>>> >> extern "C" around the cblas #include.
>>>> >>
>>>> >> So I installed the most optimized ATLAS package that I could on Fedora,
>>>> >> built with SSE3.
>>>> >>
>>>> >> I compiled our benchmark with:
>>>> >>
>>>> >> cd eigen/bench/
>>>> >> g++ -O3 -msse3 -I.. -L /usr/lib64/atlas/ benchBlasGemm.cpp \
>>>> >>     -o benchBlasGemm -lrt -lcblas
>>>> >>
>>>> >> And ran it on some 4096x4096 matrices:
>>>> >>
>>>> >> [bjacob@cahouette bench]$ ./benchBlasGemm 4096
>>>> >> 4096 x 4096 x 4096
>>>> >> cblas: 8.73982 (7.862 GFlops/s)
>>>> >> eigen : 8.9491 (7.678 GFlops/s)
>>>> >> [bjacob@cahouette bench]$ ./benchBlasGemm 4096
>>>> >> 4096 x 4096 x 4096
>>>> >> cblas: 8.51913 (8.066 GFlops/s)
>>>> >> eigen : 8.42922 (8.152 GFlops/s)
>>>> >>
>>>> >> So _my_ results show Eigen3 and ATLAS running at roughly the same speed,
>>>> >> albeit with great variability.
>>>> >>
>>>> >> This is still perplexing for 2 reasons:
>>>> >>  - we used to beat ATLAS by a wide margin.
>>>> >>  - the roughly 8 GFlops here are not too good. My CPU is a Core i7 at 1.66
>>>> >>    GHz. Multiplying by 4 (floats per SSE register) and by 2 (pipelining of
>>>> >>    addps and mulps), we should aim at 13.33 GFlops. So we are running here
>>>> >>    at only 60% of the theoretical maximum; I think we used to do much
>>>> >>    better than that.
>>>> >>
>>>> >> So let me ask Gael and Keir:
>>>> >> * Keir: what do you get on this benchmark? How did you get this result
>>>> >> where ATLAS outperformed us by 30%?
>>>> >> * Gael: suppose I want to get deeper into this, where do I start?
>>>> >>
>>>> >> Cheers,
>>>> >> Benoit
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Francesco Callari <fgc@xxxxxxxxxx>
>>>> >>
>>>> >>             EC67 BEBE 62AC 8415 7591  2B12 A6CD D5EE D8CB D0ED
>>>> >>
>>>> >> Violence is the last refuge of the incompetent  (I. Asimov)
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Franco Callari <fgcallari@xxxxxxxxx>
>>>> >
>>>> >             EC67 BEBE 62AC 8415 7591  2B12 A6CD D5EE D8CB D0ED
>>>> >
>>>> > I am not bound to win, but I am bound to be true. I am not bound to
>>>> > succeed,
>>>> > but I am bound to live by the light that I have. (Abraham Lincoln)
>>>> >
>>>
>>
>>
>
>
>


