Re: [eigen] patch to add ACML support to BTL

On Tue, Mar 17, 2009 at 9:20 AM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
> I think that with new library versions, new Eigen versions, and a new
> gcc, we should put these results on the main benchmark page of the Eigen
> website. BTW, judging from your Pentium D benchmarks, Eigen's
> performance seems to have slipped considerably. Or is that all
> attributable to the Core 2 being a much better CPU?

Thanks for the benchmarks.

The Core 2 is indeed much better than a Pentium D, and since I only have
a Core 2, the critical parts (matrix-matrix products) are fine-tuned
only for the Core 2. Another reason is that gcc 4.3 generates slower
code than 4.2: some constant expressions are not hoisted out of the
inner loops, it is not optimal with block expressions, and by default
4.3 automatically generates vectorized code, which conflicts with
Eigen's automatic vectorization. gcc 4.4 does not suffer from any of
these issues, and sometimes its auto-vectorization is even better than
Eigen's explicit one because the compiler better understands what the
code is doing: an example is the rank-2 update, which simply consists
of a series of "v += ax + by" operations. But Eigen's explicit
vectorization is still worth it because we are able to vectorize many
more cases than gcc. Examples: "v = ax + by", which gcc does not
vectorize; matrix products; vectorization + explicit unrolling; and, in
the future, sin, cos, pow, exp, etc.
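
To make the "v += ax + by" shape concrete, here is a minimal sketch in
plain C++ (no Eigen or BLAS calls). The function names
axpby_accumulate and rank2_update and the vector-of-columns storage are
assumptions made for illustration only. The sketch also assumes the
rank-2 update in question is the symmetric BLAS-style one,
A += x*y^T + y*x^T, so that column j receives y[j]*x + x[j]*y. The inner
loop is the kind of loop a compiler auto-vectorizer would have to
handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: v += a*x + b*y, element by element.
// This is the "v += ax + by" operation mentioned above.
void axpby_accumulate(std::vector<double>& v,
                      double a, const std::vector<double>& x,
                      double b, const std::vector<double>& y) {
    assert(v.size() == x.size() && v.size() == y.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] += a * x[i] + b * y[i];
}

// Hypothetical helper: symmetric rank-2 update A += x*y^T + y*x^T.
// A is stored as a vector of columns (an assumption of this sketch),
// so the update is one "v += ax + by" per column.
void rank2_update(std::vector<std::vector<double>>& cols,
                  const std::vector<double>& x,
                  const std::vector<double>& y) {
    assert(cols.size() == x.size() && cols.size() == y.size());
    for (std::size_t j = 0; j < cols.size(); ++j)
        axpby_accumulate(cols[j], y[j], x, x[j], y);
}
```

Whether a given compiler vectorizes the inner loop depends on its
version and flags, so this shows only the shape of the operation and
says nothing about performance.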


> On Tue, Mar 17, 2009 at 1:08 PM, Victor <flyaway1212@xxxxxxxxx> wrote:
>> Hi all.
>> It sure took a while to run all the benchmarks with all the libraries
>> available to me... I wish I had read the instructions more carefully and
>> hadn't wasted any time testing multithreaded libraries...
>> Anyways, the results are on the wiki:
>> Gael Guennebaud wrote:
>>> Hi Victor,
>>> thanks a lot for the patch.
>>> applied in rev 935462, the syr2 header will follow in a second.
>>> so what's your conclusion, is ACML as good as MKL?
>> Unfortunately, no. ACML is not bad though. It's hard to say once and for
>> all, but most of the time MKL beats ACML. Even on an AMD CPU MKL is
>> typically better. ACML shows decent performance (even on an Intel CPU), on
>> average similar to ATLAS, but again results differ from test to test.
>> The good thing about ACML (and MKL, Goto and ATLAS) is that they can be
>> used in multithreading mode, which unfortunately can't be demonstrated
>> with BTL as far as I can tell.
>> Also, it looks like in comparison with other libs Eigen does better on
>> Intel than on AMD.
>> Out of curiosity, I have also run BTL with Eigen compiled with 4
>> different compilers. Well, 3 different gcc versions and intel c++. See
>> the results here
>> I hope this might be useful to somebody.
>> Cheers,
>> Victor.
> --
> Rohit Garg
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
