Re: [eigen] Architecture specific performance optimizations
On 18.03.2014 04:16, Benoit Steiner wrote:
> Christoph, regarding bug
> 721 <http://eigen.tuxfamily.org/bz/show_bug.cgi?id=721#c4>, it
> looks like the compiler version has a lot more impact on the actual
> performance of the code than the actual instruction set (at least for x86
> CPUs). Maybe in addition to focusing on recent hardware we should also focus
> our efforts on recent versions of the compilers to reduce the amount of
> validation work needed to vet improvements?
Yes, I'm following your discussion on Bugzilla. That's indeed bad. At
least it is easy to detect, but I guess it gets very hard to work around
these compiler oddities.
Have you tried whether the -mtune=native or -march=native options help
avoid this? Maybe newer GCC versions try to optimize for CPUs that are not
yet mainstream? Though I'm losing more and more confidence that
compilers are good at low-level optimization ...
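As a rough way to check this, a small timing harness could be built with
each compiler version and flag set and the timings compared. The sketch
below is only an illustration: the kernel and the names axpy and
best_time_seconds are made up here and are not the code from bug 721.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical kernel, for illustration only: y += a * x.
void axpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}

// Runs the kernel `repeats` times on vectors of length n and returns the
// smallest wall-clock time in seconds. Taking the minimum reduces noise
// from other processes.
double best_time_seconds(std::size_t n, int repeats) {
    std::vector<float> x(n, 1.0f), y(n, 0.0f);
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        axpy(0.5f, x, y);
        const auto t1 = std::chrono::steady_clock::now();
        const double s = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || s < best) best = s;
    }
    // Reading a result makes it harder for the compiler to discard the loop.
    volatile float sink = y[0];
    (void)sink;
    return best;
}
```

Building the same file once with "g++ -O2" and once with
"g++ -O2 -march=native", then repeating that with a different GCC version
and printing best_time_seconds(1 << 20, 20) from main, should give a first
impression of whether the flags or the compiler version matter more. The
problem size and repeat count are arbitrary choices.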
Furthermore, do you know this nice tool:
http://software.intel.com/en-us/articles/intel-architecture-code-analyzer
Unfortunately, it only analyzes architectures from Nehalem onwards (and
of course no AMD architectures ...), but it very nicely shows throughput
and possible bottlenecks.
It also shows that, e.g., between Nehalem and Haswell there can be a big
throughput difference for the very same machine instructions.
Christoph
--
----------------------------------------------
Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen
Tel: +49 (421) 218-64252
----------------------------------------------