[eigen] Architecture specific performance optimizations


Hi all,

Gael initiated a discussion on this Bugzilla entry, which I think deserves a more general discussion on the list:
The problem is that this change kills the performance on older architectures
for which pset1 is very slow. I'm also unsure about ARM/NEON. So this triggers
a more general question about the policy we should adopt regarding optimizations:

a) shall we limit the number of variants and favor modern architectures?

b) shall we keep multiple variants and detect the best one at compile time
through the compiler's preprocessor macros or a user-defined preprocessor token?
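
To make option (b) concrete, here is a minimal sketch of compile-time variant selection. It is not Eigen code: the names demo_broadcast4, demo_variant_name, DEMO_HAS_SSE2 and DEMO_PREFER_LOAD_SHUFFLE are hypothetical and only illustrate the idea. It mimics what pset1 does (broadcasting one scalar to all lanes of a packet) and picks between two SSE sequences and a scalar fallback using compiler-defined macros plus one user-defined token.

```cpp
#include <cstddef>

// Detect SSE2 through macros the compiler defines itself.
#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>
  #define DEMO_HAS_SSE2 1   // hypothetical helper macro, not part of Eigen
#else
  #define DEMO_HAS_SSE2 0
#endif

// DEMO_PREFER_LOAD_SHUFFLE is a hypothetical user-defined token. A user would
// pass -DDEMO_PREFER_LOAD_SHUFFLE on a target where the default broadcast
// sequence is known to be slow.

// Writes the scalar *from into all four slots of out (a pset1-like broadcast).
inline void demo_broadcast4(const float* from, float* out) {
#if DEMO_HAS_SSE2 && defined(DEMO_PREFER_LOAD_SHUFFLE)
  // Variant A: load one scalar into lane 0, then replicate it with a shuffle.
  __m128 v = _mm_load_ss(from);
  v = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));
  _mm_storeu_ps(out, v);
#elif DEMO_HAS_SSE2
  // Variant B: let the compiler choose the broadcast sequence.
  _mm_storeu_ps(out, _mm_set1_ps(*from));
#else
  // Portable fallback when vectorization is unavailable.
  for (std::size_t i = 0; i < 4; ++i) out[i] = *from;
#endif
}

// Reports which variant was compiled in, e.g. for benchmark logs.
inline const char* demo_variant_name() {
#if DEMO_HAS_SSE2 && defined(DEMO_PREFER_LOAD_SHUFFLE)
  return "sse2-load-shuffle";
#elif DEMO_HAS_SSE2
  return "sse2-set1";
#else
  return "scalar-fallback";
#endif
}
```

All three branches produce the same result, so the choice only affects speed. The cost of this approach is that every extra variant has to be maintained and benchmarked on hardware where it actually matters.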

IMO, in the long run, we should definitely focus on (mainstream) up-to-date hardware, but still at least keep compatibility with outdated hardware (which we currently do when vectorization is disabled entirely). We should also make sure not to introduce large performance regressions on not-so-cutting-edge hardware.

I think as a first step, we should run some kind of survey on which platforms Eigen is currently used on and who requires top performance (I would assume this correlates with people having up-to-date hardware). Maybe a more important issue is finding people willing to run performance tests on a regular basis (especially on both outdated and cutting-edge hardware).


Dipl.-Inf., Dipl.-Math. Christoph Hertzberg
Cartesium 0.049
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen

Tel: +49 (421) 218-64252
