[eigen] on extern template instanciation


Hi all,

first of all let me make it clear that, by default, Eigen2 will stay a *pure template library*. However, as you can easily guess, compiling files that make heavy use of the advanced linear algebra modules such as LU, Cholesky or QR might take ages... Here comes the *optional* shared library, which contains explicit instantiations of the heaviest pieces of code for the most common scalar/matrix types.

To use it, the user has to define EIGEN_EXTERN_INSTANCIATIONS before including any Eigen header, and link to the Eigen2 lib.

So far it only contains the instantiations of two static functions: the cache-friendly product and the QR step algorithm. Now I'd like to add explicit instantiations for QR, EigenSolver, Tridiagonalization, etc.

To do so, the solution is to add the explicit instantiations in a .cpp file that is part of the Eigen2 lib. For the QR module this looks like this (the body of a macro parameterized by MATRIXTYPE):

  template class QR<MATRIXTYPE>; \
  template class Tridiagonalization<MATRIXTYPE>; \
  template class HessenbergDecomposition<MATRIXTYPE>; \
  template class SelfAdjointEigenSolver<MATRIXTYPE>; \
  template class EigenSolver<MATRIXTYPE>
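
For readers who want to try the idea outside Eigen, here is a minimal, self-contained sketch of macro-driven explicit instantiation. ToyMatrix, ToyQR and TOY_QR_MODULE_INSTANTIATE are hypothetical names made up for this illustration; they are not part of Eigen2, and the real classes are of course much heavier.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-ins for this sketch only; none of these names are Eigen's API.
template <typename Scalar, std::size_t N>
struct ToyMatrix {
    typedef Scalar ScalarType;
    static const std::size_t Size = N;
    Scalar coeffs[N * N];  // row-major storage

    Scalar operator()(std::size_t i, std::size_t j) const {
        assert(i < N && j < N);
        return coeffs[i * N + j];
    }
};

template <typename MatrixType>
class ToyQR {
public:
    explicit ToyQR(const MatrixType& m) : m_matrix(m) {}
    // Stands in for an algorithm that is expensive to compile.
    typename MatrixType::ScalarType trace() const;
private:
    MatrixType m_matrix;
};

template <typename MatrixType>
typename MatrixType::ScalarType ToyQR<MatrixType>::trace() const {
    typename MatrixType::ScalarType sum = 0;
    for (std::size_t i = 0; i < MatrixType::Size; ++i) sum += m_matrix(i, i);
    return sum;
}

// Hypothetical macro: lists every heavy class of the module once, so that
// adding a matrix type to the library is a one-line change.
#define TOY_QR_MODULE_INSTANTIATE(MATRIXTYPE) \
    template class ToyQR<MATRIXTYPE>

typedef ToyMatrix<float, 2> ToyMatrix2f;
typedef ToyMatrix<double, 2> ToyMatrix2d;

// Explicit instantiation definitions: all members of ToyQR are compiled
// here for these two types, whether or not this file uses them.
TOY_QR_MODULE_INSTANTIATE(ToyMatrix2f);
TOY_QR_MODULE_INSTANTIATE(ToyMatrix2d);
```

In a real build the last two lines would sit in the library's .cpp file, and the object code for both instantiations would end up in the shared library.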


Now that all our heavy code is in a lib, we have to tell the compiler not to instantiate these classes for these matrix types. A common approach is to declare these instantiations with the "extern" keyword:

extern template class QR<Matrix2f>;
extern template class QR<Matrix2d>;

in one of the header files. Even though such "extern template" declarations are not standard, most compilers support them, and this solution still allows the user to use, let's say, EigenSolver<Matrix<ExactFloat> > even though this type has not been explicitly instantiated in the lib. Now what happens in practice (i.e., what about compilation time):
- icc: 23s versus 7.5s  :)  (for test_qr)
- gcc: no improvement at all   :(

So it seems GCC does the instantiations anyway... too bad! (I also tried with the "inline" keyword instead of "extern", without any effect; see http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Template-Instantiation.html)
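
To make the "extern template" pattern concrete, here is a small single-file sketch. ToySolver, mean_double and mean_long are hypothetical names used only for illustration, and the "header" and "library" parts are put in one file just so that the example links on its own. It assumes a compiler that accepts the syntax (it was later standardized in C++11).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy template standing in for a heavy decomposition class (not Eigen's API).
template <typename T>
struct ToySolver {
    T mean(const std::vector<T>& v) const;
};

template <typename T>
T ToySolver<T>::mean(const std::vector<T>& v) const {
    assert(!v.empty());
    T sum = T();
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum / static_cast<T>(v.size());
}

// "Header" part: promise that these instantiations exist elsewhere, so the
// compiler is allowed to skip generating them in user code.
extern template struct ToySolver<float>;
extern template struct ToySolver<double>;

// "Library .cpp" part: the one place where they are really generated.
template struct ToySolver<float>;
template struct ToySolver<double>;

// Uses the explicitly instantiated version.
double mean_double(const std::vector<double>& v) {
    return ToySolver<double>().mean(v);
}

// A type that is not in the list is still instantiated implicitly, because
// the template definition stays visible to the user.
long mean_long(const std::vector<long>& v) {
    return ToySolver<long>().mean(v);
}
```

The last function is the point of this approach: nothing changes for user-defined scalar types. Whether the compiler actually saves work on the "extern" declarations is up to the compiler.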

So, does anybody know any trick to make GCC behave as expected?

If there is no good solution for GCC, then the workaround is to hide the heaviest code using either:
1 - separate header files
2 - put the code in between pairs of #ifndef EIGEN_EXTERN_INSTANCIATIONS / #endif

I tried the second option, and the compilation time with gcc then drops to 8s (versus 16s). Now what are the consequences for the user when she/he wants to use her/his own matrix type:

- again, if EIGEN_EXTERN_INSTANCIATIONS is not manually defined, nothing changes: it is still a pure template lib!

- again, with the "extern" keyword, nothing changes :) but this does not speed up gcc :(

- with option 1 the user will have to include an "implementation header", for instance:
  #include <Eigen/QR_Impl>
  or whatever better name! Typically, if EIGEN_EXTERN_INSTANCIATIONS is not defined, then the header Eigen/QR will automatically include Eigen/QR_Impl, and Eigen/QR_Impl would include src/QR/QR_imp.h, src/QR/EigenSolver_impl.h, etc.

- with option 2 the user has to #undef EIGEN_EXTERN_INSTANCIATIONS before including any Eigen header. Note that this also works with option 1.
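
Option 2 can be sketched in a single file as follows. TOY_EXTERN_INSTANTIATIONS, ToyDecomposition and user_inverse are hypothetical names standing in for the real macro and classes, and the "library" part is kept in the same file only so that the example links on its own.

```cpp
#include <cassert>

// Hypothetical switch standing in for EIGEN_EXTERN_INSTANCIATIONS.
// Remove this line to get back the pure template behaviour.
#define TOY_EXTERN_INSTANTIATIONS

// The class declaration is always visible to user code.
template <typename T>
struct ToyDecomposition {
    T inverse(T x) const;
};

#ifndef TOY_EXTERN_INSTANTIATIONS
// Pure template mode: every user translation unit compiles the heavy code.
template <typename T>
T ToyDecomposition<T>::inverse(T x) const {
    assert(x != T());
    return T(1) / x;
}
#endif

// User code: compiles in both modes, since only the declaration is needed.
float user_inverse(float x) {
    return ToyDecomposition<float>().inverse(x);
}

#ifdef TOY_EXTERN_INSTANTIATIONS
// What the library's .cpp file would contain (a separate translation unit
// in a real build): the hidden definition plus the explicit instantiations.
template <typename T>
T ToyDecomposition<T>::inverse(T x) const {
    assert(x != T());
    return T(1) / x;
}
template struct ToyDecomposition<float>;
#endif
```

With the switch defined, using a type that is not in the library's list (say ToyDecomposition<long double>) from a separate translation unit would compile but fail to link, which is why the user has to undefine the macro to use her/his own matrix type.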

So to summarize:
 - is it possible to make "extern template" work well with gcc?
 - if not, then which is best: option 1 or option 2?
 - any better idea?

