Re: [eigen] on extern template instanciation



On Fri, Jun 13, 2008 at 5:52 PM, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
Hi,

extern template is going to be part of C++0x and GCC claims to already have
experimental support for it as of 4.3: 


http://gcc.gnu.org/gcc-4.3/cxx0x_status.html

Actually, GCC has implemented extern template for a while: gcc3 supports it, as does MSVC. But it's good to know it's going to be part of a real standard :)

So one can bet that support for them will improve; and in the meantime the
best you could do would be to talk on gcc-help@ or file a bug report
(depending on how confident you are that it's a gcc bug); but definitely not
implement any workaround for something that'll work sooner or later and
already works in ICC!

Yeah, I searched the gcc ML archives but had no luck, so I'll ask on gcc-help.
 

By the way, great news regarding the drop of compilation time due to extern
instantiations! What puzzles me is, since the lib compiles very fast, how can
it lead to such speed improvements? How can the speed gain on a single file
like test_qr be larger than the total compilation time of the library?

Hm... with the explicit instantiation of all classes of the QR module for 8 matrix types, the compilation time of the lib is much, much higher: ~1min for ~400KB (release mode). That's not huge at all compared to other linear algebra packages (e.g., Intel's, but this is true for many others), which require a couple of MB just for the matrix-matrix products (abusive unrolling).

Gael.
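
For readers following the thread, here is a minimal single-file sketch of the explicit instantiation plus "extern template" pattern being discussed. ToyDecomposition, TOY_MODULE_INSTANTIATE, squareInt and squareDouble are hypothetical names chosen for illustration; they are not part of Eigen. In a real setup the extern declarations would presumably sit in a header and the instantiation definitions in a .cpp file of the library; they share one file here only so the sketch compiles on its own.

```cpp
// Toy stand-in for a heavy class template such as QR<MatrixType>.
// ToyDecomposition is a hypothetical name, not an Eigen class.
template <typename Scalar>
class ToyDecomposition {
public:
  explicit ToyDecomposition(Scalar x) : m_value(x) {}
  // Imagine an algorithm that is expensive to compile here.
  Scalar squared() const { return m_value * m_value; }
private:
  Scalar m_value;
};

// Header side: explicit instantiation declarations. They tell the compiler
// that these instantiations are provided elsewhere, so a translation unit
// that only sees this part does not have to generate them.
extern template class ToyDecomposition<float>;
extern template class ToyDecomposition<double>;

// Library side: explicit instantiation definitions, wrapped in a macro in
// the spirit of EIGEN_QR_MODULE_INSTANCIATE. Within one translation unit
// the definition has to come after the matching extern declaration.
#define TOY_MODULE_INSTANTIATE(SCALAR) template class ToyDecomposition<SCALAR>
TOY_MODULE_INSTANTIATE(float);
TOY_MODULE_INSTANTIATE(double);

// A type that was not explicitly instantiated still works through ordinary
// implicit instantiation.
inline int squareInt(int x) { return ToyDecomposition<int>(x).squared(); }

// A type that was explicitly instantiated uses the precompiled code.
inline double squareDouble(double x) {
  return ToyDecomposition<double>(x).squared();
}
```

Calling squareInt(3) goes through implicit instantiation, while squareDouble(1.5) relies on the explicitly instantiated ToyDecomposition<double>. Whether a given compiler actually skips the work for the extern-declared types is exactly the open question of this thread.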


On Fri, Jun 13, 2008 at 5:52 PM, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
Hi,

extern template is going to be part of C++0x and GCC claims to already have
experimental support for it as of 4.3:

http://gcc.gnu.org/gcc-4.3/cxx0x_status.html

So one can bet that support for them will improve; and in the meantime the
best you could do would be to talk on gcc-help@ or file a bug report
(depending on how confident you are that it's a gcc bug); but definitely not
implement any workaround for something that'll work sooner or later and
already works in ICC!

By the way, great news regarding the drop of compilation time due to extern
instantiations! What puzzles me is, since the lib compiles very fast, how can
it lead to such speed improvements? How can the speed gain on a single file
like test_qr be larger than the total compilation time of the library?

Me, I'm looking at rationalizing the huge machine that Assign.h has become.
Slowly understanding things about vectorization, reading your code.

Cheers,
Benoit



On Friday 13 June 2008 15:26:52 Gael Guennebaud wrote:
> Hi all,
>
> first of all let me make it clear that, by default, Eigen2 will stay a
> *pure template library*. However, as you can easily guess, compiling files
> making heavy use of the advanced linear algebra modules such as LU,
> Cholesky or QR might take ages... Here comes the *optional* shared library
> which contains explicit instantiations of the heaviest pieces of code for
> the most common scalar/matrix types.
>
> To use it, the user has to define EIGEN_EXTERN_INSTANCIATIONS before
> including any Eigen header and link to the Eigen2 lib.
>
> So far it only contains the instantiations of two static functions: the cache
> friendly product and the QR step algorithm. Now, I'd like to add explicit
> instantiations for QR, EigenSolver, Tridiagonalization etc...
>
> To do so, the solution is to add the explicit instantiations in a .cpp file
> that is part of the Eigen2 lib. For the QR module this looks like this:
>
> #define EIGEN_QR_MODULE_INSTANCIATE(MATRIXTYPE) \
>   template class QR<MATRIXTYPE>; \
>   template class Tridiagonalization<MATRIXTYPE>; \
>   template class HessenbergDecomposition<MATRIXTYPE>; \
>   template class SelfAdjointEigenSolver<MATRIXTYPE>; \
>   template class EigenSolver<MATRIXTYPE>
>
> EIGEN_QR_MODULE_INSTANCIATE(Matrix2f);
> EIGEN_QR_MODULE_INSTANCIATE(Matrix2d);
> EIGEN_QR_MODULE_INSTANCIATE(Matrix3f);
> EIGEN_QR_MODULE_INSTANCIATE(Matrix3d);
> EIGEN_QR_MODULE_INSTANCIATE(Matrix4f);
> EIGEN_QR_MODULE_INSTANCIATE(Matrix4d);
> EIGEN_QR_MODULE_INSTANCIATE(MatrixXf);
> EIGEN_QR_MODULE_INSTANCIATE(MatrixXd);
>
> Now that we have all our heavy code in a lib, we have to tell the compiler not
> to instantiate these classes for these matrix types. A common approach to do
> so is to declare these instantiations with the "extern" keyword:
>
> extern template class QR<Matrix2f>;
> extern template class QR<Matrix2d>;
> etc...
>
> in one of the header files. Even though such "extern template" declarations
> are not standard, most compilers support them, and this solution still allows
> the user to use, let's say, EigenSolver<Matrix<ExactFloat> > even though
> this type has not been explicitly instantiated in the lib. Now what happens
> in practice (i.e., what about compilation time):
> - icc: 23s versus 7.5s  :)  (for test_qr)
> - gcc: no improvement at all   :(
>
> So it seems GCC does the instantiations anyway... too bad! (I also tried
> with the "inline" keyword instead of extern without any effect, see
> http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Template-Instantiation.html)
>
> So, does any of you know a trick to make GCC behave as expected?
>
> If there is no good solution for GCC, then the workaround is to hide the
> heaviest code using either:
> 1 - separate header files
> 2 - pairs of #ifndef EIGEN_EXTERN_INSTANCIATIONS / #endif around the code
>
> I tried the second option and then the compilation time with gcc drops down
> to 8sec (versus 16sec). Now what are the consequences for the user when
> she/he wants to use her/his own matrix type:
>
> - again, if EIGEN_EXTERN_INSTANCIATIONS is not manually defined, nothing: it
> is still a pure template lib!
>
> - again, with the "extern" keyword, nothing :) but this does not speed up gcc :(
>
> - with option 1 the user will have to include an "implementation header",
> for instance:
>   #include <Eigen/QR_Impl>
>  or whatever better name! Typically, if EIGEN_EXTERN_INSTANCIATIONS is not
> defined, then the header Eigen/QR will automatically include Eigen/QR_Impl
> and Eigen/QR_Impl would include src/QR/QR_imp.h, src/QR/EigenSolver_impl.h,
> etc...
>
> - with option 2 the user has to #undef EIGEN_EXTERN_INSTANCIATIONS before
> including any Eigen header. Note that this also works with option 1.
>
>
> So to summarize:
>  - is it possible to make "extern template" work well with gcc?
>  - if not, then which is best: option 1 or option 2?
>  - any better idea?
>
> cheers,
> Gael.
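
As an illustration of option 2 above, here is a minimal sketch of hiding a template's implementation between #ifndef EIGEN_EXTERN_INSTANCIATIONS / #endif. ToySolver and twice are hypothetical names, not Eigen code, and the layout is only an assumption about how such a guard could be arranged.

```cpp
// The declaration is always visible, whatever the mode.
// ToySolver is a hypothetical stand-in for a heavy Eigen class.
template <typename Scalar>
struct ToySolver {
  static Scalar twice(Scalar x);
};

// Uncomment to mimic a user who links against the precompiled library.
// With the macro defined, the body below is hidden, so only the types
// explicitly instantiated in that library can be used.
// #define EIGEN_EXTERN_INSTANCIATIONS

#ifndef EIGEN_EXTERN_INSTANCIATIONS
// Pure-template mode: the body is visible, so any Scalar type can be
// instantiated implicitly by the user's code.
template <typename Scalar>
Scalar ToySolver<Scalar>::twice(Scalar x) {
  return x + x;
}
#endif
```

With the macro left undefined, ToySolver<int>::twice(21) compiles and links on its own. With the macro defined, the same call only links if some library provides an explicit instantiation of ToySolver<int>, which is the trade-off described for users with their own matrix types.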




