Re: [eigen] again msvc inlining...



OK, I played around with inlining on a "real" program (it's an implementation of learning vector quantization).

These are the results:

Before any updates:
LvqBench2 on GCC: 1.93s; 766KB
LvqBench2v on GCC: 1.69s; 770KB
LvqBench3 on GCC: 1.88s; 774KB
LvqBench3v on GCC: 1.72s; 779KB
LvqBench2 on MSC: 2.02s; 124KB
LvqBench2v on MSC: 1.24s; 129KB
LvqBench3 on MSC: 1.64s; 131KB
LvqBench3v on MSC: 0.993s; 138KB

Post-patch; EIGEN_MORE_INLINE off:
LvqBench2 on GCC: 1.93s; 766KB
LvqBench2v on GCC: 1.69s; 770KB
LvqBench3 on GCC: 1.89s; 777KB
LvqBench3v on GCC: 1.7s; 778KB
LvqBench2 on MSC: 2.02s; 124KB
LvqBench2v on MSC: 1.24s; 129KB
LvqBench3 on MSC: 1.63s; 131KB
LvqBench3v on MSC: 0.988s; 141KB
 
Post-patch; EIGEN_MORE_INLINE on:
LvqBench2 on GCC: 1.92s; 766KB
LvqBench2v on GCC: 1.69s; 770KB
LvqBench3 on GCC: 1.72s; 777KB
LvqBench3v on GCC: 1.55s; 782KB
LvqBench2 on MSC: 2.01s; 124KB
LvqBench2v on MSC: 1.25s; 129KB
LvqBench3 on MSC: 1.16s; 138KB
LvqBench3v on MSC: 0.937s; 151KB

In the 2/2v/3/3v suffix, the digit is the Eigen version (Eigen2 or Eigen3) and a trailing "v" means vectorization is on. The timings are best of 10 runs. Although the Eigen2 variants weren't changed, I left their timings in to give a feel for the variance of the timings.
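
For reference, a "best of N runs" measurement can be sketched roughly as below. This is not the harness used for the numbers above; the helper name best_of and the choice of std::chrono::steady_clock are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper (not part of LvqBench or Eigen): calls `work`
// `runs` times and returns the fastest wall-clock time in seconds.
template <typename F>
double best_of(int runs, F&& work) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        // Keep the minimum: the fastest run is the one least disturbed
        // by other activity on the machine.
        best = std::min(best,
                        std::chrono::duration<double>(stop - start).count());
    }
    return best;
}
```

Calling best_of(10, [&] { run_benchmark(); }) with some hypothetical run_benchmark() would give a single number comparable to the ones listed above.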

The update removed a few EIGEN_DONT_INLINE's and added a few inlines.  When EIGEN_MORE_INLINE is on, those extra inlines are instead strong inlines, and strong inlines also get the EIGEN_ALWAYS_INLINE_ATTRIB on gcc.
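
To make the mechanism concrete, here is a minimal sketch of how such a switch could be wired up. The DEMO_* macro names and the two functions are hypothetical and are not Eigen's actual definitions; only __forceinline (MSVC) and __attribute__((always_inline)) (GCC) are real compiler features.

```cpp
// Hypothetical macros for illustration only; Eigen's real definitions
// may differ.
#if defined(_MSC_VER)
  #define DEMO_STRONG_INLINE __forceinline
#elif defined(__GNUC__)
  #define DEMO_STRONG_INLINE inline __attribute__((always_inline))
#else
  #define DEMO_STRONG_INLINE inline
#endif

// The "extra" inlines are plain inline by default and become strong
// inlines only when DEMO_MORE_INLINE is defined.
#ifdef DEMO_MORE_INLINE
  #define DEMO_EXTRA_INLINE DEMO_STRONG_INLINE
#else
  #define DEMO_EXTRA_INLINE inline
#endif

// A cheap inner function that is always force-inlined.
DEMO_STRONG_INLINE double demo_dot3(const double* a, const double* b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// A cheap outer function that only forwards to the inner one, so the
// internal call chain is two deep. With DEMO_MORE_INLINE defined, the
// compiler is told to flatten the whole chain into the caller.
DEMO_EXTRA_INLINE double demo_squared_norm3(const double* v) {
    return demo_dot3(v, v);
}
```

Building once with and once without -DDEMO_MORE_INLINE (or /DDEMO_MORE_INLINE on MSVC) gives the two configurations to compare.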

Most interesting are the 3v timings:
GCC 1.72s; 779KB changes to 1.55s; 782KB.
MSC 0.993s; 138KB changes to 0.937s; 151KB

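which, for this application anyhow, is a clear improvement: roughly 10% faster on GCC and 6% faster on MSC, at the cost of a 3KB (GCC) and 13KB (MSC) larger executable.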

With that said - not all strong inlines are useful; I initially just added EIGEN_STRONG_INLINE everywhere, and that caused excessive compile times and larger executables.  So, on the second attempt I tried to add inlines only where functions were otherwise cheap, particularly when a call to an Eigen function was implemented with another Eigen function (i.e. where the Eigen-internal call stack was more than 1 deep), or where several versions of an algorithm differed merely by template arguments and a few versions already had EIGEN_STRONG_INLINE.

Also noteworthy is the relatively poor GCC performance; I'm not sure what's going on there.  Most of my micro-benchmarks end up with GCC in a very solid lead, but here it's slower.  I tried using gprof (which is available under Windows), but the resulting executable immediately crashed with bad_alloc.

--eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163


On Sun, Mar 14, 2010 at 13:05, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx> wrote:
On Sun, Mar 14, 2010 at 12:37 PM, Hauke Heibel
> You are not seeing the effect in your program since this is unit test
> related. If you define ei_assert as it is done in main.h and if you
> enable internal debugging, you will see the same effect on MSVC --
> guaranteed. It's a hard-coded compiler heuristic.

I just realized that you replied to removing the inline keyword in
general - forget about what I wrote here. ;)

- Hauke




