Re: [eigen] again msvc inlining...
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] again msvc inlining...
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 19 Mar 2010 10:55:02 -0400
2010/3/19 Eamon Nerbonne <eamon.nerbonne@xxxxxxxxx>:
> GCC: 4.4.3 (the 64-bit build at equation.com)
> MSC: VS.NET 2010 RC
>
> I'll gladly send a patch, once I've figured out how to generate a clean one;
> I'm still learning the hg ropes (right now it's a mess of lots of
> revisions+merges).
The command for that is hg export:
    hg export revision(s) > patch_file
>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
> On Fri, Mar 19, 2010 at 09:32, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
> wrote:
>>
>> Hi Eamon,
>>
>> these findings sound interesting. Can you show me a patch containing
>> your changes, so that I can get at least an idea of which parts affect
>> your run-times?
>>
>> I just want to clarify once more: the methods I have changed so far are
>> not low-level functions. They are only those functions that sometimes
>> return heap objects, which prevents inlining even when you explicitly
>> request forced inlining on MSVC.
>>
>> A final question - which compiler version are you using?
>>
>> - Hauke
>>
>> On Thu, Mar 18, 2010 at 3:27 PM, Eamon Nerbonne
>> <eamon.nerbonne@xxxxxxxxx> wrote:
>> > OK, I played around with inlining on a "real" program (it's an
>> > implementation of learning vector quantization).
>> >
>> > These are the results:
>> >
>> > Before any updates:
>> > LvqBench2 on GCC: 1.93s; 766KB
>> > LvqBench2v on GCC: 1.69s; 770KB
>> > LvqBench3 on GCC: 1.88s; 774KB
>> > LvqBench3v on GCC: 1.72s; 779KB
>> > LvqBench2 on MSC: 2.02s; 124KB
>> > LvqBench2v on MSC: 1.24s; 129KB
>> > LvqBench3 on MSC: 1.64s; 131KB
>> > LvqBench3v on MSC: 0.993s; 138KB
>> >
>> > Post-patch; EIGEN_MORE_INLINE off
>> > LvqBench2 on GCC: 1.93s; 766KB
>> > LvqBench2v on GCC: 1.69s; 770KB
>> > LvqBench3 on GCC: 1.89s; 777KB
>> > LvqBench3v on GCC: 1.7s; 778KB
>> > LvqBench2 on MSC: 2.02s; 124KB
>> > LvqBench2v on MSC: 1.24s; 129KB
>> > LvqBench3 on MSC: 1.63s; 131KB
>> > LvqBench3v on MSC: 0.988s; 141KB
>> >
>> > Post-patch; EIGEN_MORE_INLINE on
>> > LvqBench2 on GCC: 1.92s; 766KB
>> > LvqBench2v on GCC: 1.69s; 770KB
>> > LvqBench3 on GCC: 1.72s; 777KB
>> > LvqBench3v on GCC: 1.55s; 782KB
>> > LvqBench2 on MSC: 2.01s; 124KB
>> > LvqBench2v on MSC: 1.25s; 129KB
>> > LvqBench3 on MSC: 1.16s; 138KB
>> > LvqBench3v on MSC: 0.937s; 151KB
>> >
>> > The 2/2v/3/3v suffix corresponds to the version of eigen and whether
>> > vectorization is on. The timings are best of 10 runs. Although the eigen2
>> > variants weren't changed, I left in their timings to give a feel for the
>> > variance of the timings.
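
A minimal sketch of a "best of N runs" timing loop like the one described above, assuming a C++11 compiler. The helper name best_of is hypothetical; it is not part of Eigen or of the LvqBench program discussed in this thread.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper (not from Eigen or the benchmark in this thread):
// calls `work` `runs` times and returns the fastest wall-clock time in seconds.
template <typename Work>
double best_of(int runs, Work work) {
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double> elapsed = stop - start;
        // Keep the minimum: it is the run least disturbed by other system activity.
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

Calling best_of(10, [] { /* workload */ }) would give one number per build. Taking the minimum rather than the mean is a design choice that suppresses scheduler noise, though it hides run-to-run variance.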
>> >
>> > The update removed a few EIGEN_DONT_INLINE's and added a few inlines. When
>> > EIGEN_MORE_INLINE is on, those extra inlines are instead strong inlines, and
>> > strong inlines also get the EIGEN_ALWAYS_INLINE_ATTRIB on gcc.
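
The switch described above could look roughly like the sketch below. All SKETCH_* macro names and both functions are hypothetical and chosen for illustration; the actual Eigen macros and the patch itself may be defined differently.

```cpp
// Hypothetical macros; Eigen's real definitions may differ from this sketch.
#if defined(_MSC_VER)
  #define SKETCH_STRONG_INLINE __forceinline
#elif defined(__GNUC__)
  // GCC: pair `inline` with the always_inline attribute.
  #define SKETCH_STRONG_INLINE inline __attribute__((always_inline))
#else
  #define SKETCH_STRONG_INLINE inline
#endif

// With SKETCH_MORE_INLINE defined, the "extra" inlines become strong inlines;
// otherwise they remain ordinary `inline` hints the compiler may ignore.
#ifdef SKETCH_MORE_INLINE
  #define SKETCH_EXTRA_INLINE SKETCH_STRONG_INLINE
#else
  #define SKETCH_EXTRA_INLINE inline
#endif

// A cheap function implemented via another cheap function: a call chain
// two levels deep, which is where forcing the inline is most likely to pay off.
SKETCH_EXTRA_INLINE double square(double x) { return x * x; }
SKETCH_EXTRA_INLINE double squared_distance(double a, double b) {
    return square(a - b);
}
```

Building once with -DSKETCH_MORE_INLINE (or /DSKETCH_MORE_INLINE on MSVC) and once without gives the two configurations to compare.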
>> >
>> > Most interesting are the 3v timings:
>> > GCC 1.72s; 779KB changes to 1.55s; 782KB.
>> > MSC 0.993s; 138KB changes to 0.937s; 151KB
>> >
>> > which, for this application anyhow, is an obvious improvement.
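
To put a number on that improvement, the relative change can be computed as below. The helper name time_reduction is hypothetical.

```cpp
// Hypothetical helper: fractional reduction in run time when going from
// `before` seconds to `after` seconds (0.10 means 10% less time).
double time_reduction(double before, double after) {
    return (before - after) / before;
}
```

With the 3v figures quoted above, time_reduction(1.72, 1.55) is roughly 0.099 for GCC and time_reduction(0.993, 0.937) is roughly 0.056 for MSC, so about 10% and 6% less run time respectively.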
>> >
>> > With that said - not all strong inlines are useful; I initially just added
>> > EIGEN_STRONG_INLINE everywhere, and that causes excessive compile times and
>> > larger executables. So, on the second attempt I tried to add inlines where
>> > functions were otherwise cheap, particularly when a call to an
>> > eigen-function was implemented with another eigen function (i.e. where the
>> > eigen-internal call stack was more than 1 deep), or where several versions
>> > of an algorithm differed merely by template arguments and a few versions
>> > already had EIGEN_STRONG_INLINE.
>> >
>> > Also noteworthy is the relatively poor GCC performance; I'm not sure what's
>> > going on there. Most of my micro-benchmarks end up with GCC in a very solid
>> > lead, but here it's slower. I tried using gprof (which is available under
>> > windows), but the resultant executable immediately crashed with bad_alloc.
>> >
>> > --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>>
>
>