|Re: [eigen] Performance gap between gcc and msvc ?|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Performance gap between gcc and msvc ?
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 18 Jun 2010 15:30:52 -0400
2010/6/18 David Roundy <roundyd@xxxxxxxxxxxxxxxxxxxxxxx>:
> On Fri, Jun 18, 2010 at 12:09 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> This smells like code tuned for larger CPU caches than your Core i5 has. Indeed:
>> - you are using very large matrices, so it's crucial that blocks fit
>> in the cpu caches.
>> - Core i5 are mass market cpus with presumably not too big caches.
>> Try finding out the size of your caches (e.g. cat /proc/cpuinfo on
>> linux) and playing with Eigen's cache size settings (see recent thread
> Is this something that could be done automatically at runtime?
It's nontrivial, but it's not unthinkable that we could compute a
sensible default automatically. We have already solved the big problem
of where to store state in a template library: we store it in static
local variables inside functions. The next problem is how to find out
the cache size on each platform we aim to support. Even then, the best
we can offer is a sensible default; in many cases only the user can
really know what value is right, since the CPU cache is going to be
shared with other threads and processes.
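
To make the idea concrete, here is a rough sketch of what such a
runtime default could look like. It is only an illustration, not
Eigen's actual code: the names detect_cache_size, cache_size_state,
cache_size and set_cache_size are made up for this example, the 1 MiB
fallback is an arbitrary placeholder, and the sysconf() query relies on
a glibc extension that is only available on some platforms (hence the
preprocessor guard).

```cpp
#include <cassert>
#include <cstddef>
#if defined(__unix__) || defined(__linux__)
#include <unistd.h>
#endif

// Hypothetical helper: best-effort query of the L2 cache size in bytes.
// Falls back to an arbitrary placeholder when the platform gives no answer.
inline std::ptrdiff_t detect_cache_size()
{
  const std::ptrdiff_t fallback = 1024 * 1024; // assumption, not a tuned value
#if defined(_SC_LEVEL2_CACHE_SIZE)
  // glibc extension; may return 0 or -1 when the size is unknown.
  const long reported = sysconf(_SC_LEVEL2_CACHE_SIZE);
  if (reported > 0)
    return static_cast<std::ptrdiff_t>(reported);
#endif
  return fallback;
}

// The state lives in a static local variable of an inline function, so a
// header-only library needs no .cpp file to define it. The detection runs
// once, on the first call.
inline std::ptrdiff_t& cache_size_state()
{
  static std::ptrdiff_t value = detect_cache_size();
  return value;
}

// Read the current setting (the detected default unless overridden).
inline std::ptrdiff_t cache_size()
{
  return cache_size_state();
}

// Let the user override the default, since only they know how the cache
// is shared with other threads and processes.
inline void set_cache_size(std::ptrdiff_t bytes)
{
  assert(bytes > 0 && "cache size must be positive");
  cache_size_state() = bytes;
}
```

A user who knows that, say, only half of the cache is really available
to the computation would call set_cache_size(cache_size() / 2) once at
startup; everyone else gets the detected default. Note that the
override in this sketch is not synchronized, so it should happen before
any worker threads start.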