Re: [eigen] Performance gap between gcc and msvc ?
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Performance gap between gcc and msvc ?
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 18 Jun 2010 17:31:43 -0400
On Linux, actually, in the worst case, we can always
fopen("/proc/cpuinfo"). It will be slow, but any system call is slow, and
it's something we only do once.
Benoit
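
A minimal sketch of the /proc/cpuinfo fallback mentioned above, in C++. It
assumes the x86 Linux layout, where a line such as "cache size : 8192 KB"
is present; other architectures may not expose that field, in which case
the sketch reports -1. The names parseCacheSizeBytes and queryCacheSizeBytes
are hypothetical and not part of Eigen.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Eigen): scan cpuinfo-style text for the
// first line starting with "cache size" and return its value in bytes.
// Assumes the value is given in KB, as on x86 Linux. Returns -1 if absent.
inline long parseCacheSizeBytes(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, 10, "cache size") != 0) continue;
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream value(line.substr(colon + 1));
    long kb = 0;
    if (value >> kb) return kb * 1024;
  }
  return -1;
}

// Hypothetical wrapper: read the real file, or report -1 when it cannot be
// opened (for example on non-Linux systems).
inline long queryCacheSizeBytes() {
  std::ifstream cpuinfo("/proc/cpuinfo");
  if (!cpuinfo) return -1;
  return parseCacheSizeBytes(cpuinfo);
}
```

Splitting the parsing from the file access keeps the parser testable on any
platform; a caller would treat -1 as "unknown" and fall back to a
conservative default.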
2010/6/18 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Yes, that would be very nice. Does anyone know how to query the cache
> sizes? One solution would be to write some assembly to query the CPUID and
> then manage our own table, but I hope there exists something simpler!
>
> gael
>
> On Fri, Jun 18, 2010 at 9:42 PM, David Roundy
> <roundyd@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> It does seem like it'd be worth trying to come up with a good
>> heuristic. Among other advantages, it'd be very nice to be able to
>> create a single binary that can effectively be run on a variety of
>> CPUs. Of course, one could probably do all right by just picking the
>> smallest cache... but something picked at runtime ought to be able to
>> beat that pretty easily!
>>
>> (Speaking as someone who runs a pretty heterogeneous cluster of workstations...)
>>
>> David
>>
>> On Fri, Jun 18, 2010 at 12:30 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>> 2010/6/18 David Roundy <roundyd@xxxxxxxxxxxxxxxxxxxxxxx>:
>>>> On Fri, Jun 18, 2010 at 12:09 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>>> This smells like code tuned for larger CPU caches than your Core i5 has. Indeed:
>>>>> - you are using very large matrices, so it's crucial that blocks fit
>>>>> in the cpu caches.
>>>>> - Core i5 are mass market cpus with presumably not too big caches.
>>>>>
>>>>> Try finding out the size of your caches (e.g. cat /proc/cpuinfo on
>>>>> linux) and playing with Eigen's cache size settings (see recent thread
>>>>> here).
>>>>
>>>> Is this something that could be done automatically at runtime?
>>>
>>> It's nontrivial, but it's not unthinkable that we could get a sensible
>>> default computed automatically. We have solved the big problem of
>>> where to store state in a template library, by storing state as static
>>> local vars in functions. The next problem is how to find out the cache
>>> size on each platform we aim to support. Of course, at the very best,
>>> we could get a sensible default, but in many cases only the user can
>>> really know what value is right, since the cpu cache is going to be
>>> shared with other threads and processes.
>>>
>>> Benoit
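
The "static local vars in functions" idea quoted above can be sketched as
follows. This is only an illustration under stated assumptions: the names
manageCacheSize, cacheSizeBytes and setCacheSizeBytes, and the 256 KB
default, are hypothetical placeholders and not Eigen's actual API or values.

```cpp
#include <cstddef>

// Hypothetical names throughout; not Eigen's actual API.
enum CacheAction { GetCacheSize, SetCacheSize };

// A header-only library has no .cpp file in which to define a global.
// A static local inside an inline function is a single object shared by
// every translation unit that includes the header, so it can hold state.
inline void manageCacheSize(CacheAction action, std::ptrdiff_t* bytes) {
  static std::ptrdiff_t cacheSize = 256 * 1024;  // placeholder default
  if (action == SetCacheSize) {
    cacheSize = *bytes;
  } else {
    *bytes = cacheSize;
  }
}

// Convenience wrappers around the single stateful function.
inline std::ptrdiff_t cacheSizeBytes() {
  std::ptrdiff_t bytes = 0;
  manageCacheSize(GetCacheSize, &bytes);
  return bytes;
}

inline void setCacheSizeBytes(std::ptrdiff_t bytes) {
  manageCacheSize(SetCacheSize, &bytes);
}
```

A user who knows the cache is shared with other threads or processes could
call setCacheSizeBytes with a smaller value to override the default. The
sketch does not synchronize writes, so concurrent setters would need
external locking.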
>>>
>>>>
>>>> David
>>>>
>> --
>> David Roundy