Re: [eigen] Performance gap between gcc and msvc ?



On Linux, actually, in the worst case, we can always
fopen("/proc/cpuinfo"). It will be slow, but any system call is slow, and
it's something we do only once.
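
A minimal sketch of that fallback, not actual Eigen code. It assumes the
x86 Linux layout, where /proc/cpuinfo has a line such as
"cache size : 8192 KB" (which cache level that line reports depends on the
CPU, and other architectures may not have it at all). The function names
and the fallback parameter are hypothetical, and std::ifstream stands in
for fopen. The result is kept in a function-local static, so the file is
read on the first call only.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not Eigen API): scan cpuinfo-style text for the first
// "cache size : <N> KB" line and return the size in bytes, or -1 if absent.
inline long parse_cache_size_bytes(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {
    // Only lines that start with the "cache size" key are of interest.
    if (line.compare(0, 10, "cache size") != 0) continue;
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream field(line.substr(colon + 1));
    long value = 0;
    std::string unit;
    if (!(field >> value) || value <= 0) continue;
    field >> unit;
    if (unit == "KB") return value * 1024L;
    if (unit == "MB") return value * 1024L * 1024L;
    return value;  // assumption: no recognised unit means plain bytes
  }
  return -1;
}

// Hypothetical helper: read the real file; -1 if it cannot be opened or parsed.
inline long query_cache_size_bytes() {
  std::ifstream cpuinfo("/proc/cpuinfo");
  return cpuinfo ? parse_cache_size_bytes(cpuinfo) : -1;
}

// The function-local static is initialised on the first call only, so the
// slow file read happens once; later calls just return the stored value.
inline long cache_size_bytes(long fallback) {
  static const long size = query_cache_size_bytes();
  return size > 0 ? size : fallback;
}
```

A caller would pass whatever conservative default it already uses, e.g.
cache_size_bytes(256 * 1024), and still let the user override the value,
since the cache is shared with other threads and processes.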

Benoit

2010/6/18 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Yes, that would be very nice. Does anyone know how to query the cache
> sizes? One solution would be to write some assembly to query the CPUID
> and then manage our own table, but I hope there exists something simpler!
>
> gael
>
> On Fri, Jun 18, 2010 at 9:42 PM, David Roundy
> <roundyd@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> It does seem like it'd be worth trying to come up with a good
>> heuristic.  Among other advantages, it'd be very nice to be able to
>> create a single binary that can effectively be run on a variety of
>> CPUs.  Of course, one could probably do all right by just picking the
>> smallest cache... but something picked at runtime ought to be able to
>> beat that pretty easily!
>>
>> (Speaking as someone who runs a pretty heterogeneous cluster of workstations...)
>>
>> David
>>
>> On Fri, Jun 18, 2010 at 12:30 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>> 2010/6/18 David Roundy <roundyd@xxxxxxxxxxxxxxxxxxxxxxx>:
>>>> On Fri, Jun 18, 2010 at 12:09 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>>> This smells like code tuned for larger CPU caches than your Core i5 has. Indeed:
>>>>>  - you are using very large matrices, so it's crucial that blocks fit
>>>>> in the cpu caches.
>>>>>  - Core i5 are mass market cpus with presumably not too big caches.
>>>>>
>>>>> Try finding out the size of your caches (e.g. cat /proc/cpuinfo on
>>>>> linux) and playing with Eigen's cache size settings (see recent thread
>>>>> here).
>>>>
>>>> Is this something that could be done automatically at runtime?
>>>
>>> It's nontrivial, but it's not unthinkable that we could get a sensible
>>> default computed automatically. We have solved the big problem of
>>> where to store state in a template library, by storing state as static
>>> local vars in functions. The next problem is how to find out the cache
>>> size on each platform we aim to support. Of course, at the very best,
>>> we could get a sensible default, but in many cases only the user can
>>> really know what value is right, since the cpu cache is going to be
>>> shared with other threads and processes.
>>>
>>> Benoit
>>>
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> David Roundy
>>
>>
>>
>
>
>


