Re: [eigen] Back from google



2009/11/3 Rohit Garg <rpg.314@xxxxxxxxx>:
> On Wed, Nov 4, 2009 at 7:23 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> Hi,
>>
>> I'm just back from Google offices in Mountain View where I had lunch
>> with 4 Googlers including Keir.
>>
>> It's very likely that they will start using Eigen in a few months for a few
>> projects, so the idea is that they start right away using (a
>> pre-release of) Eigen 3.0 (more on that later, i want us to start
>> discussing roadmaps soon but i need to merge my fork first).
>
> Google using eigen3, :)
>>
>> So the very vague timeline is that at some point during Spring 2010 we
>> release usable betas of eigen 3.0 with a 99% finalized API and they
>> start playing with that.
>>
>> Now let's discuss what came out of the conversation, as to how to make
>> Eigen 3 most useful to them.
>>
>>
>>
>> 1. The biggest requirement is to guarantee reliability/precision. A
>> big part of that is already happening, e.g. that's part of what the
>> refactor_solve_api fork is about. But we need to go beyond that and
>> actually i had been thinking for a while that, in addition to our
>> current unit tests that don't aim to test precision, we need a bunch
>> of precision-oriented tests. Over the past months, as i was more and
>> more concerned with that, I've already written several small programs
>> to test that, so i have to turn them into official unit tests.
>> Moreover one of the Googlers (was it Keir?) had a great idea: a great
>> test would be to run the big BLAS/LAPACK test suites against a
>> BLAS/LAPACK lib implemented on top of Eigen. Since Gael already
>> implemented BLAS level 1 and 3 on top of Eigen (the blas/ directory),
>> we're already almost able to start doing that! Also, when we write
>> custom precision-oriented tests, we could have a peek at the datafiles
>> of the LAPACK test suite, and steal their tricky matrices.
>>
>>
>>
>> 2. They also had a requirement that a single executable should be able
>> to run fast on various hardware with various cache sizes. So here the
>> big problem is that we currently only allow the cache size to be
>> specified as a compile-time constant, namely a preprocessor #define
>> EIGEN_TUNE_FOR_CPU_CACHE_SIZE. One of the Googlers (Matthew, i think)
>> suggested that it might be as easy as defining
>> EIGEN_TUNE_FOR_CPU_CACHE_SIZE to be the name of some variable defined
>> in the application. Thinking again about it, there is one little
>> problem here: we are passing that to ei_meta_sqrt, so currently it has
>> to be a compile time constant. In the case of a runtime parameter, the
>> sqrt would have to be computed at runtime. What should we do?
>
> This is not hard, but we will have to sacrifice *some* source
> portability for it. We can use the CPUID instruction to set this
> parameter. This will take some x86 assembly, so there is an immediate
> problem of MSVC style inline assembly vs GCC inline assembly. Of
> course, PowerPC and ARM won't like it. We could also make this option
> available at compile-time, ie an EIGEN_USE_CPUID compiler flag.

This is interesting to know, but keep in mind that a given
process/thread can only use the part of the CPU cache that is not
being used by other processes/threads. For that reason, I think that
all we want to do is let the user specify how much cache he thinks
we should expect to have, i.e. put that responsibility on the user
instead of trying to determine it automatically.

Then another indirect advantage of that is that we don't sacrifice any
portability ;)
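
To make the idea concrete, here is a minimal sketch of a user-specified
runtime cache size. It is not Eigen code: the namespace sketch, the macro
SKETCH_DEFAULT_CACHE_SIZE and the functions set_cache_size() and
cache_size_sqrt() are all hypothetical names. It assumes the sqrt is computed
once, when the user sets the size, so the code that consumes it only pays
for a load.

```cpp
#include <cmath>
#include <cstddef>

namespace sketch {

// Hypothetical compile-time default in bytes (an assumption, not Eigen's value).
#ifndef SKETCH_DEFAULT_CACHE_SIZE
#define SKETCH_DEFAULT_CACHE_SIZE (1024 * 1024)
#endif

struct CacheConfig {
    std::size_t bytes;       // cache size the user expects to have available
    std::size_t sqrt_bytes;  // floor(sqrt(bytes)), recomputed only on update
};

// Integer square root: start from the floating-point estimate, then
// correct any rounding error so that r*r <= n < (r+1)*(r+1).
inline std::size_t isqrt(std::size_t n) {
    std::size_t r = static_cast<std::size_t>(std::sqrt(static_cast<double>(n)));
    while (r * r > n) --r;
    while ((r + 1) * (r + 1) <= n) ++r;
    return r;
}

// Function-local static: header-only, no separate binary library needed.
// Note: updates are not synchronized; set the size before starting threads.
inline CacheConfig& cache_config() {
    static CacheConfig cfg = { SKETCH_DEFAULT_CACHE_SIZE,
                               isqrt(SKETCH_DEFAULT_CACHE_SIZE) };
    return cfg;
}

// Called by the application, e.g. once at startup.
inline void set_cache_size(std::size_t bytes) {
    CacheConfig& cfg = cache_config();
    cfg.bytes = bytes;
    cfg.sqrt_bytes = isqrt(bytes);
}

// Called by the library wherever the sqrt of the cache size is needed.
inline std::size_t cache_size_sqrt() {
    return cache_config().sqrt_bytes;
}

}  // namespace sketch
```

An application would call sketch::set_cache_size(256 * 1024) at startup, and
later calls to sketch::cache_size_sqrt() would return the cached value. The
trade-off is that the value is no longer a compile-time constant, so it
cannot be fed to a template such as ei_meta_sqrt.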

Benoit




>>
>> * Just use sqrt() ? The questions are: are we OK to always pay for a
>> sqrt here? This seems like unneeded extra performance degradation for
>> small dyn-size matrices (of course it's negligible for large
>> matrices). In the default case of a compile-time constant, GCC >= 4.3
>> is able to compute the sqrt at compile-time, but i wonder about MSVC
>> and ICC.
>>
>> * Introduce a separate preprocessor #define to let the user specify a
>> runtime cache size variable name?
>>
>> Or going farther into the direction of binary libraries:
>>
>> * Introduce a preprocessor symbol to define some global variables,
>> e.g. one for the cpu cache size? We'd need to tell the user that if he
>> wants to use the corresponding feature, then one of his source files
>> must define that macro before #including Eigen.
>>
>> There is of course always the option of creating an optional tiny
>> binary lib, but i still don't see a compelling reason to do that...
>>
>>
>> Benoit
>>
>>
>>
>
>
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>
>
>


