Re: [eigen] Back from google


2009/11/4 Keir Mierle <mierle@xxxxxxxxx>:
> On Wed, Nov 4, 2009 at 12:19 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
> wrote:
>> 2009/11/4 Keir Mierle <mierle@xxxxxxxxx>:
>> >> * Just use sqrt() ? The questions are: are we OK to always pay for a
>> >> sqrt here? This seems like unneeded extra performance degradation for
>> >> small dyn-size matrices (of course it's negligible for large
>> >> matrices). In the default case of a compile-time constant, GCC >= 4.3
>> >> is able to compute the sqrt at compile-time, but i wonder about MSVC
>> >> and ICC.
>> >
>> > If the variable is encapsulated in a binary library, then the square root
>> > can also be stored in a variable so that it is only recomputed if the size
>> > changes; for example:
>> > void Eigen::SetCacheBlockSize(int new_size) {
>> >   ei_cache_block_size = new_size;
>> >   ei_sqrt_cache_block_size = sqrt(new_size);
>> > }
>> >>
>> >> * Introduce a separate preprocessor #define to let the user specify a
>> >> runtime cache size variable name?
>> >>
>> >> Or going farther into the direction of binary libraries:
>> >>
>> >> * Introduce a preprocessor symbol to define some global variables,
>> >> e.g. one for the cpu cache size? We'd need to tell the user that if he
>> >> wants to use the corresponding feature, then one of his source file
>> >> must define that macro before #including Eigen.
>> >>
>> >> There is of course always the option of creating an optional tiny
>> >> binary lib, but i still don't see a compelling reason to do that...
>> >
>> > I don't think a small binary library is crazy, provided it is optional and
>> > disabled by default.
>> "optional and disabled by default" works well indeed to address needs
>> of a company like Google where someone will take care of enabling that
>> feature. But it doesn't work so well with linux distros who will want
>> to stick to the default configuration, or will somehow pressure us to
>> make that the default as soon as some package requires eigen to be
>> configured with this option.
>> So i'm not ruling out a binary library, but i wanted to mention the
>> downsides.
> There are two issues with packaging Eigen for linux distributions. The first
> is packaging the headers for developers to use in development of other
> applications. The second is packaging applications such as Blender or VTK
> that depend on Eigen.
> In the first case, there are no issues because there are only headers. In my
> proposed solution (below), this does not change because there would not be a
> separate binary library.
> In the second case, at the moment, there is also no issue because Eigen has
> no binary component; apps which depend on Eigen have Eigen's code
> intermingled with their application code via templates. However, this isn't
> optimal because only one binary is shipped to all users. The predefined
> cache size won't provide the best performance for everyone.
> Unfortunately, creating a binary package for Eigen is clearly undesirable
> from multiple perspectives. It takes control over how Eigen is compiled away
> from downstream developers; it increases the packaging burden; it decreases
> flexibility; etc.
> What I suggest instead is that we provide an "EigenInternals.cpp" or similar file,
> which must be #included by the application somewhere. EigenInternals.cpp
> will include Eigen specific global variables, and functions to manipulate
> them.

ah ok, then our proposed solutions are essentially the same (my macro
solution, or what you describe). The only difference is between asking
the user to #include a file, or putting the contents of that file in a
macro for the user to put somewhere.

> That way, for the first case, nothing changes because the EigenInternals.cpp
> file is distributed alongside the include files. No binary is required. In
> the second case, since each binary application will have linked its own
> private copy of EigenInternals.cpp, they can each be compiled with
> different options.
> This is also a win for anyone releasing binaries for Windows, where casual
> recompilation is rare.
> Thoughts?

I agree with your proposal, since it differs from mine only in
aesthetic ways. Actually, I can see how your proposal is a bit nicer, in
that asking the user to #include a .cpp file makes it pretty
self-explanatory what is happening. One remark though: the user might
just as well add that .cpp file directly to his build, right? That
is actually even better, as it means that for those who need it, no
source modification is needed. This is exactly like providing
a binary library, but without the disadvantages.
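
A minimal sketch of what such a file could look like, written as a single
listing for brevity. Everything here is an assumption made for illustration:
the namespace eigen_sketch, the variable and function names, and the default
of 1024 are hypothetical and not part of Eigen. The first half stands for
declarations the headers would expose; the second half stands for the
definitions that would live in an "EigenInternals.cpp"-style file compiled
into exactly one translation unit of the application.

```cpp
#include <cassert>
#include <cmath>

// Declarations: what the headers would expose to every translation unit.
// All names in this sketch are hypothetical.
namespace eigen_sketch {
extern int cache_block_size;
extern double sqrt_cache_block_size;
void set_cache_block_size(int new_size);
}  // namespace eigen_sketch

// Definitions: what the separately compiled .cpp file would contain.
// They must appear in exactly one translation unit of the application.
namespace eigen_sketch {
int cache_block_size = 1024;          // arbitrary default for illustration
double sqrt_cache_block_size = 32.0;  // kept equal to sqrt(cache_block_size)

void set_cache_block_size(int new_size) {
  assert(new_size > 0);
  cache_block_size = new_size;
  // The square root is recomputed only when the size changes, so code
  // that reads sqrt_cache_block_size never pays for a sqrt() call.
  sqrt_cache_block_size = std::sqrt(static_cast<double>(new_size));
}
}  // namespace eigen_sketch
```

Because the definitions are compiled into each application rather than into a
shared binary, every application keeps its own copy of the variables and can
pick its own default.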

>> > This would also open the door to us determining cache
>> > sizes at startup, so the user doesn't have to.
>> Correct me if i'm wrong, as i'm really not an expert, but i thought
>> that knowing the CPU cache size was enough information only if we can
>> count on having all the CPU cache for ourselves in 1 single thread,
>> right? In general, it seemed to me that only the user could tell how
>> much cpu cache we could count on using. So, how is it useful to
>> determine the cpu cache size automatically?
> I don't know the details, but I believe the cache is yours until the next
> context switch.

If I am understanding correctly, it is yours so much that you can
freely kick everyone else out of it. Since that also applies to
concurrent threads, a concurrent thread may kick your
own stuff out of cache. The conclusion would be that when multiple
threads are running, each of them must be careful to use only a small
chunk of memory at a time: cache misses would still happen when a
thread pulls a new chunk of memory into the cache and kicks stuff out
of it, but that would only happen occasionally and not at every
context switch.

More knowledgeable people here: Is the above correct?
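
On the narrower point of determining the cache size at startup, here is a
rough sketch of how it might be queried on Linux with glibc. The function
name detect_l1d_cache_bytes and the fallback parameter are hypothetical.
_SC_LEVEL1_DCACHE_SIZE is a glibc extension to sysconf(), not part of POSIX,
so the sketch returns a caller-supplied fallback wherever that constant is
missing or the system reports nothing. It does not answer how much of the
cache a given thread can actually count on.

```cpp
#include <cassert>
#if defined(__unix__) || defined(__APPLE__)
#include <unistd.h>
#endif

// Returns the L1 data cache size in bytes as reported by the system, or
// `fallback` if the platform does not expose it. Hypothetical helper.
long detect_l1d_cache_bytes(long fallback) {
  assert(fallback > 0);
#ifdef _SC_LEVEL1_DCACHE_SIZE
  // glibc extension; a result of 0 or -1 is treated as "unknown".
  const long reported = sysconf(_SC_LEVEL1_DCACHE_SIZE);
  if (reported > 0) return reported;
#endif
  return fallback;
}
```

A program could call this once at startup and hand the result to whatever
runtime setter for the cache size ends up being provided.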


>> > It's extra maintenance, but going forward it may be required if we wish
>> > Eigen to get used in prebuilt software. For example, consider packages
>> > depending on Eigen compiled for Debian, or libmv when eventually bundled
>> > with Blender.
>> I do understand that the current solution of having the cache size
>> known at build time, is not satisfactory :)
>> > There was another item brought up by a Googler, who couldn't make lunch,
>> > which was that there aren't any benchmarks on the wiki comparing sparse
>> > performance. It would be nice if the benchmark suite included comparisons
>> > against, e.g., gmm++ and ublas. I realize most of the solvers are
>> > implemented by other backends, but things like sparse matrix
>> > multiplication are still done natively.
>> ---> that would be nice indeed! And an accessible "junior job" for
>> someone wanting to contribute! The next question is if that can be
>> done in BTL, that i don't know.
>> > Thanks for joining us for lunch!
>> Thanks a ton for the invitation!
>> Everyone: Google offices live up to their reputation, the place is a
>> sort of hacker paradise with palm trees, free food, and technical chat
>> all over the place, from what i could see.
>> Benoit
>> > Keir
>> >>
>> >>
>> >> Benoit
>> >>
>> >>
>> >
>> >
