*Subject*: Re: [eigen] Automatic cache and block size determination
*From*: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
*Date*: Tue, 22 Jun 2010 11:26:41 +0200

Well, let me give you some numbers. The time to query the L1 and L2 cache sizes at runtime is 0.5 ms. This is done only once per execution of your software (and only if you perform matrix products on dynamic-size matrices). The overhead of testing whether these queries have already been done is a single "if(something==0)". This is completely negligible compared to all the other computations that have to be carried out before doing the actual matrix product. They include the computation of the block sizes (which depend on the sizes of the matrices):

    std::ptrdiff_t l1, l2;
    ei_manage_caching_sizes(GetAction, &l1, &l2); // cost = 1 "if" (cheap)
    k = std::min<std::ptrdiff_t>(k, l1/kdiv); // kdiv is a power of 2 => 1 bit shift (cheap)
    std::ptrdiff_t _m = l2/(4 * sizeof(LhsScalar) * k); // this integer division cannot be avoided even if L1 and L2 are known at compile time
    if(_m<m) m = _m & mr_mask;

Then we have several allocations of the blocks:

    Scalar* blockA = ei_aligned_stack_new(Scalar, kc*mc);
    std::size_t sizeB = kc*Blocking::PacketSize*Blocking::nr + kc*cols;
    Scalar* allocatedBlockB = ei_aligned_stack_new(Scalar, sizeB);
    Scalar* blockB = allocatedBlockB + kc*Blocking::PacketSize*Blocking::nr;

Then the data are copied into these blocks, etc.

So really, I think this little "if" is totally negligible. What could be really useful, however, is a way to instantiate a "matrix product object" with some information on the maximal and/or typical matrix sizes we are considering, so that all of the above initialization cost can be avoided when doing many matrix products on matrices of the same sizes. For instance, this could be useful for blocked decompositions.

gael

On Tue, Jun 22, 2010 at 9:39 AM, <bernard.hugueney@xxxxxxxxxx> wrote:
> Hi,
>
> On Tue, 22 Jun 2010 00:46:22 +0200, Thomas Capricelli
>
>> I guess there is a very small runtime cost of 'checking if the cache sizes
>> have already been computed' for every product, right?
>> And also, computations involving cache sizes were previously done at
>> compile time and not anymore..? Nothing to worry about?
>
> If possible, it would be best to have a #define L2_CACHE_SIZE that would
> default to a runtime query at static initialization time but could be set
> when compiling.
>
> When set at compile time, a typed value (as in [0]) would enable
> metaprogramming unrolling; when set at runtime, doing it at static
> initialization time would avoid polluting other code with a check run
> only once.
>
> My .2€
>
> Best regards,
>
> Bernard
>
> [0] http://www.boost.org/doc/libs/1_43_0/libs/mpl/doc/refmanual/int.html
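
To make the two ideas in the reply above more concrete (the one-time cache query guarded by a single "if", and a "matrix product object" that computes its blocking once and is then reused), here is a minimal sketch. It is not Eigen's code: query_cache_sizes, cached_cache_sizes, ProductBlocking, the hard-coded cache sizes, and the rounding down to a multiple of mr (used here as a guess at what the mr_mask step does) are all assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Counts how often the (expensive) query runs; for illustration only.
int g_query_count = 0;

// Hypothetical stand-in for the real runtime query (about 0.5 ms in the
// message above). The sizes returned here are made-up example values.
void query_cache_sizes(std::ptrdiff_t& l1, std::ptrdiff_t& l2) {
    ++g_query_count;
    l1 = 32 * 1024;   // assumed L1 size in bytes
    l2 = 256 * 1024;  // assumed L2 size in bytes
}

// Lazy one-time query: after the first call, the cost is one "if".
// Note: this simple check is not thread-safe.
void cached_cache_sizes(std::ptrdiff_t& l1, std::ptrdiff_t& l2) {
    static std::ptrdiff_t s_l1 = 0, s_l2 = 0;
    if (s_l1 == 0) query_cache_sizes(s_l1, s_l2);
    l1 = s_l1;
    l2 = s_l2;
}

// Hypothetical "matrix product object": the block sizes are computed once
// in the constructor and can be reused for many products of the same size.
template <typename Scalar>
struct ProductBlocking {
    std::ptrdiff_t kc;  // block depth, bounded by the L1 size
    std::ptrdiff_t mc;  // block height, bounded by the L2 size

    // k, m: product depth and number of rows; kdiv and mr are tuning
    // parameters whose values are assumptions of this sketch.
    ProductBlocking(std::ptrdiff_t k, std::ptrdiff_t m,
                    std::ptrdiff_t kdiv, std::ptrdiff_t mr) {
        assert(k > 0 && m > 0 && kdiv > 0 && mr > 0);
        std::ptrdiff_t l1, l2;
        cached_cache_sizes(l1, l2);
        kc = std::min<std::ptrdiff_t>(k, l1 / kdiv);
        assert(kc > 0);
        const std::ptrdiff_t bytes = static_cast<std::ptrdiff_t>(4 * sizeof(Scalar));
        const std::ptrdiff_t m_limit = l2 / (bytes * kc);
        // Round down to a multiple of mr when the L2 bound is the tighter one.
        mc = (m_limit < m) ? (m_limit / mr) * mr : m;
    }
};
```

With such an object, code that performs many products on matrices of the same sizes (a blocked decomposition, say) would construct ProductBlocking<float> once and reuse kc and mc, so both the cache query and the integer division are paid a single time.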

