Re: [eigen] Malloc-free dynamic matrices

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]



Thank you for your suggestions. I have to admit I've also thought about adding a memory allocator template parameter as this sounds very powerful, but finally this is going to very complicated mainly because as said Hauke we would not be able to reuse STL's allocator design, and so we would have to come up with a new one.... sounds very complicated for everybody, so I'd say that's not worth the effort... especially since we now agreed on a very simple and practical solution.

gael


On Fri, Mar 5, 2010 at 12:10 PM, leon zadorin <leonleon77@xxxxxxxxx> wrote:
On 3/5/10, leon zadorin <leonleon77@xxxxxxxxx> wrote:
> On 3/5/10, Eamon Nerbonne <eamon.nerbonne@xxxxxxxxx> wrote:
>> I'm against adding extra fields to MatrixXf and would prefer avoiding
>> binary
>> library.
>
> If some want MatrixXf to have memory caching/reservation and don't
> mind extra flags; whilst others don't want extra flags -- what are the
> reasons for not simply having a template type/policy called
> MemoryAllocator which will be used by MatrixXf (similar to STL/::std
> types using allocators)...
>
> Any fancy memory reservation/caching is then relegated to the context
> of the allocator type/policy...
>
> ... and anyone who wants to have advanced
> block-reallocation/caching/etc. may do:
> MatrixXf<CustomSuperDuperAllocator...>
> and live with extra members/overhead that this allocator implies;
>
> whilst others may have MatrixXf<SmallBasicAllocator...> and not pay
> the price of any extra things of a custom/fancy allocator.

One, last (I promise) suggestion: if allocator policy is to be used
and people really don't want any extra memory to be used by MatrixXf
with basic allocation, then instead of

template <typename Allocator>
struct Matrix {
Allocator a;
Matrix()
{
void *x = a.alloc();
}
};

it'd be better to do
template <typename Allocator>
struct Matrix : Allocator {
Matrix
{
void *x = Allocator::alloc();
}
};

Because, even if a given basic Allocator is an empty class (i.e. no
data members whatsoever), the 1st case will use extra 1 to 8 bytes of
memory (i.e. in C++ a complete object cannot have size 0), but in
second case an EBO (empty-base-optimization) will be automatically
used by the compiler to guarantee that 0 extra bytes are associated
with an given empty (e.g. all static methods) Allocator sub-type
-object.

Anyway... it's just for the extra-pedantic -paranoid ones out there :-)

I shall be silent now :-)

Kind regards
Leon.

> The ABI, of course, may be an issue -- but STL/std code seems to use
> this approach anyway... Besides header-only libs and single-definition
> rule in C++... they kinda make me feel that, as Benoit mentioned, if
> you gonna worry about ABI -- then might as well lock-it-in in stone
> for a while (e.g. major ver. num of the lib)... just like any
> interface or a protocol (e.g. IP) -- changing structural things is
> done rarely and with great "transitional" latency :-)
>
> ... may be I have missed something, or may be this is already how you
> guys are doing it in Eigen, sorry -- it's not really that important to
> me either way...
>
> ... personally I'll be using Map<> et al quite frequently... if for no
> other reason but to also be able to allocate/position multiple
> distinct matricies (which are frequently used together in some
> formula) in a continuous block of memory (i.e. tightly-located w.r.t.
> each other -- one after another, thusly further optimizing the
> CPU-cache utilization due to the fact that cpu's cacheline size is > 1
> byte)... similar to reasons why frequently-used vars in structs are
> usually grouped together as well...
>
> anyway -- I'll shut up now. Keep up the good work and thanks for
> writing a great library!
>
> leon.
>
>> The binary library is simply a simplicity issue; it's very nice not to
>> have
>> to add an extra build step.  Also, due to limitations of make, it
>> sometimes
>> causes things to break (say, when you change calling conventions in the
>> makefile and the binary module isn't rebuilt since the source files
>> haven't
>> changed) and it's just an extra complexity that's nice to avoid.  If it's
>> only necessary to permit ABI compatibility in the face of changes to
>> eigen,
>> I'd prefer to have the choice of losing the ABI guarantee and gaining the
>> simplicity.
>>
>> The extra fields in MatrixXf will probably cause performance degradation
>> for
>> a few not-unreasonable use cases.  When using eigen, one might deal with
>> small statically size matrices - and all is well.  If you're dealing with
>> large dynamically sized matrices, the extra overhead won't matter.  In
>> between, it will:
>>
>> If you're dealing with many matrices+vectors of small (but statically
>> unknown) size, right now, the best option is plain MatrixXf.  It may seem
>> that Matrix<float,Dynamic,Dynamic,0,16,16> is a reasonable alternative
>> (and
>> avoids dynamic allocation to boot!), but that approach reserves the full
>> 16x16 matrix even for small cases; so if you're dealing with lots of
>> small
>> vectors and matrices, that's going to be huge (and impractical) memory
>> inflation.  Also, even though the matrix actually reserves 256 elements,
>> it
>> will assert an error if presented with a 3x17 matrix (I haven't checked
>> whether this actually causes a real error if the assert isn't on,
>> probably
>> not).  To use dynamic-size matrices with fixed storage right now, you'd
>> need
>> to have a healthy padding in both rows and cols.  Further, your code is
>> suddenly less reusable - what if you encounter the exceptional case that
>> exceeds the predicted limits?  If you just use MatrixXf, it works (though
>> possibly slowly) - but if you use a fixed-size storage, it will silently
>> corrupt in NDEBUG mode - not nice since it's likely to occur that
>> *expects*variable sized input.
>>
>> So, I don't think it's a safe assumption that MatrixXf will not be used
>> for
>> small matrices - it's the only real option for small, variable-sized
>> matrices, in fact (unless you have very few and memory inflation doesn't
>> matter).
>>
>>
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>>
>> On Fri, Mar 5, 2010 at 09:24, Gael Guennebaud
>> <gael.guennebaud@xxxxxxxxx>wrote:
>>
>>>
>>>
>>> On Fri, Mar 5, 2010 at 6:25 AM, Benoit Jacob
>>> <jacob.benoit.1@xxxxxxxxx>wrote:
>>>
>>>> ok i can finally reply.....
>>>>
>>>> first of all let me say that my idea of
>>>> d-pointers-without-a-shared-library fails because of the case when a
>>>> program would first create a Eigen 3.0 matrix and then load (at
>>>> runtime) a library that uses Eigen 3.1. In other words, it only works
>>>> as long as any library loading predates any matrix creation...
>>>>
>>>> 2010/3/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>>> >
>>>> > Ok, let me try to summarize. As far as I understood, the led us to
>>>> > two
>>>> > options:
>>>> >
>>>> > Option1: we add a tiny shared library implementing the
>>>> > creation/initialization of the D_structure and a method to get the
>>>> > D_structure.
>>>>
>>>> Yes; nitpick: rather, offers methods to get the individual data
>>>> members in the D_structure. Getting the d pointer itself is useless,
>>>> precisely as you don't want to have to use hardcoded offsets.
>>>>
>>>> > Then where and how it is store is a detail. This approach
>>>> > offers full flexibility in the future but requires a shared lib.
>>>>
>>>> Yes.
>>>>
>>>> >
>>>> > Option2: if we don't want a shared lib, then there is no way we can
>>>> change
>>>> > the size of the D_structure and/or the way it is initialized.
>>>>
>>>> I'm afraid this is true, although I didn't realize this earlier.
>>>>
>>>> Initially I though that we could at least add more data members, that
>>>> the problem was only that we couldn't remove/change existing members.
>>>> But I realize I was wrong. One can't even add data members. That would
>>>> already require putting them in a d-pointer.
>>>>
>>>> > So we can
>>>> > still keep some flexibility (~10% ?) by deciding for 3.0 to reserve
>>>> > some
>>>> > extra bytes for future uses.
>>>>
>>>> I realize, now, why you were hardcoding a fixed amount of reserved
>>>> space. It's indeed the only thing we can do in that direction.
>>>>
>>>> I'm just wondering what we're going to do with that, and if we're not
>>>> going to be forced to write a separate class or Options template
>>>> parameter anyway whenever we need to do anything serious....
>>>>
>>>> > Those bytes will be initialized to 0, and in
>>>> > the future their default values must still be zero.
>>>>
>>>> hm i see. in case the old version creates a matrix that is then used
>>>> by the new version.
>>>>
>>>> > In this case, there is
>>>> > indeed no need to store them on the heap, and we can store them on
>>>> > the
>>>> stack
>>>> > as members of Matrix.
>>>>
>>>> yes.
>>>>
>>>> My conclusion:
>>>>  - adding a shared library would be giving up a very large advantage
>>>> that we have. Being purely headers makes it far easier for people to
>>>> use Eigen.
>>>>
>>>
>>> Adding -leigen is not far more complicated. It's only more complicated
>>> for
>>> people following the devel branch because they have to do a "hg pull -u
>>> &&
>>> cd build && make && make install && cd -". But ok, as long as we can
>>> safely
>>> workaround without relying on complex mechanisms, then it's definitely
>>> better not to add a shared lib.
>>>
>>> Second degree: if we add a shared lib no more how do I compile Eigen ? I
>>> run make but nothing happened ? ;)
>>>
>>>  - once allocatedSize member is added, i don't see any more potential
>>>> reason on the horizon for changing the Matrix ABI, hence no potential
>>>> use case for a d-pointer.
>>>>  - if such a use case happens it's always possible to do that with a
>>>> new template parameter / class.
>>>>  - any fixed amount of reserved space can still fail to be enough the
>>>> day we need it, while having a constant cost. Hence, I'm not
>>>> convinced.
>>>>
>>>> So my opinion: Option2 without any reserved space, just set in stone
>>>> the matrix ABI after you've implemented this change.
>>>>
>>>
>>> I'd still add one integer flag. We can always add new flags while
>>> ensuring
>>> that 0 is the default, and I already have a few ideas for such runtime
>>> flags:
>>> - block heap reallocation
>>> - mark the object as temporary. E.g., when a user return a MatrixXf by
>>> value => Matrix::operator= can use a cheap swap instead of a full copy:
>>>
>>> MatrixXf foo() { MatrixXf ret; /*...*/ ret.markTemporary(); return ret;}
>>>
>>> MatrixXf y;
>>> /* ... */
>>> y = foo();
>>>
>>> - etc.
>>>
>>> gael.
>>>
>>>
>>>>
>>>> Benoit
>>>>
>>>> >
>>>> > gael.
>>>> >
>>>> >
>>>> > On Thu, Mar 4, 2010 at 10:43 PM, Benoit Jacob
>>>> > <jacob..benoit.1@xxxxxxxxx
>>>> >
>>>> > wrote:
>>>> >>
>>>> >> 2010/3/4 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>>> >> >> >> > alloc:
>>>> >> >> >> > m_data = ei_aligned_new(size+<16 bytes>) + <16 bytes>;
>>>> >> >> >> > allocatedSize() = size;
>>>> >> >> >> >
>>>> >> >> >> > dealloc:
>>>> >> >> >> > ei_aligned_delete(m_data-<16bytes>);
>>>> >> >> >> >
>>>> >> >> >> > int& allocatedSize() {return (m_data-<16bytes>);}
>>>> >> >> >> >
>>>> >> >> >> > Disclaimer: yes the above is not C++, it is just to picture
>>>> >> >> >> > the
>>>> >> >> >> > idea!
>>>> >> >> >>
>>>> >> >> >> This looks like going only halfway toward heap-stored data.
>>>> Instead,
>>>> >> >> >> why not take the bolder move of adding a d-pointer? We would
>>>> >> >> >> put
>>>> >> >> >> there
>>>> >> >> >> any additional data that is OK to access with non-inline
>>>> functions.
>>>> >> >> >> So
>>>> >> >> >> we would keep directly as data members the array pointer
>>>> >> >> >> m_data
>>>> and
>>>> >> >> >> the dimensions m_rows and m_cols so we can still call
>>>> >> >> >> rows()/cols()/data() at zero cost (useful as they are used all
>>>> the
>>>> >> >> >> time) but other less frequently used data could be deferred
>>>> >> >> >> onto
>>>> the
>>>> >> >> >> d-pointer and accessed through no-inline accessors.
>>>> >> >> >
>>>> >> >> > To be honest I don't see how adding a d-pointer can offer more
>>>> >> >> > flexibility.
>>>> >> >>
>>>> >> >> With your approach, any additional member data that we may want
>>>> >> >> to
>>>> add
>>>> >> >> in the future, has to fit in the fixed number of bytes that were
>>>> >> >> reserved, like 8 bytes or 16 bytes. We have to decide once and
>>>> >> >> for
>>>> all
>>>> >> >> how much space we reserve for additional members. Moreover, once
>>>> we've
>>>> >> >> added a member, we have to keep its offset fixed forever. All of
>>>> that
>>>> >> >> can theoretically be overcome by using a d-pointer.
>>>> >> >
>>>> >> > hm... maybe I've been  clear but as "my proposal" I was referring
>>>> >> > to
>>>> the
>>>> >> > solution of storing the D_structure in the dynamically allocated
>>>> memory,
>>>> >> > i.e., with the data.
>>>> >>
>>>> >> Well let's look at your pseudo code:
>>>> >>
>>>> >>  m_data = ei_aligned_new(size+<16 bytes>) + <16 bytes>;
>>>> >>  allocatedSize() = size;
>>>> >>
>>>> >>  dealloc:
>>>> >>  ei_aligned_delete(m_data-<16bytes>);
>>>> >>
>>>> >> If I understand correctly, you're reserving a fixed amount of memory
>>>> >> (here 16 bytes) for the D_structure just before the location pointed
>>>> >> to by m_data. So yes it's on the heap, that's what I understood, but
>>>> >> you still hardcode the number of bytes that your D_structure may
>>>> >> have.
>>>> >>
>>>> >> >> > My proposal affords the same with less memory and runtime
>>>> overhead: a
>>>> >> >> > true
>>>> >> >> > d-pointer would requires in addition one pointer, one call to
>>>> malloc,
>>>> >> >>
>>>> >> >> Yep, I thought about that just after sending the e-mail. The
>>>> solution
>>>> >> >> might be to merge this idea with your idea: allocate at once the
>>>> >> >> matrix array and the D_structure. But in order to allow the
>>>> >> >> D_structure to grow in the future, place it after the array, not
>>>> >> >> before, and access it only with non-inline accessors .... now
>>>> >> >> here's
>>>> >> >> the catch... that must be compiled into a shared library :( I
>>>> >> >> didn't
>>>> >> >> think about that in my previous e-mail, but the d-pointer
>>>> >> >> approach
>>>> can
>>>> >> >> only work if we have a binary shared library :( Though at that
>>>> point,
>>>> >> >> having such a tiny library would solve a bunch of problems at
>>>> >> >> once
>>>> >> >> (cache size parameters, etc). I don't know what to think about
>>>> >> >> that.
>>>> >> >
>>>> >> > You cannot easily put it at the end because ideally we would store
>>>> the
>>>> >> > allocatedSize variable in the D_structure, and if you put it at
>>>> >> > the
>>>> end
>>>> >> > of
>>>> >> > the data, you need the allocatedSize to access to the
>>>> >> > D_structure...
>>>> >>
>>>> >> Well yes, in my proposal of putting the D_structure at the end, we
>>>> >> have to add a new data member to Matrix, which can be either the
>>>> >> offset or why not directly the pointer to the D_structure. But I
>>>> >> don't
>>>> >> think that it should be the allocatedSize that we should store, and
>>>> >> actually it still wouldn't be too convenient to address the
>>>> >> D_structure (need to take padding into account...)
>>>> >>
>>>> >> Then, from the moment we're storing 2 pointers, m_data and m_d, it
>>>> >> doesn't matter anymore which one is at the beginning and which one
>>>> >> is
>>>> >> at the end of the buffer.
>>>> >>
>>>> >> Is that a big deal to add one more data member to MatrixXf...?
>>>> >>
>>>> >> Though in that vein, one might go further and ask why we're
>>>> >> preferring
>>>> >> to put stuff on the heap at all, why not just add plain data members
>>>> >> to MatrixXf...? I'm not sure why sizeof(MatrixXf) matters more than
>>>> >> the size of the allocated buffer.
>>>> >>
>>>> >> > Since
>>>> >> > this whole approach  can only work via a shared library,
>>>> >>
>>>> >> ...if we want a real d-pointer. Without a shared lib, we can still
>>>> >> have a D_structure, it's just that the application using Eigen
>>>> >> hardcodes the D_structure data layout at compile time, so we don't
>>>> >> get
>>>> >> the flexibility of a d-pointer.
>>>> >>
>>>> >> >> I'm completely hesitating, I can't make a decision on that. I
>>>> >> >> guess
>>>> >> >> that if we treat this issue simultaneously with other issues that
>>>> >> >> would benefit from a binary lib, such as cache size runtime
>>>> >> >> parameters, then the case for a binary lib gets quite strong. On
>>>> >> >> the
>>>> >> >> other hand it will require good communication and documentation,
>>>> >> >> it
>>>> >> >> would be great to keep it optional (maybe make its code
>>>> >> >> optionally
>>>> >> >> available as a header file...), and it should be WTFPL-licensed.
>>>> >> >
>>>> >> > Same here, though I become more and more in favor to a shared
>>>> >> > library
>>>> as
>>>> >> > it
>>>> >> > might solve many issues.
>>>> >>
>>>> >> i don't know... above we're discussing a very good solution without
>>>> >> a
>>>> >> binary lib, and below you have a great idea for the cache size
>>>> >> problem
>>>> >> too:
>>>> >>
>>>> >> >
>>>> >> > Regarding runtime settings without a shared lib, I was thinking
>>>> >> > about
>>>> >> > using
>>>> >> > a static variable inside a function:
>>>> >> >
>>>> >> > // internal
>>>> >> > int manage_cache_size(enum action,int v=0)
>>>> >> > {
>>>> >> >  static int value = EIGEN_DEFAULT_CACHE_SIZE;
>>>> >> >   if(action="" value = v;
>>>> >> >   if(action="" return value;
>>>> >> >   return value;
>>>> >> >
>>>> >> > }
>>>> >> >
>>>> >> > // public:
>>>> >> > int cacheSize() { return manage_cache_size(get); }
>>>> >> > void setCacheSize(int v) { manage_cache_size(set,v); }
>>>> >> >
>>>> >> > but I'm really unsure about that...
>>>> >>
>>>> >> wow, that looks like a great idea!
>>>> >>
>>>> >> Such a static variable in a function, works exactly like a global
>>>> >> variable from a library as far as we're concerned... as far as I can
>>>> >> see.
>>>> >>
>>>> >> Benoit
>>>> >>
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>
>>
>





Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/