Re: [eigen] Propagating Max*AtCompileTime template parameters to decompositions

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]




On Fri, Mar 5, 2010 at 3:36 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2010/3/5 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>
>
> 2010/3/5 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>
>>
>>
>> On Thu, Mar 4, 2010 at 8:00 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> wrote:
>>>
>>> One more thought.
>>>
>>> The "fixed sizes are guaranteed to not cause malloc's" principle only
>>> applies for not-too-large sizes. Let me explain. For large enough
>>> sizes (typically, above 200x200, but that is highly variable), the
>>> matrix-matrix product implementation works on blocks that fit in the
>>> CPU cache. These blocks are allocated with ei_aligned_stack_alloc
>>> which on non-Linux translates to a malloc (or friends).
>>>
>>> So in such unit tests, you should set the MaxSizes to some values that
>>> are significantly smaller than at least the default cache size
>>> (256x256 for float, but that could vary in the future). Just to be
>>> safe and since that only makes the test run faster, stay within very
>>> small sizes like 16x16...
>>
>> My use case requires matrices smaller than 50x50, usually half that size,
>> so we're good here :). For the unit test I'll make them 16x16 floats, no
>> problem.
>
> FYI, the optimized matrix product which requires dynamic allocation starts
> to be used for 8x8 matrices (even 4x4 on AltiVec).

What?! How then can we guarantee our users that at least for
not-too-big fixed size matrices, no malloc occurs?

Is this really a performance improvement even for such small sizes?

yes it is. for not too big matrices we have alloca. yes I know currently only for linux... If that's really a pb, there are many solutions:

1- we can still allocate on the stack the required blocks *outside* the matrix product function and pass them to the matrix product function. That must be done outside because the underlying heavy matrix product function should be instanciated only once per scalar type.

2- another solution: in the matrix product function we declare a (only when alloca is disabled):

Scalar stackmemory[16*16*2];  // no heap alloc for matrices up to 16x16
if the required memory for the blocks is bigger than 16*16*2 then do heap allocation...

3- the user can still define EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD to a big value to disable the use of the efficient product for matrices of medium sizes with a lost of performance...


btw, I have a bench program which instanciates and compares the coeff-based product to the optimized one for all triplet of fixed size matrices, for sizes between 1 and 9 => 2*9^3 product instanciations !!! The goal is to find the ideal threshold for each platform. hm... I should also add the transposed versions => x4 more instanciations, and versions for double... ouch!  that will take a day to compile :)

gael


 

Benoit

>
> So in your case, if you are running linux then alloca is used and you are
> fine. Otherwise, heap allocation occurs. If that's really an issue we can
> think about solutions...
>
> gael
>
>>
>> Adolfo
>>
>>>
>>> Benoit
>>>
>>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>
>>> >>
>>> >> On Thu, Mar 4, 2010 at 6:00 PM, Benoit Jacob
>>> >> <jacob..benoit.1@xxxxxxxxx>
>>> >> wrote:
>>> >>>
>>> >>> Supreme refinement is to add a unit test for that.. We already have
>>> >>> the
>>> >>> skeleton in the nomalloc test (test/nomalloc.cpp).. Indeed, when
>>> >>> EIGEN_NO_MALLOC is defined, any heap allocation will cause an assert
>>> >>> failure, which in the context of unit tests we convert into an
>>> >>> exception.
>>> >>>
>>> >>> So don't hesitate to expand the nomalloc.cpp test, just with new
>>> >>> executable targets (CALL_SUBTEST_x for a new value of x up to 16) so
>>> >>> that no single executable gets too big. Testing only 'float' is good
>>> >>> enough.
>>> >>
>>> >>
>>> >> I was just thinking of that. I'll try to squeeze some time for that,
>>> >> too ;).
>>> >> Thanks for pointing me to the skeleton, that will speed up things.
>>> >> Should
>>> >> EIGEN_STACK_ALLOCATION_LIMIT 0 be defined as well, or am I good to go
>>> >> with
>>> >> only EIGEN_NO_MALLOC?.
>>> >
>>> > #define EIGEN_STACK_ALLOCATION_LIMIT 0 is really important. Indeed, on
>>> > Linux, this is really a stack allocation hence bypasses malloc
>>> > completely. So if you don't define this to 0, you could have a unit
>>> > test that passes on Linux but fails (because a heap allocation
>>> > happens) on other platforms.
>>> >
>>> > Benoit
>>> >
>>> >> The tests I've been playing with were triggering
>>> >> assertions made by EIGEN_NO_MALLOC only.
>>> >>
>>> >>>
>>> >>> Benoit
>>> >>>
>>> >>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> >>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>> >>
>>> >>> >>
>>> >>> >> On Thu, Mar 4, 2010 at 5:25 PM, Benoit Jacob
>>> >>> >> <jacob.benoit.1@xxxxxxxxx>
>>> >>> >> wrote:
>>> >>> >>>
>>> >>> >>> Hi Adolfo,
>>> >>> >>>
>>> >>> >>> Many thanks for this patch. It's been sitting in the back of my
>>> >>> >>> mind
>>> >>> >>> nagging me for a while :) These fixes are definitely wanted.
>>> >>> >>>
>>> >>> >>> Just 2 things:
>>> >>> >>>
>>> >>> >>> 1. All the MatrixType:: is a bit heavy so could you please
>>> >>> >>> introduce
>>> >>> >>> enums like is being done in ColPivHouseholderQR.h, in other
>>> >>> >>> decompositions? So MatrixType::MaxBlabla becomes just MaxBlaBla.
>>> >>> >>
>>> >>> >> OK, no problem. The enum approach is definitely much cleaner. I
>>> >>> >> just
>>> >>> >> didn't
>>> >>> >> want to impose it if it was not used in some files :)
>>> >>> >>
>>> >>> >> One more thing, now that I'm on it, are there any other files that
>>> >>> >> I
>>> >>> >> might
>>> >>> >> have overlooked that could benefit from similar changes?
>>> >>> >
>>> >>> > Any decomposition is affected in the same way. Have you had a look
>>> >>> > at
>>> >>> > the Eigenvalues directory?
>>> >>> >
>>> >>> >
>>> >>> >>
>>> >>> >>>
>>> >>> >>> 2. You generated this patch with "diff" so it doesn't credit you,
>>> >>> >>> and
>>> >>> >>> you haven't written your commit msg. Could you please follow the
>>> >>> >>> instructions given there:
>>> >>> >>>
>>> >>> >>>
>>> >>> >>>
>>> >>> >>> http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Generating_a_patch
>>> >>> >>>
>>> >>> >>
>>> >>> >> Ack.
>>> >>> >>
>>> >>> >> Adolfo
>>> >>> >>
>>> >>> >>>
>>> >>> >>> Yes, this mailing list is the right place for such patches... the
>>> >>> >>> issue tracker is useful to report bugs without being immediately
>>> >>> >>> able
>>> >>> >>> to fix them.
>>> >>> >>>
>>> >>> >>> Benoit
>>> >>> >>>
>>> >>> >>>
>>> >>> >>> 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>> >>> > Hi all,
>>> >>> >>> >
>>> >>> >>> > I've been performing some tests with dynamic matrices that have
>>> >>> >>> > a
>>> >>> >>> > compile-time max size, and found out that for most matrix
>>> >>> >>> > decompositions,
>>> >>> >>> > the MaxRowsAtCompileTime and MaxColsAtCompileTime template
>>> >>> >>> > parameters of
>>> >>> >>> > the
>>> >>> >>> > input matrix are not propagated to the decomposition's members
>>> >>> >>> > and
>>> >>> >>> > temporaries. This leads to undesired and preventable heap
>>> >>> >>> > allocations.
>>> >>> >>> > The
>>> >>> >>> > following is a minimal example of what I mean:
>>> >>> >>> >
>>> >>> >>> > typedef Matrix<double, dynamic, dynamic, 0, 10, 10> Mat;
>>> >>> >>> > Mat A(Mat::Random(5, 5));
>>> >>> >>> >
>>> >>> >>> > FullPivLU<Mat> ldlt(A); // Allocates 4 temps
>>> >>> >>> > LDLT<Mat> ldlt(A);      // Allocates 3 temps
>>> >>> >>> >
>>> >>> >>> > I'm attaching a patch against the devel branch (commit
>>> >>> >>> > 2589:8a8f4e9e04bf)
>>> >>> >>> > that deals with this issue for the Cholesky, LU, QR and SVD
>>> >>> >>> > modules.
>>> >>> >>> > Since
>>> >>> >>> > most of these modules are experimental and undergoing heavy
>>> >>> >>> > work,
>>> >>> >>> > the
>>> >>> >>> > style
>>> >>> >>> > of the modifications changes form file to file (some use
>>> >>> >>> > convenience
>>> >>> >>> > enums,
>>> >>> >>> > some not). However, I tried to stay consistent within the
>>> >>> >>> > context of
>>> >>> >>> > each
>>> >>> >>> > file. Please let me know your thoughts on the matter.
>>> >>> >>>
>>> >>> >>> Yes so actually, please introduce enums everywhere :)
>>> >>> >>>
>>> >>> >>> > One very specific
>>> >>> >>> > note: in line 65 of HouseHolderQR.h I left the Options template
>>> >>> >>> > parameter
>>> >>> >>> > as-is, but don't know if it should be changed to that of
>>> >>> >>> > MatrixType.
>>> >>> >>>
>>> >>> >>> Great catch! Please change that to Options (as defined on line
>>> >>> >>> 57).
>>> >>> >>>
>>> >>> >>> Thanks
>>> >>> >>> Benoit
>>> >>> >>>
>>> >>> >>> >
>>> >>> >>> > Cheers,
>>> >>> >>> >
>>> >>> >>> > Adolfo.
>>> >>> >>> >
>>> >>> >>> > P.S. Is there a better way to submit small patches, something
>>> >>> >>> > like a
>>> >>> >>> > bugtracker, or is this the currently recommended way?
>>> >>> >>> >
>>> >>> >>> > --
>>> >>> >>> > Adolfo Rodríguez Tsouroukdissian, Ph. D.
>>> >>> >>> >
>>> >>> >>> > Robotics engineer
>>> >>> >>> > PAL ROBOTICS S.L
>>> >>> >>> > http://www.pal-robotics.com
>>> >>> >>> > Tel. +34.93.414.53.47
>>> >>> >>> > Fax.+34.93.209.11.09
>>> >>> >>> > AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos
>>> >>> >>> > adjuntos,
>>> >>> >>> > pueden
>>> >>> >>> > contener información privilegiada y/o confidencial que está
>>> >>> >>> > dirigida
>>> >>> >>> > exclusivamente a su destinatario. Si usted recibe este mensaje
>>> >>> >>> > y no
>>> >>> >>> > es
>>> >>> >>> > el
>>> >>> >>> > destinatario indicado, o el empleado encargado de su entrega a
>>> >>> >>> > dicha
>>> >>> >>> > persona, por favor, notifíquelo inmediatamente y remita el
>>> >>> >>> > mensaje
>>> >>> >>> > original
>>> >>> >>> > a la dirección de correo electrónico indicada. Cualquier copia,
>>> >>> >>> > uso
>>> >>> >>> > o
>>> >>> >>> > distribución no autorizados de esta comunicación queda
>>> >>> >>> > estrictamente
>>> >>> >>> > prohibida.
>>> >>> >>> >
>>> >>> >>> > CONFIDENTIALITY NOTICE: This e-mail and the accompanying
>>> >>> >>> > document(s)
>>> >>> >>> > may
>>> >>> >>> > contain confidential information which is privileged and
>>> >>> >>> > intended
>>> >>> >>> > only
>>> >>> >>> > for
>>> >>> >>> > the individual or entity to whom they are addressed.  If you
>>> >>> >>> > are not
>>> >>> >>> > the
>>> >>> >>> > intended recipient, you are hereby notified that any
>>> >>> >>> > disclosure,
>>> >>> >>> > copying,
>>> >>> >>> > distribution or use of this e-mail and/or accompanying
>>> >>> >>> > document(s)
>>> >>> >>> > is
>>> >>> >>> > strictly prohibited.  If you have received this e-mail in
>>> >>> >>> > error,
>>> >>> >>> > please
>>> >>> >>> > immediately notify the sender at the above e-mail address.
>>> >>> >>> >
>>> >>> >>>
>>> >>> >>>
>>> >>> >>
>>> >>> >>
>>> >>> >
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Adolfo Rodríguez Tsouroukdissian, Ph. D.
>>> >>
>>> >> Robotics engineer
>>> >> PAL ROBOTICS S.L
>>> >> http://www.pal-robotics.com
>>> >> Tel. +34.93.414.53.47
>>> >> Fax.+34.93.209.11.09
>>> >> AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos adjuntos,
>>> >> pueden
>>> >> contener información privilegiada y/o confidencial que está dirigida
>>> >> exclusivamente a su destinatario. Si usted recibe este mensaje y no es
>>> >> el
>>> >> destinatario indicado, o el empleado encargado de su entrega a dicha
>>> >> persona, por favor, notifíquelo inmediatamente y remita el mensaje
>>> >> original
>>> >> a la dirección de correo electrónico indicada. Cualquier copia, uso o
>>> >> distribución no autorizados de esta comunicación queda estrictamente
>>> >> prohibida.
>>> >>
>>> >> CONFIDENTIALITY NOTICE: This e-mail and the accompanying document(s)
>>> >> may
>>> >> contain confidential information which is privileged and intended only
>>> >> for
>>> >> the individual or entity to whom they are addressed.  If you are not
>>> >> the
>>> >> intended recipient, you are hereby notified that any disclosure,
>>> >> copying,
>>> >> distribution or use of this e-mail and/or accompanying document(s) is
>>> >> strictly prohibited.  If you have received this e-mail in error,
>>> >> please
>>> >> immediately notify the sender at the above e-mail address.
>>> >>
>>> >
>>>
>>>
>
>





Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/