Re: [eigen] Propagating Max*AtCompileTime template parameters to decompositions |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Propagating Max*AtCompileTime template parameters to decompositions
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 5 Mar 2010 10:57:22 -0500
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=ykrpspCRlrgu9hZ9yI6L44mtBUQlPYYVdOLuoLMi3c0=; b=lkNHZHfOgkYHBr/a60HvGyWzKvKL/ZGBVa2pud60AiYPdGwDQRzpvoCRtMmvBfFsLD qwK1wt/Pv+FMucefjNT3JP93bI3zjgW82ljwmHCX5yA5g5kYfDXJRcRH9TsW4hKM0kp/ dp+w+q6qxykVmgYKX4LygSo2zWTkUE1E+yJ2o=
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=JBBEfm7a7WV2XRWKE6b8Kszfq1rzglO5qqIiSTnL2akHqnnpyo4XJkm6lMFGasPl7f 7OWHOwWW0iAp2TCi9BgLABLsUnoQd9ypP1e/bKkdk+5Qomd7DiKX7Xs8z9Vo1pbH0D02 unn81+YSdaU0TqbAkF8Klaq9UhAe4D8V5Q6EU=
oops, that was exactly your solution 1) if i understand well :)
yes, i totally agree :) it's the only solution to avoid hardcoding a
magic limit.
Benoit
2010/3/5 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> Gael: on second thought: couldn't you just let the cache_friendly
> functions take a 'workspace' parameter so they don't have to allocate
> it themselves? The workspace would be created by the caller who knows
> the sizes-at-compile-time hence is able to create a static array if
> the matrices have fixed max-size...
>
> Benoit
>
> 2010/3/5 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>
>>
>> On Fri, Mar 5, 2010 at 3:36 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> wrote:
>>>
>>> 2010/3/5 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>>> >
>>> >
>>> > 2010/3/5 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>
>>> >>
>>> >>
>>> >> On Thu, Mar 4, 2010 at 8:00 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>>> >> wrote:
>>> >>>
>>> >>> One more thought.
>>> >>>
>>> >>> The "fixed sizes are guaranteed to not cause malloc's" principle only
>>> >>> applies for not-too-large sizes. Let me explain. For large enough
>>> >>> sizes (typically, above 200x200, but that is highly variable), the
>>> >>> matrix-matrix product implementation works on blocks that fit in the
>>> >>> CPU cache. These blocks are allocated with ei_aligned_stack_alloc
>>> >>> which on non-Linux translates to a malloc (or friends).
>>> >>>
>>> >>> So in such unit tests, you should set the MaxSizes to some values that
>>> >>> are significantly smaller than at least the default cache size
>>> >>> (256x256 for float, but that could vary in the future). Just to be
>>> >>> safe and since that only makes the test run faster, stay within very
>>> >>> small sizes like 16x16...
>>> >>
>>> >> My use case requires matrices smaller than 50x50, usually half that
>>> >> size,
>>> >> so we're good here :). For the unit test I'll make them 16x16 floats,
>>> >> no
>>> >> problem.
>>> >
>>> > FYI, the optimized matrix product which requires dynamic allocation
>>> > starts
>>> > to be used for 8x8 matrices (even 4x4 on AltiVec).
>>>
>>> What?! How then can we guarantee our users that at least for
>>> not-too-big fixed size matrices, no malloc occurs?
>>>
>>> Is this really a performance improvement even for such small sizes?
>>
>> yes it is. for not too big matrices we have alloca. yes I know currently
>> only for linux... If that's really a pb, there are many solutions:
>>
>> 1- we can still allocate on the stack the required blocks *outside* the
>> matrix product function and pass them to the matrix product function. That
>> must be done outside because the underlying heavy matrix product function
>> should be instanciated only once per scalar type.
>>
>> 2- another solution: in the matrix product function we declare a (only when
>> alloca is disabled):
>>
>> Scalar stackmemory[16*16*2]; // no heap alloc for matrices up to 16x16
>> if the required memory for the blocks is bigger than 16*16*2 then do heap
>> allocation...
>>
>> 3- the user can still define EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD to a big
>> value to disable the use of the efficient product for matrices of medium
>> sizes with a lost of performance...
>>
>>
>> btw, I have a bench program which instanciates and compares the coeff-based
>> product to the optimized one for all triplet of fixed size matrices, for
>> sizes between 1 and 9 => 2*9^3 product instanciations !!! The goal is to
>> find the ideal threshold for each platform. hm... I should also add the
>> transposed versions => x4 more instanciations, and versions for double....
>> ouch! that will take a day to compile :)
>>
>> gael
>>
>>
>>
>>>
>>> Benoit
>>>
>>> >
>>> > So in your case, if you are running linux then alloca is used and you
>>> > are
>>> > fine. Otherwise, heap allocation occurs. If that's really an issue we
>>> > can
>>> > think about solutions...
>>> >
>>> > gael
>>> >
>>> >>
>>> >> Adolfo
>>> >>
>>> >>>
>>> >>> Benoit
>>> >>>
>>> >>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> >>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>> >>
>>> >>> >>
>>> >>> >> On Thu, Mar 4, 2010 at 6:00 PM, Benoit Jacob
>>> >>> >> <jacob.benoit.1@xxxxxxxxx>
>>> >>> >> wrote:
>>> >>> >>>
>>> >>> >>> Supreme refinement is to add a unit test for that. We already have
>>> >>> >>> the
>>> >>> >>> skeleton in the nomalloc test (test/nomalloc.cpp). Indeed, when
>>> >>> >>> EIGEN_NO_MALLOC is defined, any heap allocation will cause an
>>> >>> >>> assert
>>> >>> >>> failure, which in the context of unit tests we convert into an
>>> >>> >>> exception.
>>> >>> >>>
>>> >>> >>> So don't hesitate to expand the nomalloc.cpp test, just with new
>>> >>> >>> executable targets (CALL_SUBTEST_x for a new value of x up to 16)
>>> >>> >>> so
>>> >>> >>> that no single executable gets too big. Testing only 'float' is
>>> >>> >>> good
>>> >>> >>> enough.
>>> >>> >>
>>> >>> >>
>>> >>> >> I was just thinking of that. I'll try to squeeze some time for
>>> >>> >> that,
>>> >>> >> too ;).
>>> >>> >> Thanks for pointing me to the skeleton, that will speed up things.
>>> >>> >> Should
>>> >>> >> EIGEN_STACK_ALLOCATION_LIMIT 0 be defined as well, or am I good to
>>> >>> >> go
>>> >>> >> with
>>> >>> >> only EIGEN_NO_MALLOC?.
>>> >>> >
>>> >>> > #define EIGEN_STACK_ALLOCATION_LIMIT 0 is really important. Indeed,
>>> >>> > on
>>> >>> > Linux, this is really a stack allocation hence bypasses malloc
>>> >>> > completely. So if you don't define this to 0, you could have a unit
>>> >>> > test that passes on Linux but fails (because a heap allocation
>>> >>> > happens) on other platforms.
>>> >>> >
>>> >>> > Benoit
>>> >>> >
>>> >>> >> The tests I've been playing with were triggering
>>> >>> >> assertions made by EIGEN_NO_MALLOC only.
>>> >>> >>
>>> >>> >>>
>>> >>> >>> Benoit
>>> >>> >>>
>>> >>> >>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> >>> >>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>> >>> >>
>>> >>> >>> >>
>>> >>> >>> >> On Thu, Mar 4, 2010 at 5:25 PM, Benoit Jacob
>>> >>> >>> >> <jacob.benoit.1@xxxxxxxxx>
>>> >>> >>> >> wrote:
>>> >>> >>> >>>
>>> >>> >>> >>> Hi Adolfo,
>>> >>> >>> >>>
>>> >>> >>> >>> Many thanks for this patch. It's been sitting in the back of
>>> >>> >>> >>> my
>>> >>> >>> >>> mind
>>> >>> >>> >>> nagging me for a while :) These fixes are definitely wanted..
>>> >>> >>> >>>
>>> >>> >>> >>> Just 2 things:
>>> >>> >>> >>>
>>> >>> >>> >>> 1. All the MatrixType:: is a bit heavy so could you please
>>> >>> >>> >>> introduce
>>> >>> >>> >>> enums like is being done in ColPivHouseholderQR.h, in other
>>> >>> >>> >>> decompositions? So MatrixType::MaxBlabla becomes just
>>> >>> >>> >>> MaxBlaBla.
>>> >>> >>> >>
>>> >>> >>> >> OK, no problem. The enum approach is definitely much cleaner.. I
>>> >>> >>> >> just
>>> >>> >>> >> didn't
>>> >>> >>> >> want to impose it if it was not used in some files :)
>>> >>> >>> >>
>>> >>> >>> >> One more thing, now that I'm on it, are there any other files
>>> >>> >>> >> that
>>> >>> >>> >> I
>>> >>> >>> >> might
>>> >>> >>> >> have overlooked that could benefit from similar changes?
>>> >>> >>> >
>>> >>> >>> > Any decomposition is affected in the same way. Have you had a
>>> >>> >>> > look
>>> >>> >>> > at
>>> >>> >>> > the Eigenvalues directory?
>>> >>> >>> >
>>> >>> >>> >
>>> >>> >>> >>
>>> >>> >>> >>>
>>> >>> >>> >>> 2. You generated this patch with "diff" so it doesn't credit
>>> >>> >>> >>> you,
>>> >>> >>> >>> and
>>> >>> >>> >>> you haven't written your commit msg. Could you please follow
>>> >>> >>> >>> the
>>> >>> >>> >>> instructions given there:
>>> >>> >>> >>>
>>> >>> >>> >>>
>>> >>> >>> >>>
>>> >>> >>> >>>
>>> >>> >>> >>> http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Generating_a_patch
>>> >>> >>> >>>
>>> >>> >>> >>
>>> >>> >>> >> Ack.
>>> >>> >>> >>
>>> >>> >>> >> Adolfo
>>> >>> >>> >>
>>> >>> >>> >>>
>>> >>> >>> >>> Yes, this mailing list is the right place for such patches....
>>> >>> >>> >>> the
>>> >>> >>> >>> issue tracker is useful to report bugs without being
>>> >>> >>> >>> immediately
>>> >>> >>> >>> able
>>> >>> >>> >>> to fix them.
>>> >>> >>> >>>
>>> >>> >>> >>> Benoit
>>> >>> >>> >>>
>>> >>> >>> >>>
>>> >>> >>> >>> 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>>> >>> >>> >>> > Hi all,
>>> >>> >>> >>> >
>>> >>> >>> >>> > I've been performing some tests with dynamic matrices that
>>> >>> >>> >>> > have
>>> >>> >>> >>> > a
>>> >>> >>> >>> > compile-time max size, and found out that for most matrix
>>> >>> >>> >>> > decompositions,
>>> >>> >>> >>> > the MaxRowsAtCompileTime and MaxColsAtCompileTime template
>>> >>> >>> >>> > parameters of
>>> >>> >>> >>> > the
>>> >>> >>> >>> > input matrix are not propagated to the decomposition's
>>> >>> >>> >>> > members
>>> >>> >>> >>> > and
>>> >>> >>> >>> > temporaries. This leads to undesired and preventable heap
>>> >>> >>> >>> > allocations.
>>> >>> >>> >>> > The
>>> >>> >>> >>> > following is a minimal example of what I mean:
>>> >>> >>> >>> >
>>> >>> >>> >>> > typedef Matrix<double, dynamic, dynamic, 0, 10, 10> Mat;
>>> >>> >>> >>> > Mat A(Mat::Random(5, 5));
>>> >>> >>> >>> >
>>> >>> >>> >>> > FullPivLU<Mat> ldlt(A); // Allocates 4 temps
>>> >>> >>> >>> > LDLT<Mat> ldlt(A); // Allocates 3 temps
>>> >>> >>> >>> >
>>> >>> >>> >>> > I'm attaching a patch against the devel branch (commit
>>> >>> >>> >>> > 2589:8a8f4e9e04bf)
>>> >>> >>> >>> > that deals with this issue for the Cholesky, LU, QR and SVD
>>> >>> >>> >>> > modules.
>>> >>> >>> >>> > Since
>>> >>> >>> >>> > most of these modules are experimental and undergoing heavy
>>> >>> >>> >>> > work,
>>> >>> >>> >>> > the
>>> >>> >>> >>> > style
>>> >>> >>> >>> > of the modifications changes form file to file (some use
>>> >>> >>> >>> > convenience
>>> >>> >>> >>> > enums,
>>> >>> >>> >>> > some not). However, I tried to stay consistent within the
>>> >>> >>> >>> > context of
>>> >>> >>> >>> > each
>>> >>> >>> >>> > file. Please let me know your thoughts on the matter.
>>> >>> >>> >>>
>>> >>> >>> >>> Yes so actually, please introduce enums everywhere :)
>>> >>> >>> >>>
>>> >>> >>> >>> > One very specific
>>> >>> >>> >>> > note: in line 65 of HouseHolderQR.h I left the Options
>>> >>> >>> >>> > template
>>> >>> >>> >>> > parameter
>>> >>> >>> >>> > as-is, but don't know if it should be changed to that of
>>> >>> >>> >>> > MatrixType.
>>> >>> >>> >>>
>>> >>> >>> >>> Great catch! Please change that to Options (as defined on line
>>> >>> >>> >>> 57).
>>> >>> >>> >>>
>>> >>> >>> >>> Thanks
>>> >>> >>> >>> Benoit
>>> >>> >>> >>>
>>> >>> >>> >>> >
>>> >>> >>> >>> > Cheers,
>>> >>> >>> >>> >
>>> >>> >>> >>> > Adolfo.
>>> >>> >>> >>> >
>>> >>> >>> >>> > P.S. Is there a better way to submit small patches,
>>> >>> >>> >>> > something
>>> >>> >>> >>> > like a
>>> >>> >>> >>> > bugtracker, or is this the currently recommended way?
>>> >>> >>> >>> >
>>> >>> >>> >>> > --
>>> >>> >>> >>> > Adolfo Rodríguez Tsouroukdissian, Ph. D.
>>> >>> >>> >>> >
>>> >>> >>> >>> > Robotics engineer
>>> >>> >>> >>> > PAL ROBOTICS S.L
>>> >>> >>> >>> > http://www.pal-robotics.com
>>> >>> >>> >>> > Tel. +34.93.414.53.47
>>> >>> >>> >>> > Fax.+34.93.209.11.09
>>> >>> >>> >>> > AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos
>>> >>> >>> >>> > adjuntos,
>>> >>> >>> >>> > pueden
>>> >>> >>> >>> > contener información privilegiada y/o confidencial que está
>>> >>> >>> >>> > dirigida
>>> >>> >>> >>> > exclusivamente a su destinatario. Si usted recibe este
>>> >>> >>> >>> > mensaje
>>> >>> >>> >>> > y no
>>> >>> >>> >>> > es
>>> >>> >>> >>> > el
>>> >>> >>> >>> > destinatario indicado, o el empleado encargado de su entrega
>>> >>> >>> >>> > a
>>> >>> >>> >>> > dicha
>>> >>> >>> >>> > persona, por favor, notifíquelo inmediatamente y remita el
>>> >>> >>> >>> > mensaje
>>> >>> >>> >>> > original
>>> >>> >>> >>> > a la dirección de correo electrónico indicada. Cualquier
>>> >>> >>> >>> > copia,
>>> >>> >>> >>> > uso
>>> >>> >>> >>> > o
>>> >>> >>> >>> > distribución no autorizados de esta comunicación queda
>>> >>> >>> >>> > estrictamente
>>> >>> >>> >>> > prohibida.
>>> >>> >>> >>> >
>>> >>> >>> >>> > CONFIDENTIALITY NOTICE: This e-mail and the accompanying
>>> >>> >>> >>> > document(s)
>>> >>> >>> >>> > may
>>> >>> >>> >>> > contain confidential information which is privileged and
>>> >>> >>> >>> > intended
>>> >>> >>> >>> > only
>>> >>> >>> >>> > for
>>> >>> >>> >>> > the individual or entity to whom they are addressed. If you
>>> >>> >>> >>> > are not
>>> >>> >>> >>> > the
>>> >>> >>> >>> > intended recipient, you are hereby notified that any
>>> >>> >>> >>> > disclosure,
>>> >>> >>> >>> > copying,
>>> >>> >>> >>> > distribution or use of this e-mail and/or accompanying
>>> >>> >>> >>> > document(s)
>>> >>> >>> >>> > is
>>> >>> >>> >>> > strictly prohibited. If you have received this e-mail in
>>> >>> >>> >>> > error,
>>> >>> >>> >>> > please
>>> >>> >>> >>> > immediately notify the sender at the above e-mail address..
>>> >>> >>> >>> >
>>> >>> >>> >>>
>>> >>> >>> >>>
>>> >>> >>> >>
>>> >>> >>> >>
>>> >>> >>> >
>>> >>> >>>
>>> >>> >>>
>>> >>> >>
>>> >>> >>
>>> >>> >>
>>> >>> >> --
>>> >>> >> Adolfo Rodríguez Tsouroukdissian, Ph. D.
>>> >>> >>
>>> >>> >> Robotics engineer
>>> >>> >> PAL ROBOTICS S.L
>>> >>> >> http://www.pal-robotics.com
>>> >>> >> Tel. +34.93.414.53.47
>>> >>> >> Fax.+34.93.209.11.09
>>> >>> >> AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos adjuntos,
>>> >>> >> pueden
>>> >>> >> contener información privilegiada y/o confidencial que está
>>> >>> >> dirigida
>>> >>> >> exclusivamente a su destinatario. Si usted recibe este mensaje y no
>>> >>> >> es
>>> >>> >> el
>>> >>> >> destinatario indicado, o el empleado encargado de su entrega a
>>> >>> >> dicha
>>> >>> >> persona, por favor, notifíquelo inmediatamente y remita el mensaje
>>> >>> >> original
>>> >>> >> a la dirección de correo electrónico indicada. Cualquier copia, uso
>>> >>> >> o
>>> >>> >> distribución no autorizados de esta comunicación queda
>>> >>> >> estrictamente
>>> >>> >> prohibida.
>>> >>> >>
>>> >>> >> CONFIDENTIALITY NOTICE: This e-mail and the accompanying
>>> >>> >> document(s)
>>> >>> >> may
>>> >>> >> contain confidential information which is privileged and intended
>>> >>> >> only
>>> >>> >> for
>>> >>> >> the individual or entity to whom they are addressed. If you are
>>> >>> >> not
>>> >>> >> the
>>> >>> >> intended recipient, you are hereby notified that any disclosure,
>>> >>> >> copying,
>>> >>> >> distribution or use of this e-mail and/or accompanying document(s)
>>> >>> >> is
>>> >>> >> strictly prohibited. If you have received this e-mail in error,
>>> >>> >> please
>>> >>> >> immediately notify the sender at the above e-mail address.
>>> >>> >>
>>> >>> >
>>> >>>
>>> >>>
>>> >
>>> >
>>>
>>>
>>
>>
>