Re: [eigen] Propagating Max*AtCompileTime template parameters to decompositions |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Propagating Max*AtCompileTime template parameters to decompositions
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 5 Mar 2010 10:53:51 -0500
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=FkiYkmFpSViy4AXBfzp6LVmgoyElk4iOGjcSG6Q5VY4=; b=pOuE+EPJoRsefH8OUyC3EwJUZqCAhxPXnIw+w11AK6co983dEMqixdnDycZWY4ez4k Ji9G6p7HOxbI2aXMSmgtllMM8XYew0nxmR9RE5g9uT/42QtDTO280ZjoLxY+0Bbie8F9 NOVaFKUFcADdwQEjpbik1e7vnakDMnTVxIvv8=
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=eysEJtJUDHFgOPRDtHTtXGTIrHesBcbWvj8HCCdviufO9zIgzyFV7YeZOOSUry3CtX JMYhoCw5RqSU7lyri5DaN42gFXFnSpgcdAbx13iUWSTBrSb5HffeLla7ba+DNAZok8OP BQXXG4r3qEOQ7c4qDhUw04cOQyu8XcsPDWODI=
Gael: on second thought: couldn't you just let the cache_friendly
functions take a 'workspace' parameter so they don't have to allocate
it themselves? The workspace would be created by the caller who knows
the sizes-at-compile-time hence is able to create a static array if
the matrices have fixed max-size...
Benoit
2010/3/5 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>
>
> On Fri, Mar 5, 2010 at 3:36 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
> wrote:
>>
>> 2010/3/5 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>> >
>> >
>> > 2010/3/5 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>
>> >>
>> >>
>> >> On Thu, Mar 4, 2010 at 8:00 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> >> wrote:
>> >>>
>> >>> One more thought.
>> >>>
>> >>> The "fixed sizes are guaranteed to not cause malloc's" principle only
>> >>> applies for not-too-large sizes. Let me explain. For large enough
>> >>> sizes (typically, above 200x200, but that is highly variable), the
>> >>> matrix-matrix product implementation works on blocks that fit in the
>> >>> CPU cache. These blocks are allocated with ei_aligned_stack_alloc
>> >>> which on non-Linux translates to a malloc (or friends).
>> >>>
>> >>> So in such unit tests, you should set the MaxSizes to some values that
>> >>> are significantly smaller than at least the default cache size
>> >>> (256x256 for float, but that could vary in the future). Just to be
>> >>> safe and since that only makes the test run faster, stay within very
>> >>> small sizes like 16x16...
>> >>
>> >> My use case requires matrices smaller than 50x50, usually half that
>> >> size,
>> >> so we're good here :). For the unit test I'll make them 16x16 floats,
>> >> no
>> >> problem.
>> >
>> > FYI, the optimized matrix product which requires dynamic allocation
>> > starts
>> > to be used for 8x8 matrices (even 4x4 on AltiVec).
>>
>> What?! How then can we guarantee our users that at least for
>> not-too-big fixed size matrices, no malloc occurs?
>>
>> Is this really a performance improvement even for such small sizes?
>
> yes it is. for not too big matrices we have alloca. yes I know currently
> only for linux... If that's really a pb, there are many solutions:
>
> 1- we can still allocate on the stack the required blocks *outside* the
> matrix product function and pass them to the matrix product function. That
> must be done outside because the underlying heavy matrix product function
> should be instanciated only once per scalar type.
>
> 2- another solution: in the matrix product function we declare a (only when
> alloca is disabled):
>
> Scalar stackmemory[16*16*2]; // no heap alloc for matrices up to 16x16
> if the required memory for the blocks is bigger than 16*16*2 then do heap
> allocation...
>
> 3- the user can still define EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD to a big
> value to disable the use of the efficient product for matrices of medium
> sizes with a lost of performance...
>
>
> btw, I have a bench program which instanciates and compares the coeff-based
> product to the optimized one for all triplet of fixed size matrices, for
> sizes between 1 and 9 => 2*9^3 product instanciations !!! The goal is to
> find the ideal threshold for each platform. hm... I should also add the
> transposed versions => x4 more instanciations, and versions for double....
> ouch! that will take a day to compile :)
>
> gael
>
>
>
>>
>> Benoit
>>
>> >
>> > So in your case, if you are running linux then alloca is used and you
>> > are
>> > fine. Otherwise, heap allocation occurs. If that's really an issue we
>> > can
>> > think about solutions...
>> >
>> > gael
>> >
>> >>
>> >> Adolfo
>> >>
>> >>>
>> >>> Benoit
>> >>>
>> >>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> >>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>> >>> >>
>> >>> >>
>> >>> >> On Thu, Mar 4, 2010 at 6:00 PM, Benoit Jacob
>> >>> >> <jacob.benoit.1@xxxxxxxxx>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> Supreme refinement is to add a unit test for that. We already have
>> >>> >>> the
>> >>> >>> skeleton in the nomalloc test (test/nomalloc.cpp). Indeed, when
>> >>> >>> EIGEN_NO_MALLOC is defined, any heap allocation will cause an
>> >>> >>> assert
>> >>> >>> failure, which in the context of unit tests we convert into an
>> >>> >>> exception.
>> >>> >>>
>> >>> >>> So don't hesitate to expand the nomalloc.cpp test, just with new
>> >>> >>> executable targets (CALL_SUBTEST_x for a new value of x up to 16)
>> >>> >>> so
>> >>> >>> that no single executable gets too big. Testing only 'float' is
>> >>> >>> good
>> >>> >>> enough.
>> >>> >>
>> >>> >>
>> >>> >> I was just thinking of that. I'll try to squeeze some time for
>> >>> >> that,
>> >>> >> too ;).
>> >>> >> Thanks for pointing me to the skeleton, that will speed up things..
>> >>> >> Should
>> >>> >> EIGEN_STACK_ALLOCATION_LIMIT 0 be defined as well, or am I good to
>> >>> >> go
>> >>> >> with
>> >>> >> only EIGEN_NO_MALLOC?.
>> >>> >
>> >>> > #define EIGEN_STACK_ALLOCATION_LIMIT 0 is really important. Indeed,
>> >>> > on
>> >>> > Linux, this is really a stack allocation hence bypasses malloc
>> >>> > completely. So if you don't define this to 0, you could have a unit
>> >>> > test that passes on Linux but fails (because a heap allocation
>> >>> > happens) on other platforms.
>> >>> >
>> >>> > Benoit
>> >>> >
>> >>> >> The tests I've been playing with were triggering
>> >>> >> assertions made by EIGEN_NO_MALLOC only.
>> >>> >>
>> >>> >>>
>> >>> >>> Benoit
>> >>> >>>
>> >>> >>> 2010/3/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> >>> >>> > 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >> On Thu, Mar 4, 2010 at 5:25 PM, Benoit Jacob
>> >>> >>> >> <jacob.benoit.1@xxxxxxxxx>
>> >>> >>> >> wrote:
>> >>> >>> >>>
>> >>> >>> >>> Hi Adolfo,
>> >>> >>> >>>
>> >>> >>> >>> Many thanks for this patch. It's been sitting in the back of
>> >>> >>> >>> my
>> >>> >>> >>> mind
>> >>> >>> >>> nagging me for a while :) These fixes are definitely wanted.
>> >>> >>> >>>
>> >>> >>> >>> Just 2 things:
>> >>> >>> >>>
>> >>> >>> >>> 1. All the MatrixType:: is a bit heavy so could you please
>> >>> >>> >>> introduce
>> >>> >>> >>> enums like is being done in ColPivHouseholderQR.h, in other
>> >>> >>> >>> decompositions? So MatrixType::MaxBlabla becomes just
>> >>> >>> >>> MaxBlaBla.
>> >>> >>> >>
>> >>> >>> >> OK, no problem. The enum approach is definitely much cleaner. I
>> >>> >>> >> just
>> >>> >>> >> didn't
>> >>> >>> >> want to impose it if it was not used in some files :)
>> >>> >>> >>
>> >>> >>> >> One more thing, now that I'm on it, are there any other files
>> >>> >>> >> that
>> >>> >>> >> I
>> >>> >>> >> might
>> >>> >>> >> have overlooked that could benefit from similar changes?
>> >>> >>> >
>> >>> >>> > Any decomposition is affected in the same way. Have you had a
>> >>> >>> > look
>> >>> >>> > at
>> >>> >>> > the Eigenvalues directory?
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >>
>> >>> >>> >>>
>> >>> >>> >>> 2. You generated this patch with "diff" so it doesn't credit
>> >>> >>> >>> you,
>> >>> >>> >>> and
>> >>> >>> >>> you haven't written your commit msg. Could you please follow
>> >>> >>> >>> the
>> >>> >>> >>> instructions given there:
>> >>> >>> >>>
>> >>> >>> >>>
>> >>> >>> >>>
>> >>> >>> >>>
>> >>> >>> >>> http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Generating_a_patch
>> >>> >>> >>>
>> >>> >>> >>
>> >>> >>> >> Ack.
>> >>> >>> >>
>> >>> >>> >> Adolfo
>> >>> >>> >>
>> >>> >>> >>>
>> >>> >>> >>> Yes, this mailing list is the right place for such patches....
>> >>> >>> >>> the
>> >>> >>> >>> issue tracker is useful to report bugs without being
>> >>> >>> >>> immediately
>> >>> >>> >>> able
>> >>> >>> >>> to fix them.
>> >>> >>> >>>
>> >>> >>> >>> Benoit
>> >>> >>> >>>
>> >>> >>> >>>
>> >>> >>> >>> 2010/3/4 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>> >>> >>> >>> > Hi all,
>> >>> >>> >>> >
>> >>> >>> >>> > I've been performing some tests with dynamic matrices that
>> >>> >>> >>> > have
>> >>> >>> >>> > a
>> >>> >>> >>> > compile-time max size, and found out that for most matrix
>> >>> >>> >>> > decompositions,
>> >>> >>> >>> > the MaxRowsAtCompileTime and MaxColsAtCompileTime template
>> >>> >>> >>> > parameters of
>> >>> >>> >>> > the
>> >>> >>> >>> > input matrix are not propagated to the decomposition's
>> >>> >>> >>> > members
>> >>> >>> >>> > and
>> >>> >>> >>> > temporaries. This leads to undesired and preventable heap
>> >>> >>> >>> > allocations.
>> >>> >>> >>> > The
>> >>> >>> >>> > following is a minimal example of what I mean:
>> >>> >>> >>> >
>> >>> >>> >>> > typedef Matrix<double, dynamic, dynamic, 0, 10, 10> Mat;
>> >>> >>> >>> > Mat A(Mat::Random(5, 5));
>> >>> >>> >>> >
>> >>> >>> >>> > FullPivLU<Mat> ldlt(A); // Allocates 4 temps
>> >>> >>> >>> > LDLT<Mat> ldlt(A); // Allocates 3 temps
>> >>> >>> >>> >
>> >>> >>> >>> > I'm attaching a patch against the devel branch (commit
>> >>> >>> >>> > 2589:8a8f4e9e04bf)
>> >>> >>> >>> > that deals with this issue for the Cholesky, LU, QR and SVD
>> >>> >>> >>> > modules.
>> >>> >>> >>> > Since
>> >>> >>> >>> > most of these modules are experimental and undergoing heavy
>> >>> >>> >>> > work,
>> >>> >>> >>> > the
>> >>> >>> >>> > style
>> >>> >>> >>> > of the modifications changes form file to file (some use
>> >>> >>> >>> > convenience
>> >>> >>> >>> > enums,
>> >>> >>> >>> > some not). However, I tried to stay consistent within the
>> >>> >>> >>> > context of
>> >>> >>> >>> > each
>> >>> >>> >>> > file. Please let me know your thoughts on the matter.
>> >>> >>> >>>
>> >>> >>> >>> Yes so actually, please introduce enums everywhere :)
>> >>> >>> >>>
>> >>> >>> >>> > One very specific
>> >>> >>> >>> > note: in line 65 of HouseHolderQR.h I left the Options
>> >>> >>> >>> > template
>> >>> >>> >>> > parameter
>> >>> >>> >>> > as-is, but don't know if it should be changed to that of
>> >>> >>> >>> > MatrixType.
>> >>> >>> >>>
>> >>> >>> >>> Great catch! Please change that to Options (as defined on line
>> >>> >>> >>> 57).
>> >>> >>> >>>
>> >>> >>> >>> Thanks
>> >>> >>> >>> Benoit
>> >>> >>> >>>
>> >>> >>> >>> >
>> >>> >>> >>> > Cheers,
>> >>> >>> >>> >
>> >>> >>> >>> > Adolfo.
>> >>> >>> >>> >
>> >>> >>> >>> > P.S. Is there a better way to submit small patches,
>> >>> >>> >>> > something
>> >>> >>> >>> > like a
>> >>> >>> >>> > bugtracker, or is this the currently recommended way?
>> >>> >>> >>> >
>> >>> >>> >>> > --
>> >>> >>> >>> > Adolfo Rodríguez Tsouroukdissian, Ph. D.
>> >>> >>> >>> >
>> >>> >>> >>> > Robotics engineer
>> >>> >>> >>> > PAL ROBOTICS S.L
>> >>> >>> >>> > http://www.pal-robotics.com
>> >>> >>> >>> > Tel. +34.93.414.53.47
>> >>> >>> >>> > Fax.+34.93.209.11.09
>> >>> >>> >>> > AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos
>> >>> >>> >>> > adjuntos,
>> >>> >>> >>> > pueden
>> >>> >>> >>> > contener información privilegiada y/o confidencial que está
>> >>> >>> >>> > dirigida
>> >>> >>> >>> > exclusivamente a su destinatario. Si usted recibe este
>> >>> >>> >>> > mensaje
>> >>> >>> >>> > y no
>> >>> >>> >>> > es
>> >>> >>> >>> > el
>> >>> >>> >>> > destinatario indicado, o el empleado encargado de su entrega
>> >>> >>> >>> > a
>> >>> >>> >>> > dicha
>> >>> >>> >>> > persona, por favor, notifíquelo inmediatamente y remita el
>> >>> >>> >>> > mensaje
>> >>> >>> >>> > original
>> >>> >>> >>> > a la dirección de correo electrónico indicada. Cualquier
>> >>> >>> >>> > copia,
>> >>> >>> >>> > uso
>> >>> >>> >>> > o
>> >>> >>> >>> > distribución no autorizados de esta comunicación queda
>> >>> >>> >>> > estrictamente
>> >>> >>> >>> > prohibida.
>> >>> >>> >>> >
>> >>> >>> >>> > CONFIDENTIALITY NOTICE: This e-mail and the accompanying
>> >>> >>> >>> > document(s)
>> >>> >>> >>> > may
>> >>> >>> >>> > contain confidential information which is privileged and
>> >>> >>> >>> > intended
>> >>> >>> >>> > only
>> >>> >>> >>> > for
>> >>> >>> >>> > the individual or entity to whom they are addressed. If you
>> >>> >>> >>> > are not
>> >>> >>> >>> > the
>> >>> >>> >>> > intended recipient, you are hereby notified that any
>> >>> >>> >>> > disclosure,
>> >>> >>> >>> > copying,
>> >>> >>> >>> > distribution or use of this e-mail and/or accompanying
>> >>> >>> >>> > document(s)
>> >>> >>> >>> > is
>> >>> >>> >>> > strictly prohibited. If you have received this e-mail in
>> >>> >>> >>> > error,
>> >>> >>> >>> > please
>> >>> >>> >>> > immediately notify the sender at the above e-mail address.
>> >>> >>> >>> >
>> >>> >>> >>>
>> >>> >>> >>>
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Adolfo Rodríguez Tsouroukdissian, Ph. D.
>> >>> >>
>> >>> >> Robotics engineer
>> >>> >> PAL ROBOTICS S.L
>> >>> >> http://www.pal-robotics.com
>> >>> >> Tel. +34.93.414.53.47
>> >>> >> Fax.+34.93.209.11.09
>> >>> >> AVISO DE CONFIDENCIALIDAD: Este mensaje y sus documentos adjuntos,
>> >>> >> pueden
>> >>> >> contener información privilegiada y/o confidencial que está
>> >>> >> dirigida
>> >>> >> exclusivamente a su destinatario. Si usted recibe este mensaje y no
>> >>> >> es
>> >>> >> el
>> >>> >> destinatario indicado, o el empleado encargado de su entrega a
>> >>> >> dicha
>> >>> >> persona, por favor, notifíquelo inmediatamente y remita el mensaje
>> >>> >> original
>> >>> >> a la dirección de correo electrónico indicada. Cualquier copia, uso
>> >>> >> o
>> >>> >> distribución no autorizados de esta comunicación queda
>> >>> >> estrictamente
>> >>> >> prohibida.
>> >>> >>
>> >>> >> CONFIDENTIALITY NOTICE: This e-mail and the accompanying
>> >>> >> document(s)
>> >>> >> may
>> >>> >> contain confidential information which is privileged and intended
>> >>> >> only
>> >>> >> for
>> >>> >> the individual or entity to whom they are addressed. If you are
>> >>> >> not
>> >>> >> the
>> >>> >> intended recipient, you are hereby notified that any disclosure,
>> >>> >> copying,
>> >>> >> distribution or use of this e-mail and/or accompanying document(s)
>> >>> >> is
>> >>> >> strictly prohibited. If you have received this e-mail in error,
>> >>> >> please
>> >>> >> immediately notify the sender at the above e-mail address.
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >
>> >
>>
>>
>
>