Re: [eigen] Status of givens QR decomposition





On Wed, Apr 28, 2010 at 1:56 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2010/4/28 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>
>
> On Tue, Apr 27, 2010 at 1:31 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
> wrote:
>>
>> 2010/4/27 Adolfo Rodríguez Tsouroukdissian <dofo79@xxxxxxxxx>:
>> >
>> >
>> > On Tue, Jan 26, 2010 at 1:52 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> > wrote:
>> >>
>> >> 2010/1/26 Thomas Capricelli <orzel@xxxxxxxxxxxxxxx>:
>> >> >
>> >> > As a following to:
>> >> >
>> >> >
>> >> > http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/09/msg00009.html
>> >> >
>> >> > I'd like to say that I have a very important use case for Givens QR
>> >> > with big matrices in the NonlinearOptimization module. It is useful
>> >> > in the so-called 'storage efficient' variant of the Levenberg-Marquardt
>> >> > algorithm for very big matrices: the QR decomposition is done with
>> >> > Givens rotations, presumably because that is the only decomposition
>> >> > that can be computed incrementally. The matrix to decompose is not
>> >> > known as a whole, but is built row by row, 'updating' the QR
>> >> > decomposition and then discarding the row.
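To make the 'storage efficient' idea concrete, here is a minimal numpy sketch (not the actual NonlinearOptimization code; the function name `qr_insert_row` is mine): only the triangular factor R is kept, and each incoming data row is annihilated against it with Givens rotations, after which the row can be discarded.

```python
import numpy as np

def givens(a, b):
    # c, s such that [[c, s], [-s, c]] @ [a, b] = [r, 0]
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

def qr_insert_row(R, row):
    """Update the upper-triangular R (n x n) in place when a new data
    row arrives, using n Givens rotations; the row is then discarded,
    so only O(n^2) storage is ever needed, regardless of the number
    of rows processed."""
    n = R.shape[0]
    row = row.astype(float).copy()
    for j in range(n):
        c, s = givens(R[j, j], row[j])
        Rj = R[j, j:].copy()
        # rotate the pair (R row j, incoming row) to zero out row[j]
        R[j, j:] = c * Rj + s * row[j:]
        row[j:] = -s * Rj + c * row[j:]
    return R
```

Feeding all rows of a matrix A through `qr_insert_row` (starting from R = 0) yields an R with R'R = A'A, i.e. the same triangular factor as a batch QR, up to row signs.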
>> >>
>> >> Interesting!!
>> >>
>> >> >
>> >> > Andrea, Benoit, what is the status of this fork? As I have write
>> >> > permission, I've merged with mainline once or twice, but the fork
>> >> > seems abandoned... was there a use case? Not anymore?
>> >>
>> >> I haven't touched it in a long time. The original use case was small
>> >> matrices, where it is faster than householder. Now you mention large
>> >> matrices that are not fully known in advance: interesting..
>> >>
>> >> Finally, about the old debate about whether to store the list of
>> >> Givens rotations, let me mention that in the different setting of
>> >> SVD, I have now decided to try this approach too. It's different as
>> >> I'm only storing N Givens rotations per sweep, as opposed to N^2/2 in
>> >> GivensQR, but perhaps still worth mentioning.
>> >
>> > Let's revive this thread a little bit :)
>> >
>> > I'm curious to have a general idea of the solution method you want to
>> > implement here (SVD based on Givens rotations, storing only N
>> > rotations/sweep).
>>
>> There was a "little problem" with the above: Givens SVD actually
>> needed N^2 Givens rotations per sweep, not N.
>>
>> So instead, I'm now looking at divide-and-conquer SVD.
>>
>> This is the "modern" approach anyway. It's parallelizable, which
>> Givens SVD isn't.
>>
>> In LAPACK terms, this is SGESDD, as opposed to SGESVD.
>>
>> The big reason for me to look at divide-and-conquer SVD is that it
>> requires very little storage. Here we're starting from a bidiagonal
>> matrix. In order to reduce the problem of size N into two sub-problems
>> of size N/2, if I understand well, I need to store 2N scalars. So I
>> expect the whole thing to use O(N) storage. Of course, if one starts
>> from a general rectangular matrix, one has to bidiagonalize it first,
>> but that's done already in Eigen, see UpperBidiagonalization, using
>> householder transforms.
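For reference, the bidiagonalization step mentioned above can be sketched as follows. This is a plain numpy illustration of alternating left/right Householder reflections, not Eigen's actual UpperBidiagonalization code; function names are mine.

```python
import numpy as np

def householder_vec(x):
    # Unit vector v such that (I - 2 v v^T) x = -+ ||x|| e1
    v = x.astype(float).copy()
    v[0] += np.copysign(np.linalg.norm(x), x[0])
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def bidiagonalize(A):
    """Reduce A (m x n, m >= n) to upper bidiagonal B with A = U B V^T,
    alternating left reflections (zero a column below the diagonal)
    and right reflections (zero a row right of the superdiagonal)."""
    A = A.astype(float).copy()
    m, n = A.shape
    U, V = np.eye(m), np.eye(n)
    for k in range(n):
        # left reflection: clear column k below the diagonal
        v = householder_vec(A[k:, k])
        A[k:, :] -= 2.0 * np.outer(v, v @ A[k:, :])
        U[:, k:] -= 2.0 * np.outer(U[:, k:] @ v, v)
        if k < n - 2:
            # right reflection: clear row k right of the superdiagonal
            v = householder_vec(A[k, k + 1:])
            A[:, k + 1:] -= 2.0 * np.outer(A[:, k + 1:] @ v, v)
            V[:, k + 1:] -= 2.0 * np.outer(V[:, k + 1:] @ v, v)
    return U, A, V
```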
>>
>> Divide-and-conquer is only applicable to narrow enough band matrices.
>> For example, it can't be used to compute the QR of a general matrix,
>> or the eigendecomposition of a Hessenberg matrix.
>>
>> > Is there a paper that you could refer me to?
>>
>> Yes, this one is very good:
>>
>> www.cs.colorado.edu/~jessup/SUBPAGES/PS/sorensen.ps.gz
>>
>> equivalently:
>>
>> http://portal.acm.org/citation.cfm?id=181618
>
> Thanks for the pointers, will check the paper out.
>>
>> > One sweet
>> > thing about using Givens rotations (as opposed to, say, Householder
>> > reflections) is that their computation can be easily parallelized
>> > across multiple cores.
>>
>> ...depends on the order in which the Givens rotations are applied,
>> since parallelization is only possible for mutually commuting (AB=BA)
>> transformations --- otherwise you get different results depending on
>> which one is applied first.
>>
>> For example, Givens SVD is not parallelizable at all.
>>
>> Givens QR, yes, it's parallelizable.
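As an illustration of Givens QR (a plain numpy sketch, not Andrea's fork's implementation), each rotation zeroes one subdiagonal entry; rotations acting on disjoint row pairs commute, which is exactly what makes parallel scheduling possible:

```python
import numpy as np

def givens_qr(A):
    """QR of A via Givens rotations. Rotations acting on disjoint row
    pairs commute, so anti-diagonal 'waves' of them could be applied
    in parallel; here they are applied sequentially for clarity."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.eye(m)
    for j in range(n):
        # zero out column j from the bottom up
        for i in range(m - 1, j, -1):
            a, b = A[i - 1, j], A[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])
            A[[i - 1, i], j:] = G @ A[[i - 1, i], j:]   # annihilate A[i, j]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T   # accumulate Q
    return Q, A  # A is now the triangular factor R
```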
>>
>> > On a separate note, I'd like to ask whether there would be interest
>> > in an SVD implementation tailored to the case of using one
>> > decomposition instance to solve multiple "similar" problems. By
>> > similar I mean solving first for a matrix A, then for a perturbed
>> > matrix A + deltaA, and so on. In such cases, reusing results from
>> > previous computations can greatly improve performance. The KDL
>> > developers [1] brought to my attention a paper that details this
>> > solution strategy (also based on Givens rotations), which can be
>> > found here [2]. They also have a working implementation [3] that
>> > uses Eigen matrices, but its internals are not "eigenized".
>>
>> That sounds interesting for certain classes of problems. Though it
>> can only work if certain assumptions on the matrix A are made. At
>> least, it should have sufficiently many distinct singular values.
>> Otherwise, bad phenomena can happen. For example, let A = Id be the
>> identity matrix. Then an SVD of A is just: A = Id x Id x Id. Now take
>> a small perturbation of A: it could have absolutely any singular
>> vectors on both sides --- just because A is close to Id doesn't imply
>> anything about its singular vectors. So the knowledge that A is close
>> to Identity doesn't help at all.
>
> I don't know if I followed you here. I'm going to paraphrase the main
> idea of the paper to make sure we're on the same page. The SVD
> referenced above uses Givens rotations to map A to an orthogonal
> matrix B like so,
>
> AV = B
>
> where V contains the sequence of Givens rotations.

Assuming that instead of "an orthogonal matrix B" you meant "a matrix
B whose columns are mutually orthogonal".

Yes, you are right. Forgive my incorrect wording.
 

> B can then be decomposed
> into B=UD, where U is an orthonormal matrix

OK sure, just let the diagonal coeffs in D be the norms of the columns of B.

> and D is a diagonal matrix
> containing the singular values, hence A =UDV'.

OK
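The scheme described in the quoted text (rotate A on the right until the columns of B = AV are mutually orthogonal, then split B = UD by column norms) is essentially a one-sided Jacobi SVD. A numpy sketch, with function name and tolerances mine rather than taken from the paper [2]:

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=30):
    """SVD via one-sided Jacobi: right-multiply A by Givens rotations,
    accumulated in V, until the columns of B = A V are mutually
    orthogonal; then B = U D with D = column norms, so A = U D V^T.
    Assumes A has full column rank."""
    B = A.astype(float).copy()
    n = B.shape[1]
    V = np.eye(n)
    for _ in range(max_sweeps):
        off = 0.0
        for p in range(n - 1):
            for q in range(p + 1, n):
                apq = B[:, p] @ B[:, q]
                app = B[:, p] @ B[:, p]
                aqq = B[:, q] @ B[:, q]
                off = max(off, abs(apq))
                if abs(apq) < tol * np.sqrt(app * aqq):
                    continue
                # Jacobi angle that orthogonalizes columns p and q
                tau = (aqq - app) / (2.0 * apq)
                t = np.copysign(1.0, tau) / (abs(tau) + np.hypot(1.0, tau))
                c = 1.0 / np.hypot(1.0, t)
                s = c * t
                G = np.array([[c, s], [-s, c]])
                B[:, [p, q]] = B[:, [p, q]] @ G
                V[:, [p, q]] = V[:, [p, q]] @ G
        if off < tol:
            break
    d = np.linalg.norm(B, axis=0)   # unsorted singular values
    U = B / d                       # normalize columns of B
    return U, d, V
```

A warm start as described in the quoted message would simply pass the previous V by initializing B = A_p @ V_prev before sweeping.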

> If we now solve for A_p = A + deltaA, we can use the previously calculated V
> as a starting point. The product A_p V will no longer be orthogonal,

Again it looks like by "orthogonal" you mean "the columns are mutually
orthogonal". Just FYI, there's an awkwardness in standard terminology:
by definition, orthogonal matrices are those whose columns are
mutually orthogonal AND have norm 1.

OK so far.

> but
> close (there are bounds on how much it may stray),

sure.

> so we only need to update
> V to re-orthogonalize A_p V,

If by this you mean that you can keep the same U, then that's wrong,
as I explained in my identity example.

> which should be less work than starting from
> scratch (V==identity). Further, there is no need to store individual Givens
> rotations, only the compound accumulated in V.

I disagree, so let me re-explain differently. Suppose that A =
Identity. Then A has infinitely many different SVD decompositions.
Indeed, letting D be again the Identity matrix, for any unitary matrix
U, we see that

    A = U D U*

is a SVD decomposition of A.

Thus, in the case of A = Identity, a SVD decomposition's U matrix does
not contain any information.

Now the matter is that, very close to the Identity matrix, so very
close to A, we can find matrices all of whose singular values are
distinct. Just take a diagonal matrix with distinct, close-to-1
values on the diagonal. For example,

  D = Diag ( 1, 1-epsilon, 1-2epsilon, 1-3epsilon, ... )

Now pick a unitary V of your choice. Since D is very close to
Identity, M = V D V* is also very close to Identity.

However the singular values of M are all distinct, so actually
 M = V D V*
is the unique SVD decomposition of M (provided you request sorting of
the singular values).

So, in order to find the SVD of M, we must find the unitary V. And M
is very close to A (which is Identity). But a SVD of A doesn't help at
all guessing V.
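This argument can also be checked numerically; a small numpy demonstration (with a random orthogonal V standing in for "a unitary V of your choice"):

```python
import numpy as np

rng = np.random.default_rng(42)
n, eps = 4, 1e-3

# a random orthogonal V (Q factor of a random matrix)
V, _ = np.linalg.qr(rng.standard_normal((n, n)))

# distinct singular values, all within (n-1)*eps of 1
D = np.diag(1.0 - eps * np.arange(n))

M = V @ D @ V.T
# M is within (n-1)*eps of the Identity in spectral norm...
assert np.linalg.norm(M - np.eye(n), 2) < n * eps

# ...yet its singular vectors are the columns of V, which the SVD of
# the Identity (U = any orthogonal matrix) cannot help to guess.
U, s, Vt = np.linalg.svd(M)
for k in range(n):
    # each recovered singular vector matches a column of V up to sign
    assert abs(abs(U[:, k] @ V[:, k]) - 1.0) < 1e-8
```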

***

I'm not saying that your approach is bad. I understand it can work
great as soon as the singular values are all distinct (to be precise,
the smallest distance between two singular values should be at least
twice as large as your perturbation). That is the generic case. And in
the case where there exists a repeated singular value, an SVD is
harder to compute anyway.

Benoit

>
> Adolfo
>
>>
>> But yes, that sounds interesting in principle, we'll see about that
>> post-3.0.
>>
>> >
>> > Finally, I wanted to ask the more general question of what SVD
>> > implementations will likely make it to 3.0.
>>
>> I am currently writing the new divide-and-conquer SVD. I would like
>> it to replace the current SVD, but of course I'll only do that once
>> it's clear that it's at least as good.
>>
>> I also want to keep JacobiSVD, as it is the only SVD avoiding
>> altogether householder bidiagonalization, which is not perfectly
>> numerically stable.
>>
>> I know I have been talking about that forever, but now I'm really
>> doing it, and I'm starting at Mozilla in 2 weeks anyway so it must be
>> done before then.
>>
>> Benoit
>> >
>> > TIA,
>> >
>> > Adolfo
>> >
>> > [1] http://orocos.org/kdl
>> > [2] http://www.engr.colostate.edu/~aam/pdf/journals/5.pdf
>> > [3]
>> >
>> > http://svn.mech.kuleuven.be/websvn/orocos/trunk/kdl/src/utilities/svd_eigen_Macie.hpp
>> >
>> >>
>> >> > btw, am I the only one needing Givens QR decomposition on the
>> >> > list? Anyone?
>> >>
>> >> It is for sure needed at least by everyone needing QR decomposition of
>> >> very small matrices.
>> >>
>> >> If you need to make intrusive changes in Andrea's fork, perhaps it's
>> >> easiest for you to fork it, you can then re-merge with Andrea if he
>> >> expresses agreement with your changes. I personally don't have time
>> >> right now to think about that... :(
>> >>
>> >> Benoit
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Adolfo Rodríguez Tsouroukdissian, Ph. D.
>> >
>> > Robotics engineer
>> > PAL ROBOTICS S.L
>> > http://www.pal-robotics.com
>> > Tel. +34.93.414.53.47
>> > Fax.+34.93.209.11.09
>> >
>> > CONFIDENTIALITY NOTICE: This e-mail and the accompanying document(s) may
>> > contain confidential information which is privileged and intended only
>> > for
>> > the individual or entity to whom they are addressed.  If you are not the
>> > intended recipient, you are hereby notified that any disclosure,
>> > copying,
>> > distribution or use of this e-mail and/or accompanying document(s) is
>> > strictly prohibited.  If you have received this e-mail in error, please
>> > immediately notify the sender at the above e-mail address.
>> >
>>
>>
>



