- To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- Subject: [eigen] SVD
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Tue, 5 Jan 2010 08:23:07 -0500
Hi,
this time I'm really doing the SVD :) I know it's not the first time
I've said that.
I would like to run my design choices through this list. I have had a
good look at LAPACK's api and code, it is a good inspiration but I am
not 100% satisfied about their design. Here are the areas where I plan
to do differently:
*Bidiagonalization*
LAPACK function: DGEBRD, http://www.netlib.org/lapack/double/dgebrd.f
They reduce to either upper or lower bidiagonal form depending on
whether M>=N. I am not convinced this is worthwhile and would like to
standardize on upper bidiagonalization, restricting to rows>=cols, for
these reasons:
- it makes my life easier, the code simpler and a bit faster.
- that's all I need to implement SVD for all cases (see next section)
- people don't care too much about bidiagonalization in itself, do
they? It's only a step towards SVD, as I see it.
*General SVD approach*
Golub-Reinsch SVD is up to 50% faster for approximately square
matrices (aspect ratio < 5/3), while R-SVD is up to 2x faster for more
rectangular matrices.
Let's implement this:
1) if the aspect ratio > 5/3, do an R-SVD, which means do a QR
decomposition and go to 2) to compute the SVD of R
2) else if rows>=cols, do Golub-Reinsch using the UpperBidiagonalization
3) else if rows<cols, do like 2) on the transpose.
So the only inefficiency is in evaluating the transpose in case 3),
which is not too bad I think.
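The dispatch above can be sketched as a small selector. This is only an illustration of the proposed logic, not Eigen API; the enum and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical strategy selector for the 3-way dispatch described above.
enum class SvdPath { RSvd, GolubReinsch, GolubReinschOnTranspose };

SvdPath selectSvdPath(int rows, int cols) {
  // Think in terms of the larger (m) and smaller (n) dimension,
  // since case 3) works on the transpose anyway.
  int m = rows >= cols ? rows : cols;
  int n = rows >= cols ? cols : rows;
  // Aspect ratio > 5/3: do a QR decomposition first (R-SVD), then
  // compute the SVD of the small square factor R.
  if (3 * m > 5 * n) return SvdPath::RSvd;
  // Otherwise, Golub-Reinsch directly, on the transpose if rows < cols.
  return rows >= cols ? SvdPath::GolubReinsch
                      : SvdPath::GolubReinschOnTranspose;
}
```

The comparison `3 * m > 5 * n` is just the aspect-ratio test written in integers to avoid floating-point division.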
Automatically selecting between Golub-Reinsch and R-SVD is good, I
think, as it is an implementation detail and few users would know
about it; and it doesn't cost much, as most of the code is shared
(HouseholderQR doesn't cost much as it uses the same Householder code
as Bidiagonalization).
*Storage of the unitaries*
The question is how the SVD class can allow computing:
svd.u() * matrix
efficiently without computing U itself. If we do this, then in
particular we're able to compute selected singular vectors
efficiently, but we can also do much more, like efficiently
implementing least squares. Same for V.
As you know, in Householder-based decompositions, it's very convenient
to store the householder vectors and subsequently be able to apply the
transformation without forming the unitary matrix explicitly. But in
SVD and in eigensolvers, there also are Givens rotations, the number
of which cannot be predicted. So LAPACK gives up on letting one apply
the transformations efficiently. There is one exception: in the SVD
decomposition A = U D V^T, they allow efficiently applying the transpose
of U on the fly to a matrix. This is useful to implement least
squares. But this is very ad-hoc, doesn't help in other cases (how to
apply U itself? V? in their LS implementation they have to form the V
matrix explicitly) and generates code that may be unused.
So here is an idea that is in some sense coming from Andrea's
GivensQR: let's store the givens rotations themselves. Each sweep
produces an array of N Givens rotations. What is unknown is how many
sweeps there are, so they could be stored in a linked list. What is
nontrivial then is to apply these Givens rotations efficiently, as
each sweep implies a full traversal of the matrix to which one applies
the transformation. Here the simplest solution is to apply the
transformations to only a few columns (e.g. 16 columns) of the matrix
at once, since columns are mutually independent in this problem.
Another approach is divide-and-conquer bidiagonal SVD. It should also
be inherently cache-friendly, but it is much more complicated to
implement, and since the complexity of this step is negligible compared
to the O(n^3) bidiagonalization step, I'm not sure it's worth it.
Benoit