[eigen] notes on Householder and Givens |

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

*To*: eigen@xxxxxxxxxxxxxxxxxxx*Subject*: [eigen] notes on Householder and Givens*From*: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>*Date*: Wed, 4 Feb 2009 16:04:34 +0100*Dkim-signature*: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type:content-transfer-encoding; bh=TKbX2GBUKgHRyHHpkoQMcSpoUF7M37fEhnqJ+lZpft8=; b=LuulLR6W0qKH9VNKBGbeMvEXEAcdTdMzdQH56n6wqyDpvFcIY95ih3HT3YvDLS0r3U Eoz0swkEHLO0t29XEFU/5dcIiFbiMWxdJA6eN/Ty+cz8W6+Tm6QORPoMQzHn6+sIU5+I 35TePL22mLJ7P4Y91fUGmpIDYudPi9BJ5ol+M=*Domainkey-signature*: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=nLTequXwXwuiX/QgTcyxruivN6GzEBzqt1DxpzUY2uzJ1k8J/rPHJ25blEhvp+K5Sf hK+6EspYXrBR2+kBkxh2/eIU06EDa2VCe66G9NrKJkQOpTQkU8a4rptDiURAkNcg7rUs q1IKbB6cyu4OFs7+Q2X91uOrPiiTtcFiVOjzo=

Hi, This is mostly for Keir. A) Computing the natural Householder vector Corresponds to page 209 in Golub & van Loan. At this stage we don't normalize the 1st coordinate to be 1, so we don't have a beta. Let's just write down the general formula for the Householder vector. The input data is a vector x in C^n. (The space of n-vectors of complex numbers). The assumption is that x_0 != 0 (the 0th coordinate of x). Here we are not going to try to check if x_0 != 0; rather, it is the responsibility of the user to ensure that x_0 is as big as possible (in absolute value) for optimal numerical stability. The output must be a vector v in C^n such that the following /Householder reflection/ P, defined by P = Identity - 2/(v^* v) v v^* satisfies the condition that Px is a multiple of e_0, i.e. only its 0th coordinate may be nonzero. Here is a good formula giving such a vector v: v = x + ( x_0 / abs(x_0) ) norm(x) e_0 Notice that here, ( x_0 / abs(x_0) ) is just the sign of x_0 in case it is a real number; but it generalizes well to complex. norm(x) is just the usual L2 norm. Notive that v can be replaced by lambda * v, makes no difference. We want the 0th coordinate of v to be as big as possible (in abs value), that's why we choose ( x_0 / abs(x_0) ) here. B) Computing the "optimized" Householder vector Corresponds to 5.1.3 page 210 in Golub an optimized householder vector is a struct like that: struct Householder { Scalar beta; Vector essential (size n-1); }; The number beta won't be stored in the matrix, it's just used once and then forgotten (so this struct householder is not going to be used for long-term storage). The vector essential is the n-1 last coeffs of v/v0 (v normalized such that its 0th coeff is 1). So putting it all together our householder function could look like this (needs to be optimized!): void ei_compute_householder(const Vector& x of size n, Householder *h) { // compute natural householder vector v Vector v = x; v0 += ( x_0 / abs(x_0) ) norm(x); Vector v_starting_with_1 = v/v0; h->essential = v_starting_with_1.end(n-1); h->beta = 2 / v_starting_with_1.squaredNorm(); }; This function assumes that x0 is "nonzero". It's the responsibility of the user to only pass vectors x that are nonzero and whose 0th coord is not negligible compared to the rest. Notice that our function works then without any branching, contrary to Golub & van Loan's one. Yes again it needs to be optimized to avoid redundant array traversals. C) bidiagonalization The model here is Golub & van Loan algorithm 5.4.2. But contrary to it, we are going to do pivoting to ensure that we only pass vectors x to ei_compute_householder that have x0 as big as possible. That will guarantee optimal numerical stability. Look at the first householder step in bidiagonalization: we want to zero out the 0th column, except for its 0th coefficient. In this step, before computing the householder vector, we want the (0,0)-th matrix coeff to be as big as possible. There are two cases: 0) compute once and for all the max abs value of coefficients of the matrix. Call that "biggest". This only needs to be done once for the whole matrix bidiagonalization. 1) Find the row i of the biggest coefficient in the 0-th column. 2) if that coefficient is muchSmallerThan(biggest), then there's nothing to do for this column. Skip it. 3) if that coefficient is not muchSmallerThan(biggest), then swap row(i) with row(0). Keep track of that permutation (called a transposition because it's just swapping 2 and leaving the rest fixed) in an array of ints. At the end of the bidiagonalization, accumulate the transpositions into another array of ints, permutationP, allowing to look up directly where a given row has landed. You find a model of this in LU.h where we do the same. Notice that row/column permutations are unitary operations. I.e. they amount to multiply on the left, respectively on the right, by permutation matrices, which are unitary. Hence they are allowed in bidiagonalization, tridiagonalization, etc. So now that you're done with the 0th column, move on: 0th row, 1th col, 1 th row, 2nd col, etc... The reason why one produces bidiagonal form and can't right away produce the SVD, is that when you do the 0th row, you don't want to disturb the 0th column that you just did. So the unitary operation that you do must nor affect the 0th col, so this householder must concentrate on the (0,1)-th coeff, not the (0,0)-th coeff. D) Givens operations. Corresponds to 5.1.3 page 216 in Golub & van Loan. Here, the authors say that they want to take special care of the case where the floating point numbers at hand at so big (or small) that one can't square them without causing overflow (or underflow). I claim that this care is useless. My main argument is that if numbers are that close to the maximum order of magnitude, there are many operations that inherently fail on them. Basically: all products. I'm not interested in making costly contortions to make a few operations succeed on such numbers, when most operations are bound to fail. In practice, even if we did that, our users would still be forced to go for bigger floating point types for other reasons. What are we talking about anyway? For float the max is 1e+30 so we're talking about 1e+15. Users who want these orders of magnitude should use double. The speed advantage of float is taken away by the need to take all this special care. For double, the max is 1e+300 so we're taking about 1e+150. I doubt anybody uses such orders of magnitude. If they do get above 1e+150, then they're probably going to need also more than 1e+300 hence they need multiple precision numbers anyway. The point is that from 150 to 300 there is only a very small difference ... once you start looking at this on a log scale. (Yes so this is log log of the actual numbers, so what). That being said, we can go for a much simpler Givens: Given a NONZERO vector in C^2 (a) (b) A 2x2 complex unitary matrix zeroing the b-coordinate of that vector is given by (conj(a)/r conj(b)/r) (-b/r a/r) where r = sqrt(abs(a)^2 + abs(b)^2). At first glance i'm in favor of trying a givens function that just assumes the input vector (a,b) to be nonzero. If needed one can make it check for that, but in order to do that it would need context: what reference order of magnitude to use in order to determine what's "zero" ? For that reason I would first try a stupid version checking nothing and have the calling code (the diagonalization / final SVD method) do the checking upstream. I might be wrong, haven't looked very closely. E) the final step: SVD from bidiagonal, or diagonalization from tridiagonal. You're on your own :) i haven't looked closely at that. Cheers, Benoit

**Follow-Ups**:**Re: [eigen] notes on Householder and Givens***From:*Keir Mierle

**Messages sorted by:**[ date | thread ]- Prev by Date:
**Re: [eigen] External contributions** - Next by Date:
**Re: [eigen] External contributions** - Previous by thread:
**Re: [eigen] Sparse API** - Next by thread:
**Re: [eigen] notes on Householder and Givens**

Mail converted by MHonArc 2.6.19+ | http://listengine.tuxfamily.org/ |