Re: [eigen] notes on Householder and Givens


*To*: eigen@xxxxxxxxxxxxxxxxxxx
*Subject*: Re: [eigen] notes on Householder and Givens
*From*: Keir Mierle <mierle@xxxxxxxxx>
*Date*: Wed, 4 Feb 2009 09:45:04 -0800

Cool, thanks for taking the time to write this up!

On Wed, Feb 4, 2009 at 7:04 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:


Hi,

This is mostly for Keir.

A) Computing the natural Householder vector

Corresponds to page 209 in Golub & van Loan.

At this stage we don't normalize the 1st coordinate to be 1, so we don't have a beta. Let's just write down the general formula for the Householder vector.

The input data is a vector x in C^n (the space of n-vectors of complex numbers). The assumption is that x_0 != 0 (the 0th coordinate of x). Here we are not going to try to check whether x_0 != 0; rather, it is the responsibility of the user to ensure that x_0 is as big as possible (in absolute value) for optimal numerical stability.

The output must be a vector v in C^n such that the /Householder reflection/ P, defined by

P = Identity - 2/(v^* v) v v^*

satisfies the condition that Px is a multiple of e_0, i.e. only its 0th coordinate may be nonzero.

Here is a good formula giving such a vector v:

v = x + ( x_0 / abs(x_0) ) norm(x) e_0

So if x is close to -alpha*e_0, won't catastrophic cancellation occur? Isn't this exactly the case that we'll be producing by doing the pivoting below?

Notice that here, ( x_0 / abs(x_0) ) is just the sign of x_0 in case it is a real number, but it generalizes well to complex; norm(x) is just the usual L2 norm.

Notice also that v can be replaced by lambda * v; it makes no difference. We want the 0th coordinate of v to be as big as possible (in absolute value); that's why we choose the factor ( x_0 / abs(x_0) ) here.

B) Computing the "optimized" Householder vector

Corresponds to 5.1.3, page 210, in Golub & van Loan.

An optimized Householder vector is a struct like this:

struct Householder
{
    Scalar beta;
    Vector essential; // size n-1
};

The number beta won't be stored in the matrix; it's just used once and then forgotten (so this Householder struct is not going to be used for long-term storage).

The vector essential contains the n-1 last coefficients of v/v_0 (v normalized such that its 0th coefficient is 1).

So, putting it all together, our householder function could look like this (it needs to be optimized!):

void ei_compute_householder(const Vector& x, Householder *h) // x has size n
{
    // compute the natural householder vector v
    Vector v = x;
    v[0] += ( x[0] / abs(x[0]) ) * norm(x);
    Vector v_starting_with_1 = v / v[0];
    h->essential = v_starting_with_1.end(n-1);
    h->beta = 2 / v_starting_with_1.squaredNorm();
}

This function assumes that x_0 is "nonzero": it's the responsibility of the user to only pass vectors x that are nonzero and whose 0th coordinate is not negligible compared to the rest.

Notice that our function then works without any branching, contrary to the one in Golub & van Loan. Again, it needs to be optimized to avoid redundant array traversals.

C) Bidiagonalization

The model here is Golub & van Loan algorithm 5.4.2. But contrary to it, we are going to do pivoting to ensure that we only pass vectors x to ei_compute_householder that have x_0 as big as possible. That will guarantee optimal numerical stability.

Look at the first Householder step in bidiagonalization: we want to zero out the 0th column, except for its 0th coefficient. In this step, before computing the Householder vector, we want the (0,0)-th matrix coefficient to be as big as possible.

There are two cases; here is the procedure (steps 2 and 3 below are the two cases):

0) Compute once and for all the max abs value of the coefficients of the matrix; call that "biggest". This only needs to be done once for the whole matrix bidiagonalization.

1) Find the row i of the biggest coefficient in the 0th column.

2) If that coefficient is muchSmallerThan(biggest), then there's nothing to do for this column: skip it.

3) If that coefficient is not muchSmallerThan(biggest), then swap row(i) with row(0). Keep track of that permutation (called a transposition because it just swaps 2 elements and leaves the rest fixed) in an array of ints. At the end of the bidiagonalization, accumulate the transpositions into another array of ints, permutationP, allowing you to look up directly where a given row has landed. You will find a model of this in LU.h, where we do the same.

Notice that row/column permutations are unitary operations, i.e. they amount to multiplying on the left (respectively on the right) by permutation matrices, which are unitary. Hence they are allowed in bidiagonalization, tridiagonalization, etc.

So now that you're done with the 0th column, move on: 0th row, 1st col, 1st row, 2nd col, etc...

The reason why one produces bidiagonal form, and can't right away produce the SVD, is that when you do the 0th row, you don't want to disturb the 0th column that you just did. So the unitary operation that you apply must not affect the 0th col; hence this Householder must concentrate on the (0,1)-th coeff, not the (0,0)-th coeff.

D) Givens operations.

Corresponds to 5.1.3, page 216, in Golub & van Loan.

Here, the authors say that they want to take special care of the case where the floating point numbers at hand are so big (or small) that one can't square them without causing overflow (or underflow).

I claim that this care is useless. My main argument is that if numbers are that close to the maximum order of magnitude, there are many operations that inherently fail on them: basically, all products. I'm not interested in making costly contortions to make a few operations succeed on such numbers when most operations are bound to fail. In practice, even if we did that, our users would still be forced to go for bigger floating point types for other reasons.

What are we talking about anyway?

For float the max is about 3e+38, so we're talking about numbers around 2e+19. Users who want these orders of magnitude should use double; the speed advantage of float is taken away by the need to take all this special care.

For double, the max is about 2e+308, so we're talking about numbers around 1e+154. I doubt anybody uses such orders of magnitude. If they do get above 1e+154, then they're probably also going to need more than 2e+308, hence they need multiple precision numbers anyway. The point is that from 154 to 308 there is only a very small difference ... once you start looking at this on a log scale. (Yes, so this is log log of the actual numbers, so what.)

That being said, we can go for a much simpler Givens. Given a NONZERO vector in C^2,

( a )
( b )

a 2x2 complex unitary matrix zeroing the b-coordinate of that vector is given by

( conj(a)/r   conj(b)/r )
(   -b/r         a/r    )

where r = sqrt(abs(a)^2 + abs(b)^2).

At first glance I'm in favor of trying a givens function that just assumes the input vector (a,b) to be nonzero.

If needed, one can make it check for that, but in order to do so it would need context: what reference order of magnitude should it use to determine what counts as "zero"?

For that reason I would first try a simple version that checks nothing, and have the calling code (the diagonalization / final SVD method) do the checking upstream. I might be wrong; I haven't looked very closely.

E) the final step: SVD from bidiagonal, or diagonalization from tridiagonal.

You're on your own :) I haven't looked closely at that.

Cheers,

Benoit

**Follow-Ups**: **Re: [eigen] notes on Householder and Givens** (*From:* Benoit Jacob)

**References**: **[eigen] notes on Householder and Givens** (*From:* Benoit Jacob)
