Re: [eigen] on fuzzy comparisons



On Thursday 05 June 2008 19:24:15 Gael Guennebaud wrote:
> Hi,
>
> thanks a lot for this very detailed answer. Yes, the rotational invariant
> argument convinced me about the L2 norm.

Hi,

glad that you liked this argument. But apparently it didn't convince you very 
deeply, because at the end, you write:

> I still cannot understand why:
>
> [a, a, a] << b
>
> and
>
> [a, ........, a] << b
>
> (with [a, ........, a] very large and a and b two scalars)
>
> could give two different answers. To be clear: I mean that IMO they should
> return the same result, unlike the current isMuchSmallerThan.

Let me give a very striking example (I hope).

For any vector v, there exists a rotation sending the vector v to the vector
[ v.norm() , 0, 0, .... , 0 ]

In particular, taking v = [ a, ...., a ], with the entry a repeated n times,
there exists a rotation sending [ a, ...., a ] to the vector
[ abs(a) * sqrt(n), 0, 0, ..., 0 ]

So if you do a comparison based on the coeffs and want it to be
rotation-invariant, basically you are stuck: for [ a, ...., a ] the biggest
coeff is abs(a), while for the rotated vector it is abs(a)*sqrt(n). Since
sqrt(n) gets arbitrarily large, any comparison based on the biggest coeff
becomes meaningless, at any precision level, for large enough n.
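
Here is a small sketch of that in code (using Eigen's VectorXd just for
illustration; the point is the numbers, not the API):

#include <Eigen/Dense>
#include <iostream>

int main()
{
  const double a = 0.01;
  for (int n : {3, 100, 10000})
  {
    // v = [ a, ..., a ] with n entries.
    Eigen::VectorXd v = Eigen::VectorXd::Constant(n, a);
    // w = the image of v under a rotation sending v onto the first axis.
    Eigen::VectorXd w = Eigen::VectorXd::Zero(n);
    w(0) = v.norm();  // abs(a) * sqrt(n)
    std::cout << "n=" << n
              << "  biggest coeff of v: " << v.cwiseAbs().maxCoeff()  // abs(a)
              << "  biggest coeff of w: " << w.cwiseAbs().maxCoeff()  // abs(a)*sqrt(n)
              << "  norms: " << v.norm() << " == " << w.norm() << "\n";
  }
}

The norms of v and w agree for every n, while their biggest coefficients
drift apart by the factor sqrt(n).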

> BTW, by "size" I meant the number of entries and not the length or norm or
> whatever else.

Sure -- I understood that. I should have used a word other than "size",
maybe "bigness" (!!), to denote the notion of how-big-a-vector-is that we are
discussing here, and which I say is just the norm.

>
> Maybe my problem comes from the semantics of isApprox and isMuchSmallerThan.
>
> From my point of view, isApprox(m1,m2) should return m1==m2 if there were no
> numerical errors at all. Because of *numerical issues*, we have to introduce
> a fuzzy comparison with some eps. Same for isMuchSmallerThan which, ideally,
> could simply be: m1==0. Since they are related to numerical issues, I don't
> see why an n x m matrix could not be considered as a vector with n*m
> entries?
>
> What about the Frobenius norm, which simply considers the matrix as a 1D
> vector and computes the L2 norm? It is also invariant under rotation and
> very cheap.

This is a very good remark.

Let me first explain why matrixNorm remains the ideal that we want to tend to.

To say that m.matrixNorm() << a means exactly that for every vector v,
m * v << a * v (where vectors are compared with norm()).

This time, this really is a theorem, so using matrixNorm brings the huge
advantage of compatibility between << for matrices and << for vectors.

Note in particular that matrixNorm degenerates to just norm in the case of a 
1-column or 1-row matrix.

Likewise, this would mean that isApprox for matrices is compatible with
isApprox for vectors, i.e.
if m1.isApprox(m2) and v.norm() <= 1 then (m1*v).isApprox(m2*v).

And as I said matrixNorm has rotation invariance, adjoint invariance, 
transpose invariance, ...
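
To make these claims concrete, here is a small numeric sketch. The matrixNorm
helper and its computation via the largest singular value (with Eigen's
JacobiSVD) are my own illustration, not an existing Eigen function:

#include <Eigen/Dense>
#include <iostream>

// L2 operator norm = largest singular value (Eigen sorts them decreasingly).
double matrixNorm(const Eigen::MatrixXd& m)
{
  return Eigen::JacobiSVD<Eigen::MatrixXd>(m).singularValues()(0);
}

int main()
{
  Eigen::MatrixXd m = Eigen::MatrixXd::Random(4, 4);
  Eigen::VectorXd v = Eigen::VectorXd::Random(4);

  // Compatibility with the vector norm: this inequality always holds.
  std::cout << (m * v).norm() << " <= " << matrixNorm(m) * v.norm() << "\n";

  // For a 1-column matrix, matrixNorm degenerates to the plain vector norm.
  Eigen::MatrixXd c = Eigen::MatrixXd::Random(5, 1);
  std::cout << matrixNorm(c) << " == " << c.col(0).norm() << "\n";

  // Rotation and transpose invariance, with a random orthogonal Q from a QR.
  Eigen::MatrixXd Q = Eigen::HouseholderQR<Eigen::MatrixXd>(
                          Eigen::MatrixXd::Random(4, 4)).householderQ();
  std::cout << matrixNorm(m) << " == " << matrixNorm(Q * m)
            << " == " << matrixNorm(m.transpose()) << "\n";
}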

So you see, using norm for vectors and matrixNorm for matrices has unmatched 
compatibility advantages. I could go on and on.

So, I hope this proves that matrixNorm is really what we want to tend to.

Now you might recall this theorem: in a finite-dimensional space, all norms 
are equivalent.

So for n x n matrices, there exist constants 0 < K < L < infinity such that

K * L2norm  <= matrixNorm <= L * L2norm 
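
(In fact, for these two norms the constants are classical: one can take
K = 1/sqrt(n) and L = 1, since

matrixNorm <= L2norm <= sqrt(n) * matrixNorm

for every n x n matrix; both inequalities follow from writing the two norms
in terms of the singular values.)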

Your remark that this is all just a concern about numerical stability amounts
to saying that these two inequalities make matrixNorm close to zero exactly
when L2norm is close to zero. That's right!

So, if I understood correctly, what you proposed is to compute the L2norm and
then take K and L into account.

However, when n tends to infinity, K = 1/sqrt(n) tends to zero while L = 1
stays fixed, so the ratio L/K = sqrt(n) tends to infinity.

This means that the L2norm gives less and less precise information about the
matrixNorm. For any given precision level, computing the L2norm becomes
meaningless for our purpose (knowing about the matrixNorm) once n is large
enough.
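
The identity matrix already shows the sqrt(n) gap; a quick numeric sketch
(same illustrative way of computing matrixNorm as above):

#include <Eigen/Dense>
#include <iostream>

int main()
{
  for (int n : {2, 10, 100})
  {
    Eigen::MatrixXd id = Eigen::MatrixXd::Identity(n, n);
    double frob = id.norm();  // Frobenius/L2 norm: sqrt(n) for the identity
    double op = Eigen::JacobiSVD<Eigen::MatrixXd>(id).singularValues()(0);  // 1
    std::cout << "n=" << n << "  L2norm/matrixNorm = " << frob / op << "\n";
  }
}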

You might ask why I insist so much on matrixNorm rather than L2norm. If so,
keep in mind the example I gave at the beginning of this mail: it was for
vectors, but it also applies to 1-column matrices, since matrixNorm is just
norm in that case.

I showed that the norm was the good notion. So in all this business with K
and L, we are just trying to approximate the matrixNorm with the L2norm, but
the matrixNorm remains what we really want. If L/K becomes arbitrarily big,
our approximation becomes arbitrarily imprecise.

In short, the L2norm may be a good approach for small matrices, that is,
whenever n is small enough that L/K is of order 1.

Since that leaves open the question of what to do otherwise, I don't think
it's such a good solution :)

The current solution is not very good either, but it has a firmer geometrical
meaning, as I explained in the previous mail;

of course, the only good solution remains matrixNorm.
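
In code, a matrixNorm-based isMuchSmallerThan against a scalar could look
like this (a sketch with my own naming and an arbitrary default precision,
not Eigen's actual implementation):

#include <Eigen/Dense>
#include <cmath>

// m << other  iff  matrixNorm(m) <= prec * abs(other)
bool isMuchSmallerThanScalar(const Eigen::MatrixXd& m, double other,
                             double prec = 1e-12)
{
  double matrixNorm = Eigen::JacobiSVD<Eigen::MatrixXd>(m).singularValues()(0);
  return matrixNorm <= prec * std::abs(other);
}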

Cheers,

Benoit

>
> An example of the current issue:
>
>
> Another example:
>
> [ 0.012  0.023 ]      <<    [ 0.012   2.76e6 ]
> [ 0.014  0.098 ]            [ 0.0139  8.87e7 ]
>
> currently returns false, and true if you transpose. It is clear that in
> this example we are actually doing:
>
> [ 0  0 ]      <<    [ 0  2.76e6 ]
> [ 0  0 ]            [ 0  8.87e7 ]
>
> and the correct answer should be true?
>
>
>
> Now let's talk about (m1 << b) with b a scalar. An example: imagine I have
> a matrix m1, and I want to check if it is zero. Since I know the order of
> the values I manipulate is around 1 to 10, checking if (m << 1) looks
> reasonable? And to be honest, I really don't understand why I should write
> (m << m.constant(1)) rather than (m << 1), or the other way around, I
> don't know....
>
> I still cannot understand why:
>
> [a, a, a] << b
>
> and
>
> [a, ........, a] << b
>
> (with [a, ........, a] very large and a and b two scalars)
>
> could give two different answers. To be clear: I mean that IMO they should
> return the same result, unlike the current isMuchSmallerThan.
>
> Then, why could (m << a) not be defined as:
>
> m.frobeniusNorm() < eps * sqrt(m.cols()*m.rows()) * a
>
> ??
>
> BTW, by "size" I meant the number of entries and not the length or norm or
> whatever else.
>
>
> Gael.



