Re: [eigen] on fuzzy comparisons
• To: eigen@xxxxxxxxxxxxxxxxxxx
• Subject: Re: [eigen] on fuzzy comparisons
• From: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
• Date: Thu, 5 Jun 2008 19:24:15 +0200

Hi,

thanks a lot for this very detailed answer. Yes, the rotational invariant argument convinced me about the L2 norm.

Maybe my problem comes from the semantic of isApprox and isMuchSmallerThan.

From my point of view, isApprox(m1,m2) should return m1==m2 if there was no numerical error at all. Because of *numerical issues*, we have to introduce a fuzzy comparison with some eps. Same for isMuchSmallerThan which, ideally, could simply be: m1==0. Since they are related to numerical issue I don't see why a n x m matrix could not be considered as a n x m vector ?

What about the Frobenius norm which simply considers the matrix as a 1D vector and compute the L2 norm ? It is also invariant under rotation and very cheap.

An example of the current issue:

Another example:

[ 0.012  0.023 ]      <<    [0.012    2.76e6]
[ 0.014  0.098 ]              [0.0139  8.87e7]

currently returns false and true if you transpose. It is clear than in this example we are actually doing:

[ 0  0 ]      <<    [0  2.76e6]
[ 0  0 ]              [0  8.87e7]

and the correct answer should be true ?

Now let's talks about (m1 << b) with b scalar. An example: imagine I have a matrix m1, and I want to check if it is zero. Since I know the order of the values I manipulate is around 1 and 10, checking if (m << 1) looks reasonable ? and to be honest I really don't understand why I should write (m << m.constant(1)) rather than (m << 1), or the inverse, I don't know....

I still cannot understand why:

[a, a, a] << b

and

[a, ........, a] << b

(with [a, ........, a] very large and a and b two scalars)

could give two different answers. To be clear: I mean that IMO they should return the same result, unlike the current isMuchSmallerThan.

Then,  why (m << a) could not be defined as :

m.frobeniusNorm() < eps * sqrt(m.cols()*m.rows()) *  a

??

BTW, by "size" I meant the number of entries and not the length or norm or whatever else.

Gael.

On Thu, Jun 5, 2008 at 6:04 PM, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
Hi,

Executive summary:
- for vectors, the current approach is correct, and i can justify it.
- for matrices, the good solution would be extremely costly so the current
solution is a compromise and I am open to suggestions if you think another
compromise is better.

*** Vectors ***

There is absolutely no reason whatsoever why
(v << [a,....,a] ) should be equivalent to (v << a).

As n increases, the vector [a,...,a] with a repeated n times, gets bigger and
bigger. Its norm, which is equal to abs(a)*sqrt(n), gets arbitrarily large.

When you imply that (v << [a,....,a] ) should be equivalent to (v << a),
implicitly you seem to believe that [a,...,a] has intuitively the "same size"
regardless of n. But that is not at all the case.

To make this any precise, we need to agree on what we call the "size" of a
vector. I'm saying that the only good such notion is the vector norm. You
were proposing to look at the sizes (absolute values) of individual
coefficients in the vector, and I'm saying that's a wrong approach! Indeed, I
suppose that we agree that rotating a vector does not change its size. In
general n-space, "rotating" means "apply a unitary matrix" where unitary
a.k.a. orthogonal means a matrix u such that uu*=u*u=1. The vector norm is
invariant under such a "rotation": whenever u is unitary and v is a vector,
norm(uv)=norm(v). But the absolute values of the coeffs (or the max of them)
are not at all invariant under such "rotations" !

I hope to have convinced you that:
(v << [a,....,a] ) should be equivalent to
(v << norm([a,....,a])=sqrt(n)*abs(a)).

And more generally that:
v << w should be equivalent to v << norm(w)
which should be equivalent to norm(v) << norm(w).

*** Matrices ***

Let's state right away what the real good notion of size is for a matrix is:
it's matrixNorm(). So theoretically,

m1 << m2 should be equivalent to m1.matrixNorm() << m2.matrixNorm()
and
m << x should be equivalent to m.matrixNorm() << abs(x)  for any scalar x.

The matrix norm has all the good properties you expect; m and umu* have the
same norm for any unitary u so it's invariant under rotation; also m and m*
have the same norm.

Unfortunately... this is very costly to compute and depends on the QR module!

If you think that we don't need matrix comparisons in the Core module and that
it's OK to have them being quite costly, then if you like we can move that
stuff to QR and use matrixNorm(). In fact, AFAICR only the unit-tests use
matrix comparisons and it's quite OK to have them depend on QR.

So yes the current solution is suboptimal; but at least it retains some
mathematical properties.

What the current solution means, is that:

*****************************************************************
m1 << m2 if and only if we have m1*v << m2*v for all vectors v
*****************************************************************

Sounds reasonable doesn't it? I am very attached to the fact that this
relation has such a simple geometrical meaning. The implemented algorithm
amounts to check that for v running over the canonical basis, and this is
equivalent to checking it for all vectors v.

It is inspired by the so-called "strong topology" on operators.

If really it is important to you that the matrix comparisons behave well with
respect to taking adjoints, then what you're after is called the "*-strong
topology" and concretely this better relation (let's denote it by <<<) is
just defined by:

m1 <<< m2 if and only if m1 << m2 and m1* << m2*.

transposes; it is just twice slower unless marked<SelfAdjoint>().

There remains the possibility, evocated above, of using matrixNorm() to get
the best possible notion of matrix comparisons.

Cheers,

Benoit

> Things get weirder when we compare matrices: the comparison is done per
> column and each column has to satisfy the above criterion.
> Then it is very easy to build example where:
>
> m << a   !=    m' << a
>
> (where ' denotes the transpose)
>
> I understand such a definition comes from math when you interpret a matrix
> as an operator, but here I think they should be considered like big 1D
> vectors ?
>
> Actually a related question is whether or not all the vector/matrix entries
> are assumed to be of the same order or not. I guess their is no other
> choice to assume that, and if it's not the case then it's up to the user to
> do the comparisons per blocks...
>
> Then what about comparisons based on the max norm:
>
> v.isApprox(w)   ==  (v-w).cwiseAbs().maxCoeff() < eps *  min( v.order() ,
> w.order() );
>
> v << w == v.cwiseAbs().maxCoeff() < eps * w.order()
>
> v << a  ==  v.cwiseAbs().maxCoeff() < eps * |a|
>
> with the same definitions for vectors and matrices, and order could be
> defined like that:
>
> m.order() == m.cwiseAbs().maxCoeff();   // not very stable I guess
>
> m.order() == m.norm() / m.size();
>
> m.order() == m.cwiseAbs().sum()  / m.size();
>
>
> cheers,
> Gael.

 Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/