Re: [eigen] solve API
• To: eigen@xxxxxxxxxxxxxxxxxxx
• Subject: Re: [eigen] solve API
• From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
• Date: Mon, 28 Jun 2010 15:19:55 +0200
• Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:content-transfer-encoding; b=Piq00sfWU64LoCXkvFrQgp5aa70WjYGqlMrgXElBcfenSssWyjdsd0LVfzb3sQW6z+ vrX5QHTjwwcJrcBWg1vceS3emoOZCYaNZYol7Wml5wfC6aNKiTRAfnHmjHJT3YstL6E7 tUX/xiG/3+n50umnaKwoZtjhKLeeg6xOd1D+U=

```On Mon, Jun 28, 2010 at 2:35 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/6/28 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
>> Hi,
>>
>> I was thinking that we did not discussed the solved API for a while,
>> at so it would be perfect time to discuss it again ;)
>
> OK to add such new API, but let's keep the existing solve() API
> alongside, because
>  - as you mention below, computer programmers will be expecting such
> an imperative API;
>  - we're about to release beta1.

yes, I was thinking the same actually.

>
>> Recall that one goal is to be able to write code as you write math. So
>> here is the proposal:
>>
>> x = m.lu().inverse() * b;
>> x = b * m.lu().inverse();
>> x = m.lu().adjoint().inverse() * b;
>> x = b * m.lu().adjoint().inverse();
>
> I would much prefer:
>
> x = m.lu().inverse().adjoint() * b;
> x = b * m.lu().inverse().adjoint();

and this is simpler because m_lu().adjoint() would not make much sense.

> Other than that, I completely agree with your proposal (as long as the
> existing solve() API remains available) although I would like to
> discuss the name of this method (see below).
>
> Your proposal is Eigen-ish because it is object-oriented. The API only
> returns an object, doesn't modify existing objects. I think the most
> object-oriented API is one where all methods are const-qualified
> except for operator=.
>
>>
>> .inverse() and .adjoint() would return simple proxy with operator *
>> overloaded to return a "solve" proxy that we already have (so
>> implementation-wise that's not much).
>>
>> I know that computer scientists used to other linear algebra packages
>> might be shocked by such an API and will look forever for a function
>> having "solve" in its name, but this is pure Eigen style IMO and it is
>> consistent with the rest of the API (diagonal, permutation,
>> transposition already use this API).
>
> Yes, keeping the existing imperative solve() API is very important too.
>
> The two can coexist, the only potential conflict that I see is in LU
> classes that already have a inverse() method, we have to make sure
> (say for beta2) that this doesn't break existing behavior (it's ok to
> break some corner cases until 3.0 is released).
>
>> It remains the problem of convenient "inplace" solving. Currently, when doing:
>
> What do you think about:
>
>    m.lu().inverse().applyOnTheLeft(x);
>    m.lu().inverse().applyOnTheRight(x);
>
> as we already do in Householder?

yes, but there is still the const reference issue.... oh wait, this
API is also very confusing. For Householder it is like this:

Matrix A;
HouseholderSequence H;

A.applyOnTheRight(H);

<=>

A = A * H;

So here it would be:

B.applyOnTheRight(A.lu().inverse());

This solve the const reference issue, but this now looks very strange to me..

I remember the rationale was to mimic operator*=:

A.applyOnTheRight(H); <=> A *= H;
A.applyOnTheLeft(H);   <=> no equivalent whence the introduction of
these functions.

But if both of us got confused... that really means there is something
wrong here.

> Finally, let's discuss another issue: for decompositions that allow
> non-invertible matrices, such as FullPivLU, calling this method
> inverse() is a bit problematic. So here's a proposal. Since the
> imperative API is:
>    m.lu().solve(x);
> how about calling this method solve() so that the new API is really
> very consistent with the old one:
>    m.lu().solve() * x;
> Now you might object that since this method just returns an object, it
> should be named not as a verb, but as a noun. So how about solver() ?
>    m.lu().solver() * x;
>
>
> Either way (calling it solve() or solver()) we also solve the problem
> of the potential conflict with existing inverse() methods (yes I know
> these inverse() methods already return proxies, but those
> ReturnByValue proxies offer much more API than we have to here, since
> they are real expressions).

I really prefer inverse() because it is very important for non linear
solvers to have a consistent API such that one can write a generic
function and apply the inverse of a given argument without knowing its
type (could be a diagonal, a permutation, the identity matrix, a LLT
factorization, etc.)

Also mat = whateverdec.inverse(); should work. So no issue with the
current inverse() methods.

For rectangular matrices, inverse still make sense as the pseudo
inverse. The weird thing is that the returned Inverse expression would
become the left pseudo inverse or right pseudo inverse at the moment
it is applied to a matrix:

Matrix m(r,c);

x = m.svd().inverse() * m; // left pseudo inverse, x = identity(c,c);
x = m * m.svd().inverse(); // right pseudo inverse, x = identity(r,r);

meaning that:

mat = svd.inverse();

would be undefined (illegal), but I think that's fine.

gael

>
> Cheers,
> Benoit
>
>> v = m.lu().solve(v);
>>
>> we can detect that we have the same object for the "x" and the "b" via
>> one runtime pointer comparison, and then skip the matrix copy of b to
>> x before the inplace solving path. So the same mechanism would still
>> work:
>>
>> b = b * m.lu().adjoint().inverse();
>>
>> => inplace solving, no temporary, no matrix copy (except for the
>> decomposition itself but that's another story).
>>
>> The pb, is when "b" is a complex expression:
>>
>> b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols))
>> =m.lu().inverse() * b.block(i+s, j+block_size-1, rows-r,
>> std::min(blocksize,cols));
>>
>> You don't want to write it twice!!
>>
>> Here a solveInPlace() function could make sense:
>>
>> m.lu().solveInPlace(b.block(i+s, j+block_size-1, rows-r,
>> std::min(blocksize,cols)));
>>
>> better ? not so sure:
>> 1) what about "ontheright" and transposed variants ?
>> 2) and even worse, this does not work because block(...) returns a
>> temporary object which has to be passed by const reference, so you
>> have to name it anyway:
>>
>> Block<...> b_block(n,i+s, j+block_size-1, rows-r, std::min(blocksize,cols));
>>
>> and then repeating twice b_block is fine.
>>
>> As some of you probably guessed, I did not slept well this night....
>>
>> cheers,
>>
>> gael
>>
>>
>>
>
>
>

```

 Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/