Re: [eigen] solve API
• To: eigen@xxxxxxxxxxxxxxxxxxx
• Subject: Re: [eigen] solve API
• From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
• Date: Mon, 28 Jun 2010 08:35:18 -0400

2010/6/28 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Hi,
>
> I was thinking that we did not discussed the solved API for a while,
> at so it would be perfect time to discuss it again ;)

OK to add such new API, but let's keep the existing solve() API
alongside, because
- as you mention below, computer programmers will be expecting such
an imperative API;
- we're about to release beta1.

> Recall that one goal is to be able to write code as you write math. So
> here is the proposal:
>
> x = m.lu().inverse() * b;
> x = b * m.lu().inverse();
> x = m.lu().adjoint().inverse() * b;
> x = b * m.lu().adjoint().inverse();

I would much prefer:

Other than that, I completely agree with your proposal (as long as the
existing solve() API remains available) although I would like to
discuss the name of this method (see below).

Your proposal is Eigen-ish because it is object-oriented. The API only
returns an object, doesn't modify existing objects. I think the most
object-oriented API is one where all methods are const-qualified
except for operator=.

>
> .inverse() and .adjoint() would return simple proxy with operator *
> overloaded to return a "solve" proxy that we already have (so
> implementation-wise that's not much).
>
> I know that computer scientists used to other linear algebra packages
> might be shocked by such an API and will look forever for a function
> having "solve" in its name, but this is pure Eigen style IMO and it is
> consistent with the rest of the API (diagonal, permutation,
> transposition already use this API).

Yes, keeping the existing imperative solve() API is very important too.

The two can coexist, the only potential conflict that I see is in LU
classes that already have a inverse() method, we have to make sure
(say for beta2) that this doesn't break existing behavior (it's ok to
break some corner cases until 3.0 is released).

> It remains the problem of convenient "inplace" solving. Currently, when doing:

m.lu().inverse().applyOnTheLeft(x);
m.lu().inverse().applyOnTheRight(x);

as we already do in Householder?

Finally, let's discuss another issue: for decompositions that allow
non-invertible matrices, such as FullPivLU, calling this method
inverse() is a bit problematic. So here's a proposal. Since the
imperative API is:
m.lu().solve(x);
how about calling this method solve() so that the new API is really
very consistent with the old one:
m.lu().solve() * x;
Now you might object that since this method just returns an object, it
should be named not as a verb, but as a noun. So how about solver() ?
m.lu().solver() * x;

Either way (calling it solve() or solver()) we also solve the problem
of the potential conflict with existing inverse() methods (yes I know
these inverse() methods already return proxies, but those
ReturnByValue proxies offer much more API than we have to here, since
they are real expressions).

Cheers,
Benoit

> v = m.lu().solve(v);
>
> we can detect that we have the same object for the "x" and the "b" via
> one runtime pointer comparison, and then skip the matrix copy of b to
> x before the inplace solving path. So the same mechanism would still
> work:
>
> b = b * m.lu().adjoint().inverse();
>
> => inplace solving, no temporary, no matrix copy (except for the
> decomposition itself but that's another story).
>
> The pb, is when "b" is a complex expression:
>
> b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols))
> =m.lu().inverse() * b.block(i+s, j+block_size-1, rows-r,
> std::min(blocksize,cols));
>
> You don't want to write it twice!!
>
> Here a solveInPlace() function could make sense:
>
> m.lu().solveInPlace(b.block(i+s, j+block_size-1, rows-r,
> std::min(blocksize,cols)));
>
> better ? not so sure:
> 1) what about "ontheright" and transposed variants ?
> 2) and even worse, this does not work because block(...) returns a
> temporary object which has to be passed by const reference, so you
> have to name it anyway:
>
> Block<...> b_block(n,i+s, j+block_size-1, rows-r, std::min(blocksize,cols));
>
> and then repeating twice b_block is fine.
>
> As some of you probably guessed, I did not slept well this night....
>
> cheers,
>
> gael
>
>
>

 Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/