Re: [eigen] solve API



Thank you for taking part in the discussion :)

On Mon, Jun 28, 2010 at 4:42 PM, Manoj Rajagopalan <rmanoj@xxxxxxxxx> wrote:
>
>
>
> On Monday 28 June 2010 05:42:49 am Gael Guennebaud wrote:
>
>> Seriously, currently our solvers only allow you to compute x = A^-1 b
>> via some decompositions of A. However, it is sometimes needed to also
>> solve for x = A^-T b or x = b A^-1.
>
> ==== Great! This is very useful in building (left-)right-preconditioned sparse solvers using Eigen as
> the implementation.
>
> Here are my two cents. I am still learning how Eigen3 is implemented, so some of my suggestions
> might not be implementable because of difficulties with expression templates, but please educate me.
>
>
> [I] How about free functions: (with appropriate template params)
>
>   MatrixBase solve(MatrixBase const& A, MatrixBase const& b); // on-left
>   MatrixBase solve(MatrixBase const& b, MatrixBase const& A); // on-right

How do you distinguish between the two functions? They have the same prototype...

>
>  Usage:
>
>    x = solve(A,b); // left
>    x = solve(b,A); // right

Same problem. When I see:

X = solve(B,A);

do you want X = B^-1 A or X = B A^-1? We cannot know that for you. (X
and B can be matrices, not only vectors.)

Another problem is that we often need to compute a decomposition once and
then reuse it multiple times, and not only for solving. See for instance the
rich API of the LU decomposition. Your proposal does not cover such use cases.
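
With the current API that is, e.g. (a minimal sketch; A, b1, b2 are assumed
to be filled elsewhere):

#include <Eigen/Dense>
using namespace Eigen;

// decompose once, then reuse the factorization, and not only for solving
void reuse_demo(const MatrixXd& A, const VectorXd& b1, const VectorXd& b2)
{
  FullPivLU<MatrixXd> lu(A);    // one factorization of A
  VectorXd x1 = lu.solve(b1);   // first right-hand side
  VectorXd x2 = lu.solve(b2);   // second one, no re-factorization
  double d = lu.determinant();  // the decomposition is useful beyond solve():
                                // determinant(), rank(), kernel(), image(), ...
}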

>    x = solve(b, A.triangularView<Lower>()); // Gael's example
>
> ** The solve() function would return some matrix expression (a "proxy" in Eigen-speak?)
> that would get evaluated and compiled into the appropriate left- or right-solution when
> the = expression is evaluated. At that point, pointer checks would determine whether b and x are
> aliased, so that an in-place solve could be called instead?

Neglecting the above issues, yes.

>
> ** The issue with this approach is which method to use for solving: LU or something else? With A.lu().solve()
>    this issue is pre-empted.
>
> ** We could rename the free solve() function to lu_solve() to circumvent this issue. We could also provide
>    an optional template parameter to specify full or partial pivoting.


We are speaking at different levels. Here I assume the user already has a
decomposition at hand, and I want a nice and flexible API to use it.
For instance, in the future we will allow update/downdate of an LLT
factorization, etc. Basically, a decomposition is just a different way
to represent a given matrix. Here is an example:

LLT<MatrixXd> llt(mat);

x = llt.inverse() * b;
llt.update(...);
x = llt.inverse() * b;
llt.update(...);
x = llt.inverse() * b;
....


>
> ** This would still, internally, use the Eigen class API to implement the actual solution (Gael's/Benoit's suggestions)
>    but the user would remain insulated from any such changes. The free functions would export a standard,
>    readable interface and hide the details from the user.

Yes, what you are suggesting is something built on top of a lower-level
API. This is something a user can do for himself if he needs it, but
I feel that is out of the scope of Eigen.
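
That said, such a wrapper is easy to write on top of what we already have;
roughly (lu_solve is a hypothetical name, not an Eigen function):

#include <Eigen/Dense>

// hypothetical user-side helper built on top of the existing lower-level API:
// it hides the decomposition behind a free function
template <typename MatA, typename MatB>
typename MatB::PlainObject lu_solve(const Eigen::MatrixBase<MatA>& A,
                                    const Eigen::MatrixBase<MatB>& b)
{
  // lu() factorizes A (partial pivoting) and solve() computes A^-1 * b
  typename MatB::PlainObject x = A.lu().solve(b);
  return x;
}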

>
> ** For "b" being a complex expression we could have a free solve_in_place()
>    // on left
>    solve_in_place(A.selfadjointView<Upper>(),
>                   b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols)));
>    // on right
>    solve_in_place(b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols)),
>                   A.selfadjointView<Upper>());
>
> ***   The block above will work only if passed by value (it is a temporary), but this special case can be
> handled by template specialization, I suppose.

Unfortunately not. The only "simple" solution is to take the arguments by
const reference and const_cast them... that is not ideal.
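
To be concrete, the workaround would look roughly like this (solve_in_place
is a hypothetical free function; I take a triangular solve as the simplest
in-place example):

#include <Eigen/Dense>

// the temporary Block expression binds to a const reference, and the
// constness is casted away to write through the proxy into the underlying
// matrix -- it works, but it is not ideal
template <typename Lhs, typename Rhs>
void solve_in_place(const Lhs& A, const Rhs& b_)
{
  Rhs& b = const_cast<Rhs&>(b_);  // b_ may be a temporary proxy such as a Block
  A.solveInPlace(b);              // e.g. Lhs = TriangularView<...,Lower>
}

// usage: solve_in_place(A.triangularView<Eigen::Lower>(), b.block(0, 0, n, 1));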

> *** The free solve() functions could internally call solve_in_place() if x and b are aliases. The user will have
> some choice too.
>
>
> [II] x = m.lu().solve<OnTheRight | Adjoint>(b);
>   seems ok to me. Is the following a little better and possible?
>
>    x = m.adjoint().lu().solve<OnTheRight>(b);

Yes, but:
- this requires one useless transposition (m.adjoint() has to be evaluated
into a temporary before the decomposition);
- what if I want to compute both m^-1 * b and m^-T * b? I would have
to do the decomposition twice, which is exactly what I want to avoid (see the
sketch below).
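
To illustrate the second point, with that syntax getting both solutions today
means something like this (a sketch; for a real matrix, adjoint() is just the
transpose):

#include <Eigen/Dense>
using namespace Eigen;

// with the m.adjoint().lu() syntax, getting both m^-1 * b and m^-T * b
// means factorizing the matrix twice
void both_solutions(const MatrixXd& m, const VectorXd& b)
{
  PartialPivLU<MatrixXd> lu1(m);            // factorization of m   ->  m^-1 * b
  PartialPivLU<MatrixXd> lu2(m.adjoint());  // factorization of m^T ->  m^-T * b
  VectorXd x1 = lu1.solve(b);
  VectorXd x2 = lu2.solve(b);
}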

>
>   (see also, item [IV] below)
>
>
> [III] IMHO, m.lu().solve(x)  is more readable than  m.lu().solve() * x  or m.lu().solver()*x
>   - the latter is a deviation from the semantics that the inverse is an operator or function
>    that operates on an input vector. If "solve" or "solver" were replaced with "inverse" this
>    would be more clear. But implementation issues you refer to might make this difficult.

I agree, inverse is better :) and I'm optimistic regarding the implementation.

Maybe a drawback is that a user would be more tempted to use inverse()
in complex expressions and thus explicitly evaluate the inverse, e.g.:

x = 2 * lu.inverse().adjoint() * b;

Because of operator precedence, this will first evaluate (2 *
lu.inverse().adjoint()), and thus explicitly compute
lu.inverse().adjoint() into a matrix....

x = 2 * (lu.inverse().adjoint() * b);

is better but still yields a temporary.

x = lu.inverse().adjoint() * (2*b);

is the preferred syntax.

One solution is to allow only three operations on the Inverse proxy:
operator * with a MatrixBase, assignment to a MatrixBase, and .eval().
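
A very rough sketch of such a restricted proxy, just to fix ideas (the names
are hypothetical, and a real implementation would of course plug into Eigen's
expression machinery instead):

#include <Eigen/Dense>

template <typename Decomposition>
class InverseProxy
{
  public:
    typedef typename Decomposition::MatrixType MatrixType;

    explicit InverseProxy(const Decomposition& dec) : m_dec(dec) {}

    // inverse * matrix/vector: triggers a solve, never forms the inverse
    template <typename Rhs>
    typename Rhs::PlainObject operator*(const Eigen::MatrixBase<Rhs>& b) const
    { return m_dec.solve(b); }

    // explicit request for the dense inverse
    MatrixType eval() const
    { return m_dec.solve(MatrixType::Identity(m_dec.rows(), m_dec.cols())); }

    // assignment to a plain matrix would be the third allowed operation,
    // wired up through the same solve path

  private:
    const Decomposition& m_dec;
};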

>
> * I don't know how .inverse() is implemented now, but it could return an expression (a proxy?) which,
>  when it appears in a '*' expression, could be expanded into an LU solve as opposed to multiplication
>  by a computed inverse. But the Matrix class would need to have one more definition of the = operator
>  to check for the inverse expression and to evaluate it from the expression template.

Yes, this is more or less the plan.

>
>
> [IV]  Since we are applying A^{-1} or A^{-H/T}, IMHO
>
>    b * A.adjoint().lu().inverse()
>
> seems better to me than     b * A.lu().adjoint().inverse() (Gael)  or  b * A.lu().inverse().adjoint()  (Benoit).
> This is because the first substring, A or A.adjoint(), represents the operation immediately, and then .lu().inverse()
> would represent the details of the method being used to compute that inverse expression.

See above for why this is not enough, but of course your expression would
still be valid. For instance, you can already do:

x = A.adjoint().lu().solve(b);

Actually, you can already solve on the right (x = b A^-1) by doing:

x.transpose() = A.transpose().lu().solve(b.transpose());

but that's not very convenient, and does not cover all use cases
(e.g., when you need to solve both on the right and on the left).
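
In code, covering both sides today boils down to two factorizations plus a
transposed assignment (sketch):

#include <Eigen/Dense>
using namespace Eigen;

void solve_both_sides(const MatrixXd& A, const MatrixXd& B)
{
  PartialPivLU<MatrixXd> luA (A);              // for  X1 = A^-1 B
  PartialPivLU<MatrixXd> luAT(A.transpose());  // for  X2 = B A^-1
  MatrixXd X1 = luA.solve(B);
  MatrixXd X2(B.rows(), B.cols());
  X2.transpose() = luAT.solve(B.transpose());  // (B A^-1)^T = A^-T B^T
}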


cheers,

gael


