[eigen] solve API



I was thinking that we have not discussed the solve API for a while,
so it would be the perfect time to discuss it again ;)

Seriously, currently our solvers only allow you to compute x = A^-1 b
via some decomposition of A. However, it is sometimes needed to also
solve for x = A^-T b or x = b A^-1.

A naive way to add support for that would be to add template options
to the solve functions, as we currently do for triangular solving,
where an "OnTheRight" template option makes the solver compute

b = b * lower(M)^-1;

while for "b = lower(M)^-* * b" you pass no side option at all
(the "OnTheLeft" is implicit).

To extend this API to the solver we could naively add a "Transpose" or
"Adjoint" option:

x = m.lu().solve<OnTheRight | Adjoint>(b);

which is ... well, ... pure Lapack style ;)

This API is really confusing and not clear at all. Note that
"OnTheRight/Left" is also used in function names such as:

b.applyOnTheRight(a);

Here it is pretty clear that it is "a" which is on the right, and so
we are doing:

b = b * a;


In contrast,

x = m.lu().solve<OnTheRight>(b);

is ambiguous: it is not clear whether it is b which is on the right,
or the inverse of m.

Recall that one goal is to be able to write code as you write math. So
here is the proposal:

x = m.lu().inverse() * b;
x = b * m.lu().inverse();
x = m.lu().adjoint().inverse() * b;
x = b * m.lu().adjoint().inverse();

.inverse() and .adjoint() would return simple proxies with operator *
overloaded to return the "solve" proxy that we already have (so
implementation-wise that's not much work).
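To make the proxy chain concrete, here is a standalone toy sketch (made-up names and a hard-coded 2x2 Cramer solve standing in for a real decomposition, not actual Eigen code): a decomposition whose inverse() returns a lightweight proxy, with operator * overloads performing the actual solve on either side:

```cpp
// Toy sketch of the proposed proxy chain; only the API shape matters.
struct Vec2 { double x, y; };

struct Decomposition2x2;

// Returned by inverse(); does no work itself. The operator* overloads
// below trigger the actual solve, so user code reads like math:
//   x = dec.inverse() * b;   // x = A^-1 b
//   x = b * dec.inverse();   // x = b A^-1
struct InverseProxy { const Decomposition2x2& dec; };

struct Decomposition2x2 {
    double a, b, c, d;                          // [[a, b], [c, d]]
    InverseProxy inverse() const { return {*this}; }
};

// x = A^-1 * b  (the "OnTheLeft" case), via Cramer's rule
Vec2 operator*(const InverseProxy& inv, const Vec2& rhs) {
    const Decomposition2x2& m = inv.dec;
    const double det = m.a * m.d - m.b * m.c;
    return { (rhs.x * m.d - rhs.y * m.b) / det,
             (rhs.y * m.a - rhs.x * m.c) / det };
}

// x = b * A^-1  (the "OnTheRight" case)
Vec2 operator*(const Vec2& lhs, const InverseProxy& inv) {
    const Decomposition2x2& m = inv.dec;
    const double det = m.a * m.d - m.b * m.c;
    return { (lhs.x * m.d - lhs.y * m.c) / det,
             (lhs.y * m.a - lhs.x * m.b) / det };
}
```

An .adjoint() proxy would slot in the same way: another lightweight wrapper whose operator * overloads apply the conjugate-transposed system instead.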

I know that computer scientists accustomed to other linear algebra
packages might be shocked by such an API and will look forever for a
function having "solve" in its name, but this is pure Eigen style IMO,
and it is consistent with the rest of the API (diagonal, permutation
and transposition matrices already use this style).

There remains the problem of convenient "in-place" solving. Currently, when doing:

v = m.lu().solve(v);

we can detect that "x" and "b" are the same object via one runtime
pointer comparison, and then skip the copy of b into x before taking
the in-place solving path. The same mechanism would still work here:

b = b * m.lu().adjoint().inverse();

=> inplace solving, no temporary, no matrix copy (except for the
decomposition itself but that's another story).
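The pointer-comparison trick can be sketched with a toy forward-substitution solver (illustrative names, standard C++ only, nothing Eigen-specific):

```cpp
#include <vector>

// Toy lower-triangular solver illustrating the aliasing check: one
// pointer comparison decides whether the initial copy "x = b" can be
// skipped before solving in place.
struct LowerTriangular {
    std::vector<std::vector<double>> L;   // dense n x n, lower part used

    // Solve L * dst = b by forward substitution.
    void solveInto(std::vector<double>& dst, const std::vector<double>& b) const {
        if (&dst != &b)   // the runtime aliasing test: skip the copy if x is b
            dst = b;
        for (std::size_t i = 0; i < dst.size(); ++i) {
            for (std::size_t j = 0; j < i; ++j)
                dst[i] -= L[i][j] * dst[j];
            dst[i] /= L[i][i];
        }
    }
};
```

Calling `t.solveInto(v, v)` takes the copy-free path; with two distinct vectors the right-hand side is first copied into the destination and then overwritten by the solution.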

The problem is when "b" is a complex expression:

b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols))
  = m.lu().inverse() * b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols));

You don't want to write it twice!!

Here a solveInPlace() function could make sense:

m.lu().solveInPlace(b.block(i+s, j+block_size-1, rows-r, std::min(blocksize,cols)));

Better? Not so sure:
1) what about the "OnTheRight" and transposed variants?
2) and even worse, this does not work anyway, because block(...)
returns a temporary object which has to be passed by const reference,
so you have to name it:

Block<...> b_block(b, i+s, j+block_size-1, rows-r, std::min(blocksize,cols));

and then repeating b_block twice is fine.

As some of you probably guessed, I did not sleep well last night....


