Re: [eigen] Skyline matrix

On Mon, Nov 9, 2009 at 4:30 PM, guillaume saupin <guillaume.saupin@xxxxxx> wrote:

Hi !

Here comes the cleaned code for the skyline matrix. If you could integrate it inside Eigen, it would be perfect.

thanks,

guillaume

guillaume saupin wrote:

Gael Guennebaud wrote:

On Thu, Nov 5, 2009 at 2:40 PM, guillaume saupin <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>> wrote:

Gael Guennebaud wrote:

Hi,

thanks for the patch !

I'd be ok to have it in Eigen. In a first step I would
probably put it in the unsupported/ set of modules because it
sems quite preliminary for now.

No problem. This is perfect if we can access this Matrix from
eigen. I've written some unit tests with boost. Where should I put
them ?

However before we can apply the patch some cleaning is
required: at the very least the copyrights have to be updated.
It would also be nice to remove all the commented stuff.

Then we can discuss about the usefulness of having a
SkylineMatrixBase. Indeed, such a base class can be needed for
two reasons:

The truth is that I have not really thought about that. I just
copy/paste the SparseMatrix code ;)

1 - To implement the _expression_ template paradigm for basics
operations (+, -, scaling, coefficients wise operations like
abs, sin, cos etc.). However, here I'm not sure that makes a
lot of sense. It seems to me that a such a specialized matrix
is only usefull for solving problems, so supporting such
operators is not really needed, right ?

Our main use for SkylineMatrix is for solving systems.
Nevertheless we will soon use at least a Skyline * Dense product,
and probably a Dense * Skyline product. So template expressions
may be required to do these products without temporaries and with
a simple syntax. But it is maybe possible to do that without a
Base class ?

Yes and no, because actually matrix products are special in the sense that they must be directly evaluated. On the other hand ET might still be useful to allow for more complex products involving transpose, negate and scaling ops.

I will need this kind of complex products, especially those involving transpose, in my ColumnMatrix. Do you think that it must inherit MatrixBase to benefit from ET ?

The problem is that our current design does not allow to reuse our Transpose/CwiseUnaryOp/etc. _expression_ classes for something else than dense matrices. So I'm more and more convinced that our ET mechanism should be updated to be more flexible. But that's another topic !

2 - Another reason would be to have various SkylineMatrix
storage variants which would be sufficiently different to
require different classes, but close enough to share the same
(solver) algorithms. Again, I'm not sure that's the case here.

Right. I don't think too that we'll introduce another storage.

Nevertheless, it's still fine to keep it for now, and wait a
bit for more use cases to see what's the best strategy.

Thanks anyway for your advises. I'll clean the code and send it
back to you.

ok, cool.

gael.

gael.

On Wed, Nov 4, 2009 at 3:24 PM, guillaume saupin
<guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>> wrote:

You'll find enclosed the code as a patch.

guillaume saupin wrote:

Hello !

I've finally (after some performances tests) implemented a
basic skyline matrix using the following storage :

- an array stores the diagonal elements
- an array stores the upper elements in a column
oriented way
- an array stores the lower elements in a row oriented way

I've implemented a LU decomposition for this format. I
use the
following algorithm (with minor modification to handle the
sparseness of the matrix):

for each row
-- compute the lower part
for each col < row
LU(row, col) = (LU(row, col) - sum(k = 0 ; k < col)
[LU(row, k) * LU(k, col)] ) / LU(row, row) -- elements are
access contiguously / Can be easily vectorize
-- compute the upper part
for each rrow < row
LU(rrow, row) -= sum(k = 0 ; k < rrow) [LU(rrow, k)
* LU(k,
row)] -- elements are access contiguously / Can be easily
vectorize
-- compute the diag part
LU(row, row) -= sum(k = 0 ; k < row) [LU(rrow, k) * LU(k,
row)] -- elements are access contiguously / Can be easily
vectorize

This algorithm is well suited for our storage format.
After a
few tests, it seems even more efficient to store the
the upper
and lower elements in the same array, with the elements
from
one row followed by the elements of the cols of same index,
and so on ...

I've also implemented a product for the Skyline * Dense
operation.

Do you think that it would be possible to insert this code
into eigen ?

guillaume saupin wrote:

Gael Guennebaud a écrit :

On Tue, Oct 27, 2009 at 10:40 AM, guillaume saupin
<guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>>> wrote:

Gael Guennebaud a écrit :

this is not an easy question because we are
still unsure how
to manage efficiently and with minimal
code all
these kind of
"special" matrices.

Of course an easy solution would be to
make it
inherit
SparseMatrixBase and let the generic
algorithms
based on
sparse iterators do the job. However, this
approach won't be
optimal because the iterators won't take
advantage of the
specificity of a SkylineMatrix which, e.g.,
should allow
vectorization.

Actually skyline matrices are very
similar to
banded matrices,
and in particular it seems to me that
they are
similar enough
to share the same algorithms. So for
instance
these two kind
of matrices could inherit the same
matrix base
and the
algorithms could be implemented in a generic
way via the
concept of "range inner vector" which is a
column (or a row)
vector with an associated start index... We
also need
efficient basic operators dealing with such
vector. E.g. the
addition of two such vector can be
efficiently
done by first
copying the non overlapping part (start and
end), and then sum
the overlapping part using Eigen's core
module
feature.

The way we implemented skyline matrix is not
really
similar to
what is usually done for banded matrix, as
we store
the diagonal,
the upper, and the lower elements in separate
arrays. This way, we
can achieve quite efficient LU decomposition
(and
solving) due to
coherent memory accesses.

ok I see, then it is not really what people call
skyline storage
(http://www.netlib.org/linalg/html_templates/node96.html),
but it is more like an extension of the tridiagonal
storage.

I think that our storage is actually quite similar
to the
one proposed there for non symetric skyline
(referred as
NSK format by Saad) matrix. As in these format, we
store
the upper element in a column-oriented way, and
the lower
element in a row-oriented way.
The only difference is that we store the diagonal
elements
in a separate array instead of keeping an index
array to
these elements. This allows contiguous access to
diagonal
elements.

However, I don't see how your storage can yield
to a
more efficient implementation of the LU
decomposition
than the standard skyline storage. For instance
here
is a basic algorithm without pivoting for the
standard
skyline storage:

for(int k = 0; k+1 < rows; ++k)
{
int rrows = rows-k-1;
int rsize = size-k-1;

// line 1
lu.col(k).end(rrows) /= lu.coeff(k,k);

// line 2
lu.corner(BottomRight,rrows,rsize).noalias() -=
lu.col(k).end(rrows) * lu.row(k).end(rsize);
}

where lu is the working matrix, lu.col(k) is
assumed
to return a "range vector" as I described in my
previous email.

Here line 1 would be trivially optimized (i.e.,
vectorized) since lu.col(k).end(rrows) is just
a small
dense vector.

Line 2 is an outer product which again is
trivially/automatically vectorized sicne it is
impelmented as a sequence of:
"col_range_vector_i -=
scalar * col_range_vector".

Here the locality is pretty good because the vector
"lu.col(k).end(rrows)" which is reused multiple
times
is sequentially stored in memory.

But perhaps there exists a special algorithm which
perfectly fit your storage ? Is it possible to see
your code somewhere ? Finally, if it is really more
efficient then that would make sense to have it
in Eigen.

The benefit of using the NSK, i.e. of storing the lower
matrix in a column-oriented way, and the upper
matrix in a
row-oriented way is that in line 1 we access contigous
elements (the column k of the lower matrix), and in
line 2
we also access contiguous element (the row elements
of the
upper matrix). I don't know if it's efficient for
vectorization, but without, it is quite efficient,
as the
locality is very good.

gael.

So I don't think that our algorithms share
common
features with
band matrix algorithms. They seem closer to what
can be done with
diagonal matrices.

Anyway, I've used the SparseMatrix code as a
base
for my
SkylineMatrix code. Currently it only
supports very
basic
operations. I'll implement the LU decomposition
soon. I will also
try to use vectorization.

guillaume

These are just some initial thoughts and the
discussion is
very open!

gael

On Mon, Oct 19, 2009 at 12:55 PM,
guillaume saupin
<guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>
<mailto:guillaume.saupin@xxxxxx
<mailto:guillaume.saupin@xxxxxx>>>>> wrote:

Hello,

We are planning to use your library
in our
projects, but we
need a
skyline matrix. Therefore I'd like to
implement one for
eigen, but
I don't now where to start.

Is there a specific class that can be
a good
starting point /
skeleton to start with ? The
SparseMatrixBase might be a
good choice.
Should this SkylineMatrix inherit from
SparseMatrixBase, or
be a
separate class ?

Thanks,

guillaume