Re: [eigen] Skyline matrix

[ Thread Index | Date Index | More Archives ]

Ok, I've put it in the unsupported/ folder. To use it you just have add path/to/eigen2/unsupported to your set of include paths.

Since I guess you will certainely want to do some modifications in the future, I recommend you to create an account on bitbucket, and I will give you write access. Feel free to commit any changes you like in this module, but if your changes affect some other modules, please let's first dicuss them here !

Now I have to admit there are still a lot of work to do before having something really usable to offer to other users. In my opinion, the SkylineMatrix class should be taillored to be optimal for the two following tasks:
1 - setup/assemble the matrix
2 - build efficient high-level algorithms

Both tasks require a powerful API taking into account the specificities of the skyline storage. In particular this discard low level coeff(i,j) methods or things like that. Iterators are not very well suited here because 1) you need various kind of iterators (upper, lower), and 2) they cannot take advantage of the sequential storage to enable unrolling and vectorization.

So, for the matrix assembly, the challenge is to reduce the need for memory reallocation and memory copies. Here we first need a very low level API allowing to fill the matrix with constant time insertions. Then to offer more flexibility to the user, perhaps the simpler is to let the user fill a {Dynamic}SparseMatrix, and then implement a fast conversion method. The latter is a must to have feature anyway.

In order to easily implement efficient algorithms (products and solvers), the SkylineMatrix should provide higher level access functions than the coeff() function or the iterators. In particular I'm thinking about the following set of functions:

diagonal(); // return a vector _expression_ of the diagonal
lowerCol(j);    // returns an _expression_ of the j-th column of the strictly lower triangular part
centeredBlock(i, rows, cols); // returns an _expression_ of the rows x cols submatrix at the coeff (i, i)

The first two functions would return a special _expression_ wrapping a dense VectorBlock plus a start index and a length with optimized arithmetic operators. Such an _expression_, let's call it a RangeVector could also be useful for the Band storage.

Then writting a  triangular solver, a LU decomposition, a Cholesky decomposition, etc. would be almost trivial. Actually, if we add the concept of matrix partitionner as I described in another thread, one could write the exact same code for both a dense or a skyline matrix !

I'm also thinking whether it could make sense to make the "bandwidth" of the matrix a template parameter. The default would be "Dynamic". This would allow to optimize the code for matrices having a small fixed size bandwidth.


On Mon, Nov 9, 2009 at 4:30 PM, guillaume saupin <guillaume.saupin@xxxxxx> wrote:
Hi !

Here comes the cleaned code for the skyline matrix. If you could integrate it inside Eigen, it would be perfect.



guillaume saupin wrote:
Gael Guennebaud wrote:

On Thu, Nov 5, 2009 at 2:40 PM, guillaume saupin <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>> wrote:

   Gael Guennebaud wrote:


       thanks for the patch !

       I'd be ok to have it in Eigen. In a first step I would
       probably put it in the unsupported/ set of modules because it
       sems quite preliminary for now.

   No problem. This is perfect if we can access this Matrix from
   eigen. I've written some unit tests with boost. Where should I put
   them ?

       However before we can apply the patch some cleaning is
       required: at the very least the copyrights have to be updated.
       It would also be nice to remove all the commented stuff.

       Then we can discuss about the usefulness of having a
       SkylineMatrixBase. Indeed, such a base class can be needed for
       two reasons:

   The truth is that I have not really thought about that. I just
   copy/paste the SparseMatrix code ;)

       1 - To implement the _expression_ template paradigm for basics
       operations (+, -, scaling, coefficients wise operations like
       abs, sin, cos etc.). However, here I'm not sure that makes a
       lot of sense. It seems to me that a such a specialized matrix
       is only usefull for solving problems, so supporting such
       operators is not really needed, right ?

   Our main use for SkylineMatrix is for solving systems.
   Nevertheless we will soon use at least a Skyline * Dense product,
   and probably a Dense * Skyline product. So template expressions
   may be required to do these products without temporaries and with
   a simple syntax. But it is maybe possible to do that without a
   Base class ?

Yes and no, because actually matrix products are special in the sense that they must be directly evaluated. On the other hand ET might still be useful to allow for more complex products involving transpose, negate and scaling ops.

I will need this kind of complex products, especially those involving transpose, in my ColumnMatrix. Do you think that it must inherit MatrixBase to benefit from ET ?
The problem is that our current design does not allow to reuse our Transpose/CwiseUnaryOp/etc. _expression_ classes for something else than dense matrices. So I'm more and more convinced that our ET mechanism should be updated to be more flexible. But that's another topic !


       2 - Another reason would be to have various SkylineMatrix
       storage variants which would be sufficiently different to
       require different classes, but close enough to share the same
       (solver) algorithms. Again, I'm not sure that's the case here.

   Right. I don't think too that we'll introduce another storage.

       Nevertheless, it's still fine to keep it for now, and wait a
       bit for more use cases to see what's the best strategy.

   Thanks anyway for your advises. I'll clean the code and send it
   back to you.

ok, cool.




       On Wed, Nov 4, 2009 at 3:24 PM, guillaume saupin
       <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>
       <mailto:guillaume.saupin@xxxxxx>>> wrote:

          You'll find enclosed the code as a patch.

          guillaume saupin wrote:

              Hello !

              I've finally (after some performances tests) implemented a
              basic skyline matrix using the following storage :

              - an array stores the diagonal elements
              - an array stores the upper elements in a column
       oriented way
              - an array stores the lower elements in a row oriented way

              I've implemented a LU decomposition for this format. I
       use the
              following algorithm (with minor modification to handle the
              sparseness of the matrix):

              for each row
              -- compute the lower part
                for each col < row
                   LU(row, col) =  (LU(row, col)  - sum(k = 0 ; k < col)
              [LU(row, k)  * LU(k, col)] ) / LU(row, row) -- elements are
              access contiguously / Can be easily vectorize
              -- compute the upper part
              for each rrow < row
                LU(rrow, row) -= sum(k = 0 ; k < rrow) [LU(rrow, k)
        * LU(k,
              row)] -- elements are access contiguously / Can be easily
              -- compute the diag part
              LU(row, row) -= sum(k = 0 ; k < row) [LU(rrow, k)  * LU(k,
              row)] -- elements are access contiguously / Can be easily

              This algorithm is well suited for our storage format.
       After a
              few tests, it seems even more efficient to store the
       the upper
              and lower elements in the same array, with the elements
              one row followed by the elements of the cols of same index,
              and so on ...

              I've also implemented a product for the Skyline *  Dense

              Do you think that it would be possible to insert this code
              into eigen ?

              guillaume saupin wrote:

                  Gael Guennebaud a écrit :

                      On Tue, Oct 27, 2009 at 10:40 AM, guillaume saupin
       <mailto:guillaume.saupin@xxxxxx>>>> wrote:

                         Gael Guennebaud a écrit :

                             this is not an easy question because we are
                      still unsure how
                             to manage efficiently and with minimal
       code all
                      these kind of
                             "special" matrices.

                             Of course an easy solution would be to
       make it
                             SparseMatrixBase and let the generic
                      based on
                             sparse iterators do the job. However, this
                      approach won't be
                             optimal because the iterators won't take
                      advantage of the
                             specificity of a SkylineMatrix which, e.g.,
                      should allow

                             Actually skyline matrices are very
       similar to
                      banded matrices,
                             and in particular it seems to me that
       they are
                      similar enough
                             to share the same algorithms. So for
                      these two kind
                             of matrices could inherit the same
       matrix base
                      and the
                             algorithms could be implemented in a generic
                      way via the
                             concept of "range inner vector" which is a
                      column (or a row)
                             vector with an associated start index... We
                      also need
                             efficient basic operators dealing with such
                      vector. E.g. the
                             addition of two such vector can be
                      done by first
                             copying the non overlapping part (start and
                      end), and then sum
                             the overlapping part using Eigen's core

                         The way we implemented skyline matrix is not
                      similar to
                         what is usually done for banded matrix, as
       we store
                      the diagonal,
                         the upper, and the lower elements in separate
                      arrays. This way, we
                         can achieve quite efficient LU decomposition
                      solving) due to
                         coherent  memory accesses.

                      ok I see, then it is not really what people call
                      skyline storage
                      but it is more like an extension of the tridiagonal

                  I think that our storage is actually quite similar
       to the
                  one proposed there for non symetric skyline
       (referred as
                  NSK format by Saad) matrix. As in these format, we
                  the upper element in a column-oriented  way, and
       the lower
                  element in a row-oriented way.
                  The only difference is that we store the diagonal
                  in a separate array instead of keeping an index
       array to
                  these elements. This allows contiguous access to

                      However, I don't see how your storage can yield
       to a
                      more efficient implementation of the LU
                      than the standard skyline storage. For instance
                      is a basic algorithm without pivoting for the
                      skyline storage:

                      for(int k = 0; k+1 < rows; ++k)
                       int rrows = rows-k-1;
                       int rsize = size-k-1;

                       // line 1
                       lu.col(k).end(rrows) /= lu.coeff(k,k);

                       // line 2
                       lu.corner(BottomRight,rrows,rsize).noalias() -=
                      lu.col(k).end(rrows) * lu.row(k).end(rsize);

                      where lu is the working matrix, lu.col(k) is
                      to return a "range vector" as I described in my
                      previous email.

                      Here line 1 would be trivially optimized (i.e.,
                      vectorized) since lu.col(k).end(rrows) is just
       a small
                      dense vector.

                      Line 2 is an outer product which again is
                      trivially/automatically vectorized sicne it is
                      impelmented as a sequence of:
       "col_range_vector_i -=
                      scalar * col_range_vector".

                      Here the locality is pretty good because the vector
                      "lu.col(k).end(rrows)" which is reused multiple
                      is sequentially stored in memory.

                      But perhaps there exists a special algorithm which
                      perfectly fit your storage ? Is it possible to see
                      your code somewhere ? Finally, if it is really more
                      efficient then that would make sense to have it
       in Eigen.

                  The benefit of using the NSK, i.e. of storing the lower
                  matrix in a column-oriented way, and the upper
       matrix in a
                  row-oriented way is that in line 1 we access contigous
                  elements (the column k of the lower matrix), and in
       line 2
                  we also access contiguous element (the row elements
       of the
                  upper matrix). I don't know if it's efficient for
                  vectorization, but without, it is quite efficient,
       as the
                  locality is very good.


                         So I don't think that our algorithms share
                      features with
                         band matrix algorithms. They seem closer to what
                      can be done with
                         diagonal matrices.

                         Anyway, I've used the SparseMatrix code as a
                      for my
                         SkylineMatrix code. Currently it only
       supports very
                         operations. I'll implement the LU decomposition
                      soon. I will also
                         try to use vectorization.


                             These are just some initial thoughts and the
                      discussion is
                             very open!


                             On Mon, Oct 19, 2009 at 12:55 PM,
       guillaume saupin
       <mailto:guillaume.saupin@xxxxxx>>>>> wrote:


                                We are planning to use your library
       in our
                      projects, but we
                             need a
                                skyline matrix. Therefore I'd like to
                      implement one for
                             eigen, but
                                I don't now where to start.

                                Is there a specific class that can be
       a good
                      starting point /
                                skeleton to start with ? The
                      SparseMatrixBase might be a
                             good choice.
                                Should this SkylineMatrix inherit from
                      SparseMatrixBase, or
                             be a
                                separate class ?



Mail converted by MHonArc 2.6.19+