Re: [eigen] Skyline matrix

[ Thread Index | Date Index | More Archives ]

Gael Guennebaud wrote:

Ok, I've put it in the unsupported/ folder. To use it you just have add path/to/eigen2/unsupported to your set of include paths.

Since I guess you will certainely want to do some modifications in the future, I recommend you to create an account on bitbucket, and I will give you write access. Feel free to commit any changes you like in this module, but if your changes affect some other modules, please let's first dicuss them here !

I've just created my account on bitbucket. My id is simply guillaume.saupin.

Now I have to admit there are still a lot of work to do before having something really usable to offer to other users. In my opinion, the SkylineMatrix class should be taillored to be optimal for the two following tasks:
1 - setup/assemble the matrix
2 - build efficient high-level algorithms

Both tasks require a powerful API taking into account the specificities of the skyline storage. In particular this discard low level coeff(i,j) methods or things like that. Iterators are not very well suited here because 1) you need various kind of iterators (upper, lower), and 2) they cannot take advantage of the sequential storage to enable unrolling and vectorization.

So, for the matrix assembly, the challenge is to reduce the need for memory reallocation and memory copies. Here we first need a very low level API allowing to fill the matrix with constant time insertions.
One possibility is to let the user specify the profile for each rows and cols, and allocate the memory accordingly. Then the insertion can be done in constant time.

Then to offer more flexibility to the user, perhaps the simpler is to let the user fill a {Dynamic}SparseMatrix, and then implement a fast conversion method. The latter is a must to have feature anyway.

If we provide an easy and fast API to fill the matrix, I'm not sure that it will be very useful to create a SkylineMatrix from a SparseMatrix.

In order to easily implement efficient algorithms (products and solvers), the SkylineMatrix should provide higher level access functions than the coeff() function or the iterators. In particular I'm thinking about the following set of functions:

diagonal(); // return a vector expression of the diagonal
lowerCol(j); // returns an expression of the j-th column of the strictly lower triangular part
centeredBlock(i, rows, cols); // returns an expression of the rows x cols submatrix at the coeff (i, i)

The first two functions would return a special expression wrapping a dense VectorBlock plus a start index and a length with optimized arithmetic operators. Such an expression, let's call it a RangeVector could also be useful for the Band storage.

This is a good idea. What can I use as a starting point to implement those VectorBlocks ? Should it inherit from an already existing class ?

Then writting a triangular solver, a LU decomposition, a Cholesky decomposition, etc. would be almost trivial. Actually, if we add the concept of matrix partitionner as I described in another thread, one could write the exact same code for both a dense or a skyline matrix !

I'm also thinking whether it could make sense to make the "bandwidth" of the matrix a template parameter. The default would be "Dynamic". This would allow to optimize the code for matrices having a small fixed size bandwidth.

This could be a good idea. What kind of optimization can be done on such SkylineMatrices ?



On Mon, Nov 9, 2009 at 4:30 PM, guillaume saupin <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>> wrote:

    Hi !

    Here comes the cleaned code for the skyline matrix. If you could
    integrate it inside Eigen, it would be perfect.



    guillaume saupin wrote:

        Gael Guennebaud wrote:

            On Thu, Nov 5, 2009 at 2:40 PM, guillaume saupin
            <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>
            <mailto:guillaume.saupin@xxxxxx>>> wrote:

               Gael Guennebaud wrote:


                   thanks for the patch !

                   I'd be ok to have it in Eigen. In a first step I would
                   probably put it in the unsupported/ set of modules
            because it
                   sems quite preliminary for now.

               No problem. This is perfect if we can access this
            Matrix from
               eigen. I've written some unit tests with boost. Where
            should I put
               them ?

                   However before we can apply the patch some cleaning is
                   required: at the very least the copyrights have to
            be updated.
                   It would also be nice to remove all the commented

                   Then we can discuss about the usefulness of having a
                   SkylineMatrixBase. Indeed, such a base class can be
            needed for
                   two reasons:

               The truth is that I have not really thought about that.
            I just
               copy/paste the SparseMatrix code ;)

                   1 - To implement the expression template paradigm
            for basics
                   operations (+, -, scaling, coefficients wise
            operations like
                   abs, sin, cos etc.). However, here I'm not sure
            that makes a
                   lot of sense. It seems to me that a such a
            specialized matrix
                   is only usefull for solving problems, so supporting
                   operators is not really needed, right ?

               Our main use for SkylineMatrix is for solving systems.
               Nevertheless we will soon use at least a Skyline *
            Dense product,
               and probably a Dense * Skyline product. So template
               may be required to do these products without
            temporaries and with
               a simple syntax. But it is maybe possible to do that
            without a
               Base class ?

            Yes and no, because actually matrix products are special
            in the sense that they must be directly evaluated. On the
            other hand ET might still be useful to allow for more
            complex products involving transpose, negate and scaling ops.

        I will need this kind of complex products, especially those
        involving transpose, in my ColumnMatrix. Do you think that it
        must inherit MatrixBase to benefit from ET ?

            The problem is that our current design does not allow to
            reuse our Transpose/CwiseUnaryOp/etc. expression classes
            for something else than dense matrices. So I'm more and
            more convinced that our ET mechanism should be updated to
            be more flexible. But that's another topic !

                   2 - Another reason would be to have various
                   storage variants which would be sufficiently
            different to
                   require different classes, but close enough to
            share the same
                   (solver) algorithms. Again, I'm not sure that's the
            case here.

               Right. I don't think too that we'll introduce another

                   Nevertheless, it's still fine to keep it for now,
            and wait a
                   bit for more use cases to see what's the best strategy.

               Thanks anyway for your advises. I'll clean the code and
            send it
               back to you.

            ok, cool.



                   On Wed, Nov 4, 2009 at 3:24 PM, guillaume saupin
            <mailto:guillaume.saupin@xxxxxx>>>> wrote:

                      You'll find enclosed the code as a patch.

                      guillaume saupin wrote:

                          Hello !

                          I've finally (after some performances tests)
            implemented a
                          basic skyline matrix using the following
            storage :

                          - an array stores the diagonal elements
                          - an array stores the upper elements in a column
                   oriented way
                          - an array stores the lower elements in a
            row oriented way

                          I've implemented a LU decomposition for this
            format. I
                   use the
                          following algorithm (with minor modification
            to handle the
                          sparseness of the matrix):

                          for each row
                          -- compute the lower part
                            for each col < row
                               LU(row, col) =  (LU(row, col)  - sum(k
            = 0 ; k < col)
                          [LU(row, k)  * LU(k, col)] ) / LU(row, row)
            -- elements are
                          access contiguously / Can be easily vectorize
                          -- compute the upper part
                          for each rrow < row
                            LU(rrow, row) -= sum(k = 0 ; k < rrow)
            [LU(rrow, k)
                    * LU(k,
                          row)] -- elements are access contiguously /
            Can be easily
                          -- compute the diag part
                          LU(row, row) -= sum(k = 0 ; k < row)
            [LU(rrow, k)  * LU(k,
                          row)] -- elements are access contiguously /
            Can be easily

                          This algorithm is well suited for our
            storage format.
                   After a
                          few tests, it seems even more efficient to
            store the
                   the upper
                          and lower elements in the same array, with
            the elements
                          one row followed by the elements of the cols
            of same index,
                          and so on ...

                          I've also implemented a product for the
            Skyline *  Dense

                          Do you think that it would be possible to
            insert this code
                          into eigen ?

                          guillaume saupin wrote:

                              Gael Guennebaud a écrit :

                                  On Tue, Oct 27, 2009 at 10:40 AM,
            guillaume saupin
            <mailto:guillaume.saupin@xxxxxx>>>>> wrote:

                                     Gael Guennebaud a écrit :

                                         this is not an easy question
            because we are
                                  still unsure how
                                         to manage efficiently and
            with minimal
                   code all
                                  these kind of
                                         "special" matrices.

                                         Of course an easy solution
            would be to
                   make it
                                         SparseMatrixBase and let the
                                  based on
                                         sparse iterators do the job.
            However, this
                                  approach won't be
                                         optimal because the iterators
            won't take
                                  advantage of the
                                         specificity of a
            SkylineMatrix which, e.g.,
                                  should allow

                                         Actually skyline matrices are
                   similar to
                                  banded matrices,
                                         and in particular it seems to
            me that
                   they are
                                  similar enough
                                         to share the same algorithms.
            So for
                                  these two kind
                                         of matrices could inherit the
                   matrix base
                                  and the
                                         algorithms could be
            implemented in a generic
                                  way via the
                                         concept of "range inner
            vector" which is a
                                  column (or a row)
                                         vector with an associated
            start index... We
                                  also need
                                         efficient basic operators
            dealing with such
                                  vector. E.g. the
                                         addition of two such vector
            can be
                                  done by first
                                         copying the non overlapping
            part (start and
                                  end), and then sum
                                         the overlapping part using
            Eigen's core

                                     The way we implemented skyline
            matrix is not
                                  similar to
                                     what is usually done for banded
            matrix, as
                   we store
                                  the diagonal,
                                     the upper, and the lower elements
            in separate
                                  arrays. This way, we
                                     can achieve quite efficient LU
                                  solving) due to
                                     coherent  memory accesses.

                                  ok I see, then it is not really what
            people call
                                  skyline storage
                                  but it is more like an extension of
            the tridiagonal

                              I think that our storage is actually
            quite similar
                   to the
                              one proposed there for non symetric skyline
                   (referred as
                              NSK format by Saad) matrix. As in these
            format, we
                              the upper element in a column-oriented
             way, and
                   the lower
                              element in a row-oriented way.
                              The only difference is that we store the
                              in a separate array instead of keeping
            an index
                   array to
                              these elements. This allows contiguous
            access to

                                  However, I don't see how your
            storage can yield
                   to a
                                  more efficient implementation of the LU
                                  than the standard skyline storage.
            For instance
                                  is a basic algorithm without
            pivoting for the
                                  skyline storage:

                                  for(int k = 0; k+1 < rows; ++k)
                                   int rrows = rows-k-1;
                                   int rsize = size-k-1;

                                   // line 1
                                   lu.col(k).end(rrows) /= lu.coeff(k,k);

                                   // line 2
lu.corner(BottomRight,rrows,rsize).noalias() -=
                                  lu.col(k).end(rrows) *

                                  where lu is the working matrix,
            lu.col(k) is
                                  to return a "range vector" as I
            described in my
                                  previous email.

                                  Here line 1 would be trivially
            optimized (i.e.,
                                  vectorized) since
            lu.col(k).end(rrows) is just
                   a small
                                  dense vector.

                                  Line 2 is an outer product which
            again is
                                  trivially/automatically vectorized
            sicne it is
                                  impelmented as a sequence of:
                   "col_range_vector_i -=
                                  scalar * col_range_vector".

                                  Here the locality is pretty good
            because the vector
                                  "lu.col(k).end(rrows)" which is
            reused multiple
                                  is sequentially stored in memory.

                                  But perhaps there exists a special
            algorithm which
                                  perfectly fit your storage ? Is it
            possible to see
                                  your code somewhere ? Finally, if it
            is really more
                                  efficient then that would make sense
            to have it
                   in Eigen.

                              The benefit of using the NSK, i.e. of
            storing the lower
                              matrix in a column-oriented way, and the
                   matrix in a
                              row-oriented way is that in line 1 we
            access contigous
                              elements (the column k of the lower
            matrix), and in
                   line 2
                              we also access contiguous element (the
            row elements
                   of the
                              upper matrix). I don't know if it's
            efficient for
                              vectorization, but without, it is quite
                   as the
                              locality is very good.


                                     So I don't think that our
            algorithms share
                                  features with
                                     band matrix algorithms. They seem
            closer to what
                                  can be done with
                                     diagonal matrices.

                                     Anyway, I've used the
            SparseMatrix code as a
                                  for my
                                     SkylineMatrix code. Currently it only
                   supports very
                                     operations. I'll implement the LU
                                  soon. I will also
                                     try to use vectorization.


                                         These are just some initial
            thoughts and the
                                  discussion is
                                         very open!


                                         On Mon, Oct 19, 2009 at 12:55 PM,
                   guillaume saupin
            <mailto:guillaume.saupin@xxxxxx>>>>>> wrote:


                                            We are planning to use
            your library
                   in our
                                  projects, but we
                                         need a
                                            skyline matrix. Therefore
            I'd like to
                                  implement one for
                                         eigen, but
                                            I don't now where to start.

                                            Is there a specific class
            that can be
                   a good
                                  starting point /
                                            skeleton to start with ? The
                                  SparseMatrixBase might be a
                                         good choice.
                                            Should this SkylineMatrix
            inherit from
                                  SparseMatrixBase, or
                                         be a
                                            separate class ?



Mail converted by MHonArc 2.6.19+