Re: [eigen] Skyline matrix

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Skyline matrix
From: guillaume saupin <guillaume.saupin@xxxxxx>
Date: Mon, 16 Nov 2009 11:06:20 +0100

Gael Guennebaud wrote:

On Thu, Nov 12, 2009 at 1:04 PM, guillaume saupin<guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>> wrote:


    Gael Guennebaud wrote:


        Ok, I've put it in the unsupported/ folder. To use it you just
        have add path/to/eigen2/unsupported to your set of include paths.

        Since I guess you will certainely want to do some
        modifications in the future, I recommend you to create an
        account on bitbucket, and I will give you write access. Feel
        free to commit any changes you like in this module, but if
        your changes affect some other modules, please let's first
        dicuss them here !

    I've just created my account on bitbucket. My id is simply
    guillaume.saupin.


        Now I have to admit there are still a lot of work to do before
        having something really usable to offer to other users. In my
        opinion, the SkylineMatrix class should be taillored to be
        optimal for the two following tasks:
        1 - setup/assemble the matrix
        2 - build efficient high-level algorithms

        Both tasks require a powerful API taking into account the
        specificities of the skyline storage. In particular this
        discard low level coeff(i,j) methods or things like that.
        Iterators are not very well suited here because 1) you need
        various kind of iterators (upper, lower), and 2) they cannot
        take advantage of the sequential storage to enable unrolling
        and vectorization.

        So, for the matrix assembly, the challenge is to reduce the
        need for memory reallocation and memory copies. Here we first
        need a very low level API allowing to fill the matrix with
        constant time insertions.

    One possibility is to let the user specify the profile for each
    rows and cols, and allocate the memory accordingly. Then the
    insertion can be done in constant time.


        Then to offer more flexibility to the user, perhaps the
        simpler is to let the user fill a {Dynamic}SparseMatrix, and
        then implement a fast conversion method. The latter is a must
        to have feature anyway.

    If we provide an easy and fast API to fill the matrix, I'm not
    sure that it will be very useful to create a SkylineMatrix from a
    SparseMatrix.


        In order to easily implement efficient algorithms (products
        and solvers), the SkylineMatrix should provide higher level
        access functions than the coeff() function or the iterators.
        In particular I'm thinking about the following set of functions:

        diagonal(); // return a vector expression of the diagonal
        lowerCol(j);    // returns an expression of the j-th column of
        the strictly lower triangular part
        upperRow(i);
        centeredBlock(i, rows, cols); // returns an expression of the
        rows x cols submatrix at the coeff (i, i)

        The first two functions would return a special expression
        wrapping a dense VectorBlock plus a start index and a length
        with optimized arithmetic operators. Such an expression, let's
        call it a RangeVector could also be useful for the Band storage.

    This is a good idea. What can I use as a starting point to
    implement those VectorBlocks ? Should it inherit from an already
    existing class ?

For diagonal(), it is easy. Just use a Matrix<Scalar,Size,1> to storeit (instead of a raw pointer), and let it return a ref to it.

For the lowerCol(), upperRow() we have to create a new matrix type,well actually a vector type inheriting AnyMatrixBase<> with supportfor basics operations. However, I think it is safer to wait a bit thatI've done the change we discussed on a recent thread about somerefactoring of the expression template mechanism. Once this changedone, it should be much simpler to add new special matrices. Hopefully.

Ok. No problem. I'll wait.

A quick question: I did not read your source files carefully, but Ithought that this SkylineMatrix stores the lower triangular usingcolumn-major vectors, and the upper part using row-major vectors.Right ? Then what is the purpose of the StorageOrder option ?

This is indeed useless. In fact, I have initially implemented a supportfor column-major vectors for the lower part, and row-major vectors forthe upper part, as well as row-major vectors for the lower part, andcolumn-major vectors for the upper part, but I realize that it wasuseless since our algos perform better on the first storage type.


    Then writting a  triangular solver, a LU decomposition, a Cholesky
    decomposition, etc. would be almost trivial. Actually, if we add
    the concept of matrix partitionner as I described in another
    thread, one could write the exact same code for both a dense or a
    skyline matrix !

    I'm also thinking whether it could make sense to make the
    "bandwidth" of the matrix a template parameter. The default would
    be "Dynamic". This would allow to optimize the code for matrices
    having a small fixed size bandwidth.

    This could be a good idea. What kind of optimization can be done
    on such SkylineMatrices ?

Mainly loop unrolling of the most inner loops. To make that possible,the "VectorRange" discussed above should be able to deal with compiletime ranges.

Does it require memory alignment ?

gael.



    Thanks,

    guillaume

        gael.




        On Mon, Nov 9, 2009 at 4:30 PM, guillaume saupin
        <guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>
        <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>> wrote:

           Hi !

           Here comes the cleaned code for the skyline matrix. If you
        could
           integrate it inside Eigen, it would be perfect.

           thanks,

           guillaume



           guillaume saupin wrote:

               Gael Guennebaud wrote:



                   On Thu, Nov 5, 2009 at 2:40 PM, guillaume saupin
                   <guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
        <mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>> wrote:

                      Gael Guennebaud wrote:

                          Hi,

                          thanks for the patch !

                          I'd be ok to have it in Eigen. In a first
        step I would
                          probably put it in the unsupported/ set of
        modules
                   because it
                          sems quite preliminary for now.

                      No problem. This is perfect if we can access this
                   Matrix from
                      eigen. I've written some unit tests with boost.
        Where
                   should I put
                      them ?


                          However before we can apply the patch some
        cleaning is
                          required: at the very least the copyrights
        have to
                   be updated.
                          It would also be nice to remove all the
        commented
                   stuff.

                          Then we can discuss about the usefulness of
        having a
                          SkylineMatrixBase. Indeed, such a base class
        can be
                   needed for
                          two reasons:

                      The truth is that I have not really thought
        about that.
                   I just
                      copy/paste the SparseMatrix code ;)


                          1 - To implement the expression template
        paradigm
                   for basics
                          operations (+, -, scaling, coefficients wise
                   operations like
                          abs, sin, cos etc.). However, here I'm not sure
                   that makes a
                          lot of sense. It seems to me that a such a
                   specialized matrix
                          is only usefull for solving problems, so
        supporting
                   such
                          operators is not really needed, right ?

                      Our main use for SkylineMatrix is for solving
        systems.
                      Nevertheless we will soon use at least a Skyline *
                   Dense product,
                      and probably a Dense * Skyline product. So template
                   expressions
                      may be required to do these products without
                   temporaries and with
                      a simple syntax. But it is maybe possible to do that
                   without a
                      Base class ?



                   Yes and no, because actually matrix products are
        special
                   in the sense that they must be directly evaluated.
        On the
                   other hand ET might still be useful to allow for more
                   complex products involving transpose, negate and
        scaling ops.

               I will need this kind of complex products, especially those
               involving transpose, in my ColumnMatrix. Do you think
        that it
               must inherit MatrixBase to benefit from ET ?

                   The problem is that our current design does not
        allow to
                   reuse our Transpose/CwiseUnaryOp/etc. expression
        classes
                   for something else than dense matrices. So I'm more and
                   more convinced that our ET mechanism should be
        updated to
                   be more flexible. But that's another topic !

                          2 - Another reason would be to have various
                   SkylineMatrix
                          storage variants which would be sufficiently
                   different to
                          require different classes, but close enough to
                   share the same
                          (solver) algorithms. Again, I'm not sure
        that's the
                   case here.

                      Right. I don't think too that we'll introduce
        another
                   storage.


                          Nevertheless, it's still fine to keep it for
        now,
                   and wait a
                          bit for more use cases to see what's the
        best strategy.


                      Thanks anyway for your advises. I'll clean the
        code and
                   send it
                      back to you.


                   ok, cool.

                   gael.

gael.




                          On Wed, Nov 4, 2009 at 3:24 PM, guillaume saupin
                          <guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>> wrote:

                             You'll find enclosed the code as a patch.


                             guillaume saupin wrote:

                                 Hello !

                                 I've finally (after some performances
        tests)
                   implemented a
                                 basic skyline matrix using the following
                   storage :

                                 - an array stores the diagonal elements
                                 - an array stores the upper elements
        in a column
                          oriented way
                                 - an array stores the lower elements in a
                   row oriented way

                                 I've implemented a LU decomposition
        for this
                   format. I
                          use the
                                 following algorithm (with minor
        modification
                   to handle the
                                 sparseness of the matrix):

                                 for each row
                                 -- compute the lower part
                                   for each col < row
                                      LU(row, col) =  (LU(row, col)  -
        sum(k
                   = 0 ; k < col)
                                 [LU(row, k)  * LU(k, col)] ) /
        LU(row, row)
                   -- elements are
                                 access contiguously / Can be easily
        vectorize
                                 -- compute the upper part
                                 for each rrow < row
                                   LU(rrow, row) -= sum(k = 0 ; k < rrow)
                   [LU(rrow, k)
                           * LU(k,
                                 row)] -- elements are access
        contiguously /
                   Can be easily
                                 vectorize
                                 -- compute the diag part
                                 LU(row, row) -= sum(k = 0 ; k < row)
                   [LU(rrow, k)  * LU(k,
                                 row)] -- elements are access
        contiguously /
                   Can be easily
                                 vectorize

                                 This algorithm is well suited for our
                   storage format.
                          After a
                                 few tests, it seems even more
        efficient to
                   store the
                          the upper
                                 and lower elements in the same array,
        with
                   the elements
                          from
                                 one row followed by the elements of
        the cols
                   of same index,
                                 and so on ...

                                 I've also implemented a product for the
                   Skyline *  Dense
                                 operation.

                                 Do you think that it would be possible to
                   insert this code
                                 into eigen ?

                                 guillaume saupin wrote:

                                     Gael Guennebaud a écrit :



                                         On Tue, Oct 27, 2009 at 10:40 AM,
                   guillaume saupin
                                         <guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>>> wrote:

                                            Gael Guennebaud a écrit :

                                                this is not an easy
        question
                   because we are
                                         still unsure how
                                                to manage efficiently and
                   with minimal
                          code all
                                         these kind of
                                                "special" matrices.

                                                Of course an easy solution
                   would be to
                          make it
                                         inherit
                                                SparseMatrixBase and
        let the
                   generic
                          algorithms
                                         based on
                                                sparse iterators do
        the job.
                   However, this
                                         approach won't be
                                                optimal because the
        iterators
                   won't take
                                         advantage of the
                                                specificity of a
                   SkylineMatrix which, e.g.,
                                         should allow
                                                vectorization.

                                                Actually skyline
        matrices are
                   very
                          similar to
                                         banded matrices,
                                                and in particular it
        seems to
                   me that
                          they are
                                         similar enough
                                                to share the same
        algorithms.
                   So for
                          instance
                                         these two kind
                                                of matrices could
        inherit the
                   same
                          matrix base
                                         and the
                                                algorithms could be
                   implemented in a generic
                                         way via the
                                                concept of "range inner
                   vector" which is a
                                         column (or a row)
                                                vector with an associated
                   start index... We
                                         also need
                                                efficient basic operators
                   dealing with such
                                         vector. E.g. the
                                                addition of two such
        vector
                   can be
                          efficiently
                                         done by first
                                                copying the non
        overlapping
                   part (start and
                                         end), and then sum
                                                the overlapping part using
                   Eigen's core
                          module
                                         feature.


                                            The way we implemented skyline
                   matrix is not
                          really
                                         similar to
                                            what is usually done for
        banded
                   matrix, as
                          we store
                                         the diagonal,
                                            the upper, and the lower
        elements
                   in separate
                                         arrays. This way, we
                                            can achieve quite efficient LU
                   decomposition
                          (and
                                         solving) due to
                                            coherent  memory accesses.


                                         ok I see, then it is not
        really what
                   people call
                                         skyline storage

(http://www.netlib.org/linalg/html_templates/node96.html),

                                         but it is more like an
        extension of
                   the tridiagonal
                                         storage.

                                     I think that our storage is actually
                   quite similar
                          to the
                                     one proposed there for non
        symetric skyline
                          (referred as
                                     NSK format by Saad) matrix. As in
        these
                   format, we
                          store
                                     the upper element in a
        column-oriented
                    way, and
                          the lower
                                     element in a row-oriented way.
                                     The only difference is that we
        store the
                   diagonal
                          elements
                                     in a separate array instead of
        keeping
                   an index
                          array to
                                     these elements. This allows
        contiguous
                   access to
                          diagonal
                                     elements.


                                         However, I don't see how your
                   storage can yield
                          to a
                                         more efficient implementation
        of the LU
                          decomposition
                                         than the standard skyline
        storage.
                   For instance
                          here
                                         is a basic algorithm without
                   pivoting for the
                          standard
                                         skyline storage:

                                         for(int k = 0; k+1 < rows; ++k)
                                         {
                                          int rrows = rows-k-1;
                                          int rsize = size-k-1;

                                          // line 1
                                          lu.col(k).end(rrows) /=
        lu.coeff(k,k);

                                          // line 2

lu.corner(BottomRight,rrows,rsize).noalias() -=

                                         lu.col(k).end(rrows) *
                   lu.row(k).end(rsize);
                                         }

                                         where lu is the working matrix,
                   lu.col(k) is
                          assumed
                                         to return a "range vector" as I
                   described in my
                                         previous email.

                                         Here line 1 would be trivially
                   optimized (i.e.,
                                         vectorized) since
                   lu.col(k).end(rrows) is just
                          a small
                                         dense vector.

                                         Line 2 is an outer product which
                   again is
                                         trivially/automatically
        vectorized
                   sicne it is
                                         impelmented as a sequence of:
                          "col_range_vector_i -=
                                         scalar * col_range_vector".

                                         Here the locality is pretty good
                   because the vector
                                         "lu.col(k).end(rrows)" which is
                   reused multiple
                          times
                                         is sequentially stored in memory.

                                         But perhaps there exists a
        special
                   algorithm which
                                         perfectly fit your storage ?
        Is it
                   possible to see
                                         your code somewhere ?
        Finally, if it
                   is really more
                                         efficient then that would
        make sense
                   to have it
                          in Eigen.

                                     The benefit of using the NSK, i.e. of
                   storing the lower
                                     matrix in a column-oriented way,
        and the
                   upper
                          matrix in a
                                     row-oriented way is that in line 1 we
                   access contigous
                                     elements (the column k of the lower
                   matrix), and in
                          line 2
                                     we also access contiguous element
        (the
                   row elements
                          of the
                                     upper matrix). I don't know if it's
                   efficient for
                                     vectorization, but without, it is
        quite
                   efficient,
                          as the
                                     locality is very good.



                                         gael.

                                            So I don't think that our
                   algorithms share
                          common
                                         features with
                                            band matrix algorithms.
        They seem
                   closer to what
                                         can be done with
                                            diagonal matrices.

                                            Anyway, I've used the
                   SparseMatrix code as a
                          base
                                         for my
                                            SkylineMatrix code.
        Currently it only
                          supports very
                                         basic
                                            operations. I'll implement
        the LU
                   decomposition
                                         soon. I will also
                                            try to use vectorization.


                                            guillaume

                                                These are just some
        initial
                   thoughts and the
                                         discussion is
                                                very open!

                                                gael

                                                On Mon, Oct 19, 2009
        at 12:55 PM,
                          guillaume saupin

<guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>

<mailto:guillaume.saupin@xxxxxx <mailto:guillaume.saupin@xxxxxx>

                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>
                          <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>
                   <mailto:guillaume.saupin@xxxxxx
        <mailto:guillaume.saupin@xxxxxx>>>>>>> wrote:

                                                   Hello,

                                                   We are planning to use
                   your library
                          in our
                                         projects, but we
                                                need a
                                                   skyline matrix.
        Therefore
                   I'd like to
                                         implement one for
                                                eigen, but
                                                   I don't now where
        to start.

                                                   Is there a specific
        class
                   that can be
                          a good
                                         starting point /
                                                   skeleton to start
        with ? The
                                         SparseMatrixBase might be a
                                                good choice.
                                                   Should this
        SkylineMatrix
                   inherit from
                                         SparseMatrixBase, or
                                                be a
                                                   separate class ?

                                                   Thanks,

                                                   guillaume

Follow-Ups:
- Re: [eigen] Skyline matrix
  - From: Gael Guennebaud

References:
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: Gael Guennebaud
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: Gael Guennebaud
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: Gael Guennebaud
- Re: [eigen] Skyline matrix
  - From: guillaume saupin
- Re: [eigen] Skyline matrix
  - From: Gael Guennebaud

Messages sorted by: [ date | thread ]
Prev by Date: Re: [eigen] Skyline matrix
Next by Date: Re: [eigen] Skyline matrix
Previous by thread: Re: [eigen] Skyline matrix
Next by thread: Re: [eigen] Skyline matrix

Mail converted by MHonArc 2.6.19+

http://listengine.tuxfamily.org/