Re: [eigen] Sparse Arrays for Eigen?


Hi Elizabeth,
   Just a quick point.
    I think you would impress everyone if you wrote up a speed comparison between SpSparse and Eigen::Tensor for increasingly large tensor sizes and different degrees of sparsity. That would show the crossover point where SpSparse has greater performance than Eigen::Tensor, similar to these old performance charts:


On Thursday, January 14, 2016 5:16 AM, Elizabeth Fischer <rpf2116@xxxxxxxxxxxx> wrote:

Hello Christoph,

SpSparse is quite different from Eigen::Sparse in a number of ways, and therefore interesting:

1. I would think of spsparse::Array vs. Eigen::Sparse in the same vein as Eigen::Tensor vs. Eigen::Matrix.  spsparse::Array is desirable for the same reasons as Eigen::Tensor.

2. Internally, spsparse::Array is (currently) a coordinate tensor: i.e., it has one array per index dimension, plus an array for values.  If nothing else, this is a convenient way to assemble the triplets needed to construct an Eigen::SparseMatrix or Eigen::SparseVector (e.g., via setFromTriplets()).  SpSparse is also good at consolidating duplicate entries in your original list.
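To make the coordinate layout concrete, here is a minimal sketch of a rank-2 coordinate (COO) store with duplicate consolidation. The names (CooMatrix, consolidate) are hypothetical, for illustration only -- this is not SpSparse's actual interface:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Rank-2 coordinate storage: one index array per dimension, plus values.
struct CooMatrix {
    std::vector<int> rows, cols;
    std::vector<double> vals;

    void add(int i, int j, double v) {
        rows.push_back(i); cols.push_back(j); vals.push_back(v);
    }

    // Sort entries row-major and sum duplicates, mimicking the
    // consolidation of repeated triplets described above.
    void consolidate() {
        std::vector<std::size_t> order(vals.size());
        for (std::size_t k = 0; k < order.size(); ++k) order[k] = k;
        std::sort(order.begin(), order.end(),
                  [&](std::size_t a, std::size_t b) {
            return std::make_pair(rows[a], cols[a])
                 < std::make_pair(rows[b], cols[b]);
        });
        std::vector<int> r2, c2;
        std::vector<double> v2;
        for (std::size_t k : order) {
            if (!r2.empty() && r2.back() == rows[k] && c2.back() == cols[k])
                v2.back() += vals[k];   // duplicate index: accumulate
            else {
                r2.push_back(rows[k]);
                c2.push_back(cols[k]);
                v2.push_back(vals[k]);
            }
        }
        rows.swap(r2); cols.swap(c2); vals.swap(v2);
    }
};
```

Once consolidated, the three parallel arrays are exactly the triplet list a sparse-matrix constructor expects.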

3. Algorithms are written on SpSparse arrays by iterating through the elements.  Algorithms are implemented on multiple arrays together by sorting and joining the elements.  This is efficient for many operations, for example sparse-sparse tensor multiplication.  You cannot directly ask a SpSparse array for the (i,j) element.
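A minimal sketch of that sort-and-join idea (illustrative code, not the SpSparse API): two index-sorted sparse vectors scanned in lockstep, combining values wherever their indices match -- here, a sparse dot product:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Dot product of two sparse vectors, each stored as an index-sorted
// list of (index, value) pairs.  The "join" advances whichever side
// has the smaller index, so it never asks for an (i) element directly.
double sparse_dot(const std::vector<std::pair<int, double>>& a,
                  const std::vector<std::pair<int, double>>& b) {
    double sum = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first)      ++i;  // index only in a: skip
        else if (b[j].first < a[i].first) ++j;  // index only in b: skip
        else {                                  // indices match: combine
            sum += a[i].second * b[j].second;
            ++i; ++j;
        }
    }
    return sum;
}
```

The same lockstep scan generalizes to element-wise sums, intersections, and the row-by-column products discussed below in the thread.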

4. SpSparse makes no effort to offer general linear algebra.  Algorithms are focused on tensor representation, transformation and multiplication.  Believe it or not, my application requires lots of sparse tensors and tensor multiplication, and no "more sophisticated" linear algebra algorithms.  I think the reasonable thing to do would be to convert a SpSparse array to an Eigen::Matrix if one wishes to do more linear algebra on it.

5. SpSparse has the capability of supporting multiple storage formats underneath: variable-length std::vector, fixed-length through blitz::Array (which would be ported to Eigen::Tensor).  A single array of triplets (or their rank-n analog) is another possible format.  Storage format is abstracted away by the iterator interface, allowing algorithms to be written once for all these use cases.  The Eigen::Tensor-backed representation is particularly useful because you can make a spsparse::Array out of existing blocks of memory (for example, data you read out of a netCDF file).  [Disclaimer: I've only implemented one underlying representation.  The design allows for multiple, and a prototype library did just that.  Adding more storage formats will be easy when desired.]

6. CSR and CSC formats are also possible as storage formats.
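For illustration, a CSR (compressed sparse row) layout can be built from row-sorted triplets with a counting pass plus a prefix sum. The names here (Csr, to_csr) are hypothetical, not part of SpSparse:

```cpp
#include <vector>

// Standard CSR layout: row_ptr[i]..row_ptr[i+1] delimit the entries
// of row i within the col/val arrays.
struct Csr {
    std::vector<int> row_ptr;
    std::vector<int> col;
    std::vector<double> val;
};

// Builds CSR from triplets that are already sorted row-major.
// nrows is the matrix row count.
Csr to_csr(int nrows, const std::vector<int>& rows,
           const std::vector<int>& cols, const std::vector<double>& vals) {
    Csr m;
    m.row_ptr.assign(nrows + 1, 0);
    for (int r : rows) ++m.row_ptr[r + 1];        // count entries per row
    for (int i = 0; i < nrows; ++i)               // prefix sum -> offsets
        m.row_ptr[i + 1] += m.row_ptr[i];
    m.col = cols;                                  // already row-sorted,
    m.val = vals;                                  // so copy directly
    return m;
}
```

CSC is the mirror image: sort column-major and compress the column index instead.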

7. Looking at Eigen's sparse matrix multiplication source code, my guess is that SpSparse does this faster.  SpSparse multiplies sparse matrices by sorting the LHS row-major and the RHS column-major, and then multiplying each row by each column.  Each row-column multiplication is accomplished with a "join" iterator that scans through both sides in order, finding places where the indices match.  This is a direct analog of dense matrix multiplication.  SpSparse multiplication should be faster because it has less nested looping.  I realize that the required sorting might not be desirable or possible for all use cases.

SpSparse join iterators can also be used to do many operations at once.  I was able to make a single-pass matrix multiplication that computes:
            C * diag(S1) * A * diag(S2) * B * diag(S3)
...where C is a scalar, S1,S2,S3 are (optional) spsparse::Array<..., 1> and A and B are spsparse::Array<..., 2>.  If this is the kind of product you need to compute, I doubt there is a faster way to do it.
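The row-by-column join multiplication described above can be sketched as follows. This is illustrative code under assumed names (SparseVec, sparse_mul), not SpSparse's implementation: the LHS is grouped into rows sorted by column index, the RHS into columns sorted by row index, and each row-column pair is reduced with a single merge-join pass:

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

typedef std::pair<int, double> Entry;       // (inner index, value)
typedef std::vector<Entry> SparseVec;       // index-sorted

// Merge-join dot product of two index-sorted sparse vectors.
static double join_dot(const SparseVec& a, const SparseVec& b) {
    double s = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first)      ++i;
        else if (b[j].first < a[i].first) ++j;
        else { s += a[i].second * b[j].second; ++i; ++j; }
    }
    return s;
}

// lhs_rows[r] holds row r of the LHS (sorted by column index);
// rhs_cols[c] holds column c of the RHS (sorted by row index).
// Each output entry is one merge-join pass -- the direct analog of
// dense row-times-column multiplication.
std::map<std::pair<int, int>, double> sparse_mul(
        const std::map<int, SparseVec>& lhs_rows,
        const std::map<int, SparseVec>& rhs_cols) {
    std::map<std::pair<int, int>, double> out;
    for (std::map<int, SparseVec>::const_iterator lr = lhs_rows.begin();
         lr != lhs_rows.end(); ++lr)
        for (std::map<int, SparseVec>::const_iterator rc = rhs_cols.begin();
             rc != rhs_cols.end(); ++rc) {
            double v = join_dot(lr->second, rc->second);
            if (v != 0.0)
                out[std::make_pair(lr->first, rc->first)] = v;
        }
    return out;
}
```

Folding the diagonal scalings diag(S1), diag(S2), diag(S3) into the same pass would amount to an extra join against each weight vector inside join_dot.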

8. SpSparse can also multiply sparse matrices (rank-2 tensors) by sparse vectors (rank-1 tensors) --- something I needed that Eigen did not provide in a direct way.  More complex tensor multiplications would also be possible.

9. I've found that rank-1 SpSparse arrays are useful for many things I never thought of before.

For all these reasons, I think that SpSparse might make a nice addition to Eigen, probably renamed to Eigen::SparseTensor or something.

-- Elizabeth

On Thu, Jan 14, 2016 at 5:18 AM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Eigen's support for sparse matrices and vectors should be quite good. Have a look at these pages for an overview:
There is also some functionality to let users iterate through non-zero elements themselves.

I don't think there are concrete plans at the moment for sparse arrays (i.e., ones that provide element-wise products), nor for multidimensional sparse tensors. If that is what you are looking for, we may consider incorporating this -- could you explain a bit more specifically which operations you want?


On 2016-01-13 23:04, Elizabeth Fischer wrote:

I noticed that Eigen is getting support for dense tensors (or arrays, if
you will), but its support for sparse structures is limited to somewhat
specialized matrices and vectors.  I recently wrote a general sparse array
library, and used that to implement efficient sparse-sparse multiplication
(matrix-matrix, matrix-vector, etc).  I use it in my project, where most
data structures can be thought of as a sparse array.  The core idea is to
implement sparse array operations based on iterators that iterate through
the non-zero elements (and to expose this framework to the user).

I'm inquiring whether there might be interest in incorporating this code in
Eigen.  The code as it currently stands is at:

-- Elizabeth

 Dipl. Inf., Dipl. Math. Christoph Hertzberg

 Universität Bremen
 FB 3 - Mathematik und Informatik
 AG Robotik
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

 Switchboard: +49 421 178 45-6611

 Visitor address (branch office):
 Robert-Hooke-Straße 5
 28359 Bremen, Germany

 Tel.:    +49 421 178 45-4021
 Reception: +49 421 178 45-6600
 Fax:     +49 421 178 45-4150
 E-Mail:  chtz@xxxxxxxxxxxxxxxxxxxxxxxx

 Further information:
