Re: [eigen] Sparse Arrays for Eigen?

Christoph,

Thank you for your informative reply. It looks like maybe the Eigen code is ahead of its documentation, in terms of capability. For example:

> You can directly multiply SparseMatrix by SparseVector in Eigen

I see no examples of this at:

http://eigen.tuxfamily.org/dox-devel/group__TutorialSparse.html

> If you have two index arrays and a value array, you can directly use them in an Eigen::SparseView

Where? It's not here:

http://eigen.tuxfamily.org/dox-devel/namespaceEigen.html

> Eigen also supports, CSR, CSC and triplet format (stored as three vectors).

> A single vector of triplets can easily be converted by Eigen. Also, most algorithms

> expect CSR or CSC inputs.

That's really nice functionality. But again... where is it? The following only mentions CSR and CSC:

http://eigen.tuxfamily.org/dox-devel/group__TutorialSparse.html

Had I been able to find this information on the website in a usable way, I probably wouldn't have written SpSparse. The general tensor stuff is nice, but it's not make-or-break for my current application.

> If I understand you correctly, spsparse::Array relates to Eigen::SparseMatrix as

> Eigen::Tensor to Eigen::Matrix, right?

Yes.

> Ok, by tensor multiplication you mean element-wise multiplication or

> something like
> Res(i,j,k) = SUM_l A(i,j,l)*B(i,l,k); (or other variants)?

The latter.

> If I understand you correctly, your library supports both?

It's a toolbox that can be used to easily write multiplication-like operations easily. What I needed was:

C * diag(S1) * A * diag(S2) * B * diag(S3)

... where S1, S2 and S3 are optional. I also needed:

C * diag(scalei) * A * diag(scalej) * V # V is a vector

So I built those two multiplies. Maybe someone could be more clever to write a general n-dimensional tensor multiply template.

-- Elizabeth

On Mon, Jan 18, 2016 at 12:20 PM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Hi Elizabeth,

sorry for not having answered yet. I try to briefly answer/comment your questions/suggestions below.

On 2016-01-14 14:15, Elizabeth Fischer wrote:

1. I would think of spsparse::Array vs. Eigen::Sparse in the same vein as
Eigen::Tensor vs. Eigen::Matrix. spsparse::Array is desirable for the same
reason as Eigen::Matrix.

If I understand you correctly, spsparse::Array relates to Eigen::SparseMatrix as Eigen::Tensor to Eigen::Matrix, right?

2. Internally, spsparse::Array is (currently) a coordinate tensor: i.e.,
internally it has one array per index, plus an array for values. If
nothing else, this is a convenient way to assemble the required triplets
that can then be used as a constructor for Eigen::Matrix or Eigen::Vector.
SpSparse is also good at consolidating duplicates in your original list.

If you have two index arrays and a value array, you can directly use them in an Eigen::SparseView and Eigen can already convert that to CSC/CSR or to dense matrices. Of course, this only works for two dimensions at the moment..

3. Algorithms are written on SpSparse arrays by iterating through the
elements. Algorithms are implemented on multiple arrays together by
sorting and joining the elements. This is efficient for many operations,
for example sparse-sparse tensor multiplication. You cannot directly ask a
SpSparse array for the (i,j) element.

Ok, by tensor multiplication you mean element-wise multiplication or something like
Res(i,j,k) = SUM_l A(i,j,l)*B(i,l,k); (or other variants)?

If I understand you correctly, your library supports both?

4. SpSparse makes no effort to offer genera linear algebra. Algorithms are
focused on tensor representation, transformation and multiplication.
Believe it or not, my application requires lots of sparse tensors and
tensor multiplication, and no "more sophisticated" linear algebra
algorithms. I think the reasonable thing to do would be to convert a
SpSparse array to an Eigen::Matrix if one wishes to do more linear algebra
on it.

I guess, it should rather be converted an Eigen::SparseMatrix?

5. SpSparse has the capability of supporting multiple storage formats
underneath: variable-length std::vector, fixed-length through blitz::Array
(would be ported to Eigen::Tensor). Also a single array of triplets (or
their rank-n analog) is also a possible format. Storage format is
abstracted away by the iterator interface, allowing algorithms to be
written for these multiple use cases. The Eigen::Tensor underlying is
particularly useful because you can make a spsparse::Array out of existing
blocks of memory (for example, stuff you read out of a netCDF file).
[Disclaimer: I've only implemented one underlying representation.. The
design is for multiple, and a prototype library did just that. Adding more
storage formats will be easy when desired.]

6. CSR and CSC formats are also possible as storage formats.

Eigen also supports, CSR, CSC and triplet format (stored as three vectors). A single vector of triplets can easily be converted by Eigen. Also, most algorithms expect CSR or CSC inputs.

7. Looking at Eigen's sparse matrix multiplication source code, my guess is
that SpSparse does this faster. SpSparse multiples sparse matrices by
sorting LHS row-major and RHS column-major, and then multiplying each row
by each column. Each row-column multiplication is accomplished with a
"join" iterator that scans through both sides in order, finding places
where indices match. This is a direct analog to dense matrix
multiplication. SpSparse multiplication would be faster because it has
less nested looping. I realize that the required sorting might not be
desirable or possible for all use cases.

I'm not sure if that is really faster (except for special cases) -- I assume you'll end up with much more branching.
Actually, I once thought the same when calculating a J^T*J product, but it turned out that transposing J and and doing a classical sparse multiplication was faster (I did not check your implementation, so maybe I introduced some overhead in my implementation, back then).

SpSparse join iterators can also be used to do many operations at once. I
was able to make a single-pass matrix multiplication that computes:
C * diag(S1) * A * diag(S2) * B * diag(S3)
....where C is a scalar, S1,S2,S3 are (optional) spsparse::Array<..., 1> and
A and B are spsparse::Array<..., 2>. If this is the kind of product you
need to compute, I doubt there is a faster way to do it.

That might be faster in some cases (i.e., depending on the structure of A, B), though I wouldn't trust my gut-feeling on that either.
But since Eigen has no optimization for sparse diagonal matrices, this is not unlikely to be faster with SpSparse.

7. SpSparse can also multiply sparse matrices (rank 2 tensors) by sparse
vectors (rank 1 tensors) --- something I needed that Eigen did not provide
in a direct way. More complex tensor multiplication would also be possible.

You can directly multiply SparseMatrix by SparseVector in Eigen

8. I've found that rank-1 SpSpare arrays are useful for many things I never
thought of before.

There are SparseVectors in Eigen, if you need them.

For all these reasons, I think that SpSparse might make a nice addition to
Eigen, probably renamed to Eigen::SparseTensor or something.

Ok, I guess essentially what is missing in Eigen and what SpSparse provides are SparseTensor i.e. multidimensional sparse Arrays.
What could be added (relatively) easy are element-wise products for sparse matrices -- if that is in fact needed.
For performance improvements, I'd like to see benchmark results first.

Regards,
Christoph

--
Dipl. Inf., Dipl. Math. Christoph Hertzberg

Universität Bremen
FB 3 - Mathematik und Informatik
AG Robotik
Robert-Hooke-Straße 1
28359 Bremen, Germany

Zentrale: +49 421 178 45-6611

Besuchsadresse der Nebengeschäftsstelle:
Robert-Hooke-Straße 5
28359 Bremen, Germany

Tel.: +49 421 178 45-4021
Empfang: +49 421 178 45-6600
Fax: +49 421 178 45-4150
E-Mail: chtz@xxxxxxxxxxxxxxxxxxxxxxxx

Weitere Informationen: http://www.informatik.uni-bremen.de/robotik