Re: [eigen] Iterators with dense matrices: (was: Areas of eigen3 variat

On Mon, Jan 4, 2010 at 9:23 AM, Hauke Heibel <hauke..heibel@xxxxxxxxxxxxxx> wrote:

On Mon, Jan 4, 2010 at 2:59 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:

Thanks a lot for the explanation.

Is this completely specific to uBlas or are there other Boost
libraries standardizing on the same interface? uBlas alone is not

A few general thoughts on this stuff.

Iteration on vectors is a no-brainer... follow the standard library.
I would say that iterators for matrices (not vectors) in uBlas are very counterintuitive and not really proper iterators. They are more like cursors through the matrix (to be discussed later).
Iteration through matrices is much more important for sparse algorithms than dense ones. The issue there is skipping non-zeros rather than vectorization of operations.
Many on the ublas list are unhappy with the naming, etc. for the matrix iterators, so I wouldn't try to achieve consistency
The iteration approach in ublas makes it very hard to integrate with multi-dimensional array/tensor libraries since it is tied to 2 dimensions. boost::multi_array has a differerent approach.
It seems to me that MTL4 has a better iteration/cursor approach than ublas and may be a better model to match.
The GLAS (an intended generic interface for linear algebra with various backends) guys and MTL guys are sometimes talking so looking at those is a good idea. Though GLAS has a long way to go of course, and may never be.

So my suggestions would be:

1) Iteration for Vectors/slices of matrices/etc.:

To me this is the first-order concern and prevents me from using Eigen. I have a lot of code that uses vectors and multi_arrays for interpolation, etc. and I can't live without std::binary_search on the vector to bracket results efficiently
The specifications for this are easy: http://www.sgi.com/tech/stl/RandomAccessContainer.html
I think it is reasonable to drop a couple of the requirements on the Concept here, but I am not sure if it is necessary.
If it was possible to patch this to Eigen2, then I might be able to convert over.

A few 2nd order comments on this:

Since the standard library will have ranges soon enough to make the libraries more usable, I think that it may be worth designing ahead to ensure range compatibility. This may involve little or no change to the interfaces as ranges are intended to be orthogonal http://www.boost.org/doc/libs/1_41_0/libs/range/doc/range.html#bidirectional_range
Also I use std::transform all over the place... it would be great if there was an easy Eigen equivalent: eigen::transform(my_eigen_container, my_eigen_container_out, my_unary_functor()). If this can use your reduction methods and vectorization, that would be really awesome.

2) Separate Iterators vs. Cursors

With ublas, you don't really have iterators for matrices. It is more like a cursor through the matrix. If go:

auto iter1 = get matrix.begin1();

This is not an iterator on the rows on the matrix contrary to the docs. Dereferencing this iterator gives you the value of the current "cursor" which would be the first element of the row at first.

The intention of the begin1(), etc. is to nest iterators for traversing the matrix (see pauls code), so they are useless on their own as far as I can tell. This caused me much grief. When I was forced to traverse through a dense matrix, I ended up just accessing with "(i, j)" because of frustractions.

I think this is more of a cursor through the matrix. I think that MTL4 should be the winner for what interface to emulate: http://www.osl.iu.edu/research/mtl/mtl4/doc/iteration.html

3) Cursors for Sparse Matrices

This is getting above my pay grade... I hope Paul or someone who does a lot of sparse algorithms can correct it.

Where cursors/iterators are essential is with sparse algorithms. The reason is that you need to go through only the non-zero elements. This can't really be vectorized in the same was as the other Eigen routines, and I don't know of any other way to do it than the cursor or iterator approach. Since the MTL guys are mostly interested in sparse algorithms, I think that their approach is likely the right one.

4) Iterators for Matrices and Tensors

Assuming that Eigen moves towards having multidimensional structures, I think that the iteration should probably follow a model like boost::multi_array, except provide the normal linear algebra interfaces when the dimension drops to 2 or 1.

In multi_array, there is naturally no "begin1, begin2". As an example for a 3d structure like: boost::multi_array<double, 3>

I believe what happens when I call "begin()" is that it provides an iterator through the first dimension of the boost::multi_array<double, 3>
If I dereference this, I get some something like: multi_array_ref<double, 2>, dropping the leading dimension.
I can then get an iterator on this which dereference to a multi_array_ref<double, 1>.
This then acts as a STL compliant random access container. i.e. if I call begin() on this, it finally will dereference to the value_type, a "double" here.

This is exactly the sort of iteration that you need for many recursive algorithms in multi-dimensions. For example, I need it for a multi-dimensional cubic spline

(Here is someones implementation here uses nested std::vectors and recursively gets iterators. http://quantlib.svn.sourceforge.net/viewvc/quantlib/trunk/QuantLib/ql/math/interpolations/multicubicspline.hpp?view=markup

Of course, it would be much nicer to be able to get iterators over something other than just the trailing dimension. e.g.

Eigen::multi_array<double, 3> arr;

multi_array_ref<double, 2> arr.begin<1>(); //First dimension, compile time.

multi_array_ref<double, 2> arr.begin<tag::row>(); //row is a synonym for 1st dim

multi_array_ref<double, 2> arr.begin<tag::major>(); //iterates over the major dimension, by storage ordering/complexity.

//and then you can even give the option of traversing all vs. non-zeros for sparse matrices.

multi_array_ref<double, 2> arr.begin<tag::major, tag::non_zeros>(); //iterates over the major dimension, by complexity.

.... note that this multi_array_ref here should behave like a eigen::matrix concept. This alone would be amazing as there is no interoperability between boost::multi_array and boost::ublas, much to my chagrin.

This all may sound tricky and unncessary until multi_array's are available, but I think one needs to think through the interface and how it would be generalized to future-proof the iterator interface for matrices/vectors.

Thanks,

Jesse