Re: [eigen] On a flexible API for submatrices, slicing, indexing, maskin

On Fri, Dec 23, 2016 at 4:05 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

On Thu, Dec 22, 2016 at 5:29 PM, Yuanchen Zhu <yuanchen.zhu@xxxxxxxxx> wrote:
Just want to throw this out to see if people are interested.

Thanks for sharing. Would it be ok for you to share your code?

I have made the stripped down code available at: https://github.com/yuanchen-zhu/eig

It probably won't compile since I might have missed a few namespace references as I was doing the stripping.

Besides adding a unified subblock access syntax, it has a bunch of other things that I'd love to see in Eigen (though realistically some will probably never be adopted). See the README for details.

The main implementation files for subblock access are "plugins/eigen_densebase_plugin.hpp" and "eig.hpp".

The main trick worth mentioning is that first(2) produces a struct, whereas first<2> produces a (templated) function pointer. By dispatching on the argument type, I can support both, as I hate to write first<2>(). I learned this trick from the recent C++ proposal on using in_place_t for optionals and variants.
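A minimal sketch of this dispatch follows. It is not the code from the repository: the names FirstDyn, FirstFixed and block_size are hypothetical and only illustrate how the two spellings can be told apart by type.

```cpp
// Hypothetical tag types; names are illustrative, not taken from the repository.
struct FirstDyn { int n; };             // carries a run-time count
template<int N> struct FirstFixed {};   // carries a compile-time count in its type

// first(2): an ordinary call that returns a struct.
constexpr FirstDyn first(int n) { return FirstDyn{n}; }

// first<2>: does not need to be called. The bare name decays to a function
// pointer of type FirstFixed<2>(*)(), so N is still visible in the type.
template<int N> constexpr FirstFixed<N> first() { return FirstFixed<N>(); }

// Consumer side: overloads dispatch on the argument type.
constexpr int block_size(FirstDyn f) { return f.n; }
template<int N> constexpr int block_size(FirstFixed<N> (*)()) { return N; }
```

With these definitions, block_size(first(2)) selects the run-time overload and block_size(first<3>) selects the function-pointer overload, with N deduced from the pointer type. Nothing here requires more than C++11.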

In a similar vein, I've used a C++14 feature to write an integral constant concisely as c<13>:

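// Compile-time integer wrapped in a type; converts implicitly to int.
template<int X> struct Index_c {
  static const int value = X;
  operator int() const { return value; }
};

// C++14 variable template: c<13> is an object of type Index_c<13>.
template<int X>
static const Index_c<X> c{};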

But that's C++14 only, and you cannot have both c<compile_time> and c(runtime), so your trick seems more flexible, though trickier on the implementation side. Is it C++11 compatible?

Not sure about C++11. I mainly use the <type_traits> header. I am not aware of any C++14-specific features that I'm using.

I'm currently experimenting with an API like:

range(start,stop); // step==1
range(start,stop,step); // run-time step
range(start,stop,c<STEP>); // compile-time step

span(start,len); // step==1
span(start,len,step); // run-time step
span(start,len,c<STEP>); // compile-time step
span(start,c<LEN>); // compile-time length and step==1
span(start,c<LEN>,c<STEP>); // compile-time length and step
span(start,c<LEN>,step); // compile-time length and runtime step

And the usage remains the same, e.g.:

B = A(range(...), span(...));
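As a rough illustration of how the span overloads listed above could keep compile-time information while the argument order stays fixed, here is a C++14 sketch. The Span struct and its members are hypothetical names and not Eigen's implementation; the idea is simply that each argument is either an int or an Index_c<N>, and the result type records which.

```cpp
// Compile-time integer, as in the snippet earlier in this message.
template<int X> struct Index_c {
  static constexpr int value = X;
  constexpr operator int() const { return value; }
};
template<int X> constexpr Index_c<X> c{};

// Hypothetical result type: LenT and StepT are either int (run-time)
// or Index_c<N> (compile-time), so fixed values stay in the type.
template<class LenT, class StepT>
struct Span {
  int start;
  LenT len;
  StepT step;
  constexpr int size() const { return len; }
  constexpr int operator[](int i) const { return start + i * step; }
};

// span(start, len, step): any mix of run-time and compile-time len/step.
template<class LenT, class StepT>
constexpr Span<LenT, StepT> span(int start, LenT len, StepT step) {
  return {start, len, step};
}

// span(start, len): step defaults to a compile-time 1.
template<class LenT>
constexpr Span<LenT, Index_c<1>> span(int start, LenT len) {
  return {start, len, {}};
}
```

For example, span(2, c<4>, c<3>) has type Span<Index_c<4>, Index_c<3>> and enumerates 2, 5, 8, 11, while span(2, 4, 3) enumerates the same indices with both values stored at run time.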

Some remarks:

The key advantage here is that the argument order never changes! For the "range" case, it would be ok to write range<STEP>(start,stop), but for the "span" case, since the length may also need to be defined at compile time, this becomes unmanageable.

Another advantage compared to the demo on the wiki is that the "bounds-based" and "length-based" variants are similar, with no odd API like the iota(len) stuff... Of course, this is also a drawback because there might be some naming confusion between 'range' and 'span'. It might not be 100% obvious that one is based on 'bounds' and the other on a 'length', but here is the rationale:
- 'range' is (for me) more related to the notions of interval, limits, gamut, etc. that are naturally defined by their 'bounds'.
- 'span' is related to the notion of period of time, distance, width, extent, etc. and thus the notion of 'length' here.

Compared to the demo on the wiki page, here the 'step' is moved to the last argument. This is not MATLAB friendly, but in C/C++ optional arguments go last, so this makes more sense.

Another issue is that this approach is very compact only if we agree to define an Eigen::c, and if the user imports it into the current scope and uses C++14 (perhaps C++11 with Yuanchen's trick?). Otherwise it can become as verbose and unreadable as:

Eigen::span(start, Eigen::Index_c<LEN>(), Eigen::Index_c<STEP>())

Finally, we also have to decide whether the 'stop' argument should be an inclusive or an exclusive upper bound... To figure this out, I'll prepare a set of examples to see what's the most convenient. My intuition is that even though we are used to the STL's exclusive 'end', an inclusive upper bound would be more symmetric with the inclusive lower bound, and thus indexing from the end should be easier...
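To make the two conventions concrete, here is a small sketch of how the number of selected indices would be computed in each case. The helper names len_inclusive and len_exclusive are hypothetical, and a positive step is assumed.

```cpp
// Number of indices start, start+step, ... that a range would select.
// Assumption: step > 0. Helper names are illustrative only.

// 'stop' is the last index that may be selected.
constexpr int len_inclusive(int start, int stop, int step = 1) {
  return stop < start ? 0 : (stop - start) / step + 1;
}

// 'stop' is one past the last index that may be selected (STL style).
constexpr int len_exclusive(int start, int stop, int step = 1) {
  return stop <= start ? 0 : (stop - start + step - 1) / step;
}
```

For a vector of size n, the last three entries would be range(n-3, n-1) with an inclusive bound and range(n-3, n) with an exclusive one; both select 3 indices.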

OK, one more: with this approach we could easily enable compile-time start/stop with range to figure out the length at compile time: range(c<START>, c<STOP>), but I don't really see the need for it, as IMO if the size can be known at compile time, then you probably know it better than the bounds, especially if you have to think about whether the upper bound is inclusive or exclusive.

What do you all think about it?

gael