Re: [eigen] On a flexible API for submatrices, slicing, indexing, masking, etc. |
[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]
- much more costly on the compiler (need to parse a char* using compile-time meta programming)I believe integer user literals are fed a long long instead of char* by the compiler.
A more subtle but more important question though, is whether the "stop" argument in range(start,stop) should be inclusive (=last, as in MatLab, symmetric with the start argument) or exclusive (=end, as in C++ STL). I'm tempted to prefer the inclusive variant but since the exclusive one is more standard in C++, I'm puzzled...Yeah. I think we should use (first, last) convention, since those are used by the technical computing community in general. Even boost::range's irange use the (first, last) convention. Also, when negative step comes into question, do you want to write range(n-1, -1, step<-1>) as opposed to the simpler range(n-1, 0, step<-1>), which is also symmetrical to range(0, n-1, step<1>).
- The word "range" is overloaded much more than span. There's boost::range, the WIP range v3, and the "range-based" for loops. These big shots can have different conventions, some times [first, last) (range-based for), some times [begin, end) (boost's irange). For span, the only conflict is gsl::span right now (which should really be named array_view anyway, akin to std::string_view. It also takes in (base_ptr, length), so it's categorically different from our use).
- Thorough Eigen we use the word "size", so we should probably use Size<N>(n). Note the upper case "S" to convey that Size does not return the size of an object (as std::size), but rather constructs an object representing a size. Same for "step" -> "Step". This also reduces name collisions.Yes I understand Eigen uses "size". I have some slight reservations regarding using size here. As a member function name, it's fine. But as a global function, it reminds me too much of std::size(), and even sizeof(). When I see Size(n), my first mental response is always that it refers to the size of n.
There is also the lesser issue where matlab's definition of size is [height, width], but the Eigen already adopted the C++ convention so this issue is moot. The word "len" conveys the idea that it is 1D, e.g., string::length.
Alternatives are "count", "extent", "length" (spelled in full). I used "len" since it is the shortest.I actually liked the word "count" a lot, since it answers a small question at the back of my head: when step is >1, does the length parameter refer to the number of elements in the range, or the number of indices the range straddles across, i.e. (last - first + 1). "Count" unequivocally indicates the former, whereas "size", "extent", and "length" can be interpreted both ways. Another example of API with similar distinction is GL/Vulkan, where "size" is usually size in bytes, "extent" refers to (last - first + 1), and "count" is the number of elements.So I'll use "count" from now on.
Regarding capitalization, I kind of feel that "step" and "count" should be in lower case. Mixing extra capital letters in a short range _expression_ adds syntax noises. An _expression_ like M(all, span(1, count(10)) looks cleaner than M(all, span(1, Count(10))) to me. Also "step" is likely implemented as an overloaded free-standing function, similar to span. Free-standing functions in Eigen starts with lower case by convention, right? If you type "range" and "all" in lower case, why is "count" capitalized? Of course, you could argue that it is a special naming convention for "named argument", but I hate to memorize new conventions...
Actually, this leads to another of my axe to grind. :) I so much prefer the STL/Boost naming convention where basically everything is in snake_case, except (future) concepts and template argument are in PascalCase. Maybe add an suffix "_" for non-public member variables, and macros are of course still ALL_UPPERCASE, but that's all the rules. With a library that uses lots of generic programming, so many templated types are used as compile-time functions or variables. Distinguishing them as types using PascalCase just feels a bit wrong. Also, long descriptor identifier names (like MaxRowsAtCompileTime) look so much better and cleaner in snake case (max_rows_at_compile_time), especially when you come back to read old code months or years later. Anyway I can probably write an essay on this, so I'll stop the rambling (till Eigen 4.0 or std::eigen or boost::eigen, maybe? ;)
Another random idea: it is possible, using operator= overloading, to have the following syntax for named parameter:M(all, span(1, count = 10))See e.g., BOOST_PARAMETER_NAME. I kind of like this more than M(all, span(1, count(10))), though in this case, we cannot use the same identifier for a static count, since count is a variable and count<C> won't compile. Maybe span(1, count_ = 10) and span(1, count<10>)?
- I think we could allow unnamed steps without much ambiguity: range(i, j, s) and range(i, Size(n), s), in particular the second version is unambiguous.I totally agree.
We may or may not want support static indices. If we do, i and j can be supplied as compile time constants using the user suffix syntax, e.g., 3_c, as you described before. Alternatively, we can have the rule that indices can be supplied as template parameters to span. Consider:V(span<3, 6>) vs V(span(3_c, 6_c))V(span<3, 6>(step(s)) vs V(span(3_c, 6_c, step(s)))V(span<3, 6>(step<2>)) vs V(span(3_c, 6_c, step<2>))V(span<4>(len(n)) vs V(span(4_c, len(n))V(span<6>(len<8>, step(s))) vs V(span(6_c, len<8>, step(s));All except the first row look too noisy to me, so maybe we should just not support static indices...If we go with the span/len/step (or rather range/Size/Step) approach and want compile-time indices, then the left column seems preferable to me. I see different use cases:1- for sparse matrices, knowing that start=0 or start=last (with step=-1) is useful. For this we could imagine something else:range("",j)range(Eigen::last,j,Step<-1>)We need a Eigen::last or Eigen::end "keyword" anyway to index relatively from the end. Using "" as a default start=0 might sounds odd, but it has the advantage of not having to introduce one more "keyword" Eigen::zero or Eigen::first.I feel strongly that using "" to mean "begin" or "all" is a step towards obscurity. It's like those operator abusing math libraries where "&" means dot product and "^" means cross product. It feels clever when writing, but hampers compression when you come back to read it months later. Hell, if we had to commit that sin, I feel using the underscore "_" would be a better replacement for "all", i.e., M(_, i) = N(_, i).
I currently don't have a better proposal if we want to support static index and dynamic count/step... Need to think about it.2 - Knowing the start at compile-time might also help to compute alignment, especially in the common case of start==0, which can be handled as in the previous point. Of course, in theory for that purpose we are rather interested in knowing whether start is a multiple of some power of two numbers... but that's too far fetched!3 - Knowing both start and stop are known at compile-time to figure-out the size at compile-time, for this we only need:range<START,STOP>([optional step]);range<START,STOP>(start,stop[,optional step]); that would be aliases for range(start, Size<STOP-START(+1?)>(stop-start(+1?)) [,optional step]);gael
Mail converted by MHonArc 2.6.19+ | http://listengine.tuxfamily.org/ |