[eigen] [Review] Pull request 66, Huge Tensor module improvements
Hi there,
(Warning: This is going to be a long mail. tl;dr: I've reviewed your
changes and I have a lot of comments. Nevertheless, apart from two
issues - one a class name change, the other some things with impact on
the rest of Eigen to discuss - I think these changes should be merged,
and development and improvements should continue from that starting point.)
> I have been working on extending the Tensor class quite a bit in
> http://bitbucket.org/benoitsteiner/eigen and I am reaching the point
> where the code is almost ready to be useful.
Oh, that's unexpected. ;-)
So first of all, thanks for the interest and your willingness to help
improve my little addition to Eigen. I'm glad I'm not alone when it
comes to people interested in the tensor module.
But just for the future: next time could you perhaps tell me in advance
you're working on something? It's a good thing I only put together some
ideas so far but didn't actually start coding something yet, otherwise
we'd have duplicated a lot of work.
I've taken a bit of time to review the changes you've made (not 100% of
them, but I've looked at most of it).
Generally speaking: I *really* appreciate that you've spent a LOT of
time implementing all this tedious, repetitive stuff and that now there
are quite a few features in there. Thank you!
That said, there are a lot of details which I think should be improved.
I personally took the approach of first writing stuff down in the wiki
and *then* wanting to implement it, because I wanted to get these kinds
of details right in the first place. On the other hand, I have no code
to show here and you have implemented several thousand lines, so maybe
your approach of first doing something and then worrying about the
details is the better one. ;-)
Anyway, here's a list of things I think should be improved (sorry it's
so much at once, I don't mean to discourage you, but you sprung O(1000)
LOC on me just a few days ago without warning. ;-)):
1. C++03 compatibility emulation. I see that you went down that rabbit
hole and came out alive and well, where I thought it would be far
too complicated to emulate that with lots of variants of templates.
Seeing your code, I actually see it's quite doable and not really
that difficult. But I think there are some things that can be
improved:
a. Since Tensor/ does not necessarily need C++11 anymore, we should
probably move it to unsupported/Eigen/Tensor. I don't see you
implementing TensorSymmetry/ in C++03 however, so I think that
will have to stay under unsupported/Eigen/CXX11/.
b. Currently, since you detect the C++ version to decide which header
to include, this breaks people (read: me ;-)) using
TensorSymmetry/ if they accidentally compile it with C++ <= 03 -
instead of getting a nice error message, they get lots of
substitution errors, because the C++11 version of the header is
not included.
I don't think detection and emulation is the wrong way to go, but
maybe the best thing is (after doing (1a)) to just write a header
CXX11/CoreOrEmulated that includes CXX11/Core or CXX11/Emulated
depending on the compiler version. Tensor/ would then include
CXX11/CoreOrEmulated, but TensorSymmetry/ would include
CXX11/Core.
(Obviously, the tensor tests should then be named differently.)
c. Eigen::array is a bad name decision for emulating std::array,
since array kind-of-clashes with Eigen::Array - it's going to be
hell for people using Eigen's traditional capital-A Array
template if they accidentally have a typo.
Maybe Eigen::std_array instead? With a macro EIGEN_STD_ARRAY that
maps to either std::array (C++11) or Eigen::std_array?
Btw.
template <typename T, std::size_t N> using array =
std::array<T, N>;
That requires g++ 4.7 or 4.8, my code (even the crazy template
stuff for TensorSymmetry) supports g++ 4.6 and I'd prefer to keep
it that way. (Some older computing clusters still have only g++
4.6 available.)
Btw2. In array_apply_and_reduce etc. you use a for loop instead of
recursion. From my experience, compilers do not automatically
tend to unroll this loop, making it difficult for the optimizer
to generate optimal code here. EIGEN_STRONG_INLINE + recursion is
better in my experience.
d. I don't like the Sizes<> template. I think it's better to just
specify the sizes for fixed size tensors directly.
Example:
Tensor<double, 3, /* Options = */ 0, 3, 3, 3> epsilon;
(3x3x3 tensor)
Compare to:
Matrix<double, 3, 3> foo;
(3x3 matrix)
Of course, we have to put Options before the variadic arguments
here, so there is some difference, but I still think this is more
consistent with the rest of Eigen here.
e. Nitpicking: you support up to 5 indices in C++03, but IIRC Fortran
supports 7, so maybe we should match that?
f. Unsure: enable_if could probably be used instead of a lot of
static asserts. OTOH, maybe using assertions is not the worst
idea here; I'm not sure what the best course of action is.
2. GPU / Threading support: So you probably implemented this here,
because you needed it yourself. I'm all for it, and the
EIGEN_DEVICE_FUNC additions definitely should go in (they're in
Eigen/Core after all). And while I think it's really cool what you
did there, with .device() and all that, I think this is something
that affects all of Eigen and I wouldn't want to add this to the
Tensor module without first having a more general discussion about
it.
Btw. do you know how your own thread pool implementation compares to
a simple openmp parallelization (-fopenmp with g++ and #pragma omp
parallel for in front of the assignment loop for example)? I've
never tried std::launch::async so far and I have no idea how much
overhead it has.
3. General class structure:
a. I don't like two classes TensorFixedSize and Tensor. I think
there should be only one class, like Matrix or Vector.
b. You only have one base class right now, which is fine for the
features implemented so far, but I think at this stage we should
really go and replicate the entire class hierarchy of Eigen,
even if some subclasses are not really needed right now, just to
make it easier later to add more features.
4. General observations:
a. For all evaluators, you assume something like LinearAccessBit,
which is fine as long as one doesn't implement some kind of
generalized .block() for tensor and/or Maps with strides.
b. The HasBlend / pblend() stuff is probably also something that
could benefit the rest of Eigen, right? OTOH, the for loop over
the packet size seems potentially inefficient to me (at least on
CPUs) - can't you just do a packet() on m_condImpl and store that
in the selector?
c. w.r.t. writePacket() - you've kind of answered the question I
posed in the original posting of this thread, namely that it
is only called from inside Eigen and therefore doesn't need
variadic arguments.
d. Nitpicking: Why did you change #include <...> to #include "..."
for things that should be in the include path? The only things
in "..." should be references to src/... from within the main
include headers, at least if I follow the style of the rest of
Eigen.
5. API stuff:
a. "Tensor" class prefix. For the internal classes you obviously
went the TensorXXX route, i.e. TensorEvaluator, TensorCwiseXXXOp
etc. Since I raised this as a question earlier in this thread, I thought
about it some more and it's probably the easiest solution.
That said, I really don't like names like TensorMap<> etc. that
people will then use from the public API. Also, later on there's
probably going to be something like Ref<> etc. I think it would be
better to reuse the same names in the Tensor module, and I
think I have an idea how to do that without making the rest of
Eigen fall apart, but I have to try to see if it works first.
Also, w.r.t. GPU/Threading: TensorDevice seems a really weird
name for what it's doing. Probably best to create a module
unsupported/Eigen/Parallelization, name the class ComputeDevice
or something along those lines and make Tensor make use of that
module.
b. I don't like the API for convolutions and contractions. It took me
a while to understand what you are doing. (I think without the
tests I'd still not understand it.) OTOH, any better API is
going to be a *LOT* more complicated to implement and it's
probably better to have something that's useful instead of
nothing. Still, I don't like "burning" names such as "convolve"
or "contract" for this, because at some later point with a better
API that could come to bite us.
Perhaps we could rename them to basicContract() and
basicConvolve() so that the features are there, but later
implementations with a more usable API can still use the stuff
implemented so far.
-------------------------- </review> --------------------------
That all said, I think the best way to move forward is to merge your
changes sooner rather than later, otherwise it's going to be a huge
mess. I see two things that should definitely be handled from my
perspective before merging:
* change the name Eigen::array to Eigen::std_array or something
similar. The possibility of confusion w.r.t. Eigen::Array is too
great in my eyes.
* The GPU/std::async/future type threading that you implemented.
We should definitely discuss this with the Core developers of
Eigen before merging this. It's probably fine to first use the
Tensor module as a playground until it's mature enough to be
used in Core/, but I wouldn't want to merge something along these
lines unless there's some kind of consensus that this is the way
to go.
After those two things I think the next step would be to merge the rest
of your stuff as-is for the time being.
After that and some further discussion and coordination with you and any
other interested party (btw. my comments above are not set in stone,
they only reflect my personal opinion, which might change during a
discussion, just in case that wasn't clear ;-)), I'd want to go after
the things I mentioned here and improve upon the things you have right
now. Does this procedure sound agreeable?
Anyway, I've rambled on for far too long now. ;-)
Best Regards,
Christian