Re: [eigen] Let's get reshape in shape for 3.4


Hi Gael,

Glad to see all the new APIs you're bringing in for the new year.

I actually prefer C, provided C is a superset of B - that is the way it works
in NumPy: the order is overridable in several places, but mostly things
follow the matrix types you are working with (which would be
expressions here).

I haven't thought through the details, but is there any reason
A.reshaped(4, n/2) couldn't treat the 4 as a compile-time value via constexpr
or something similar? I imagine that even if it could, you still want to
cover C++98, but I find fix<4> a fair bit ugly.
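
(I suppose the fundamental answer is that the value of an ordinary function
argument is never part of the type, so inside reshaped() a plain 4 is only a
runtime Index; only a wrapper like fix<4> can carry the value in the type. A
toy illustration, with a made-up fixed_size wrapper standing in for fix<>:)

  #include <cstddef>
  #include <cstdio>

  using Index = std::ptrdiff_t;

  // fix<N>-style wrapper: the extent lives in the type itself
  template<int N> struct fixed_size { static const int value = N; };

  // runtime extent: 'rows' is just a value, so the returned expression's
  // type could not encode it
  int reshaped(Index rows, Index cols) { return int(rows * cols); }

  // compile-time extent: N is usable as a template argument, so the result
  // type could carry the fixed dimension
  template<int N> int reshaped(fixed_size<N>, Index cols) { return N * int(cols); }

  int main() {
    Index n = 8;
    std::printf("%d %d\n", reshaped(4, n / 2), reshaped(fixed_size<4>(), n / 2));
  }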

As for the placeholder for the dimension to be solved - the MATLAB
convention is the empty matrix, and I welcome that notion (wrapped as a
type) - how about any of, with no priorities:
Null, Nil, Empty, Filled, DontCare, Placeholder, CalcSize (this and
the next are more explicit), or AutoSized


-Jason

On Thu, Jan 12, 2017 at 10:35 AM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
>
> Hi everyone,
>
> Just after generic indexing/slicing, another long-standing missing feature
> is reshape. So let's make it happen for 3.4.
>
> This is not the first time we have discussed it. There is an old bug report
> entry [1] and an old pull request with various discussions [2]. The Tensor
> module also supports reshape [3].
>
> However, the feature is still not there because we never converged on how to
> properly handle the ambiguity between column-major and row-major orders, also
> called Fortran-style versus C-style orders (e.g., in the numpy doc [4]).
>
> We have several options:
>
> A) Interpret the indices in column-major order only, regardless of the storage
> order.
>   - used in MatLab and Armadillo
>   - pros: simple strategy
>   - cons: not very friendly for row-major inputs (needs to transpose twice)
>
> B) Follow the storage order of the given expression
>   - used by the Tensor module
>   - pros: easiest implementation
>   - cons:
>      * the result depends on the storage order (one has to be careful in generic code)
>      * not all expressions have a natural storage order (e.g., a+a^T, a*b)
>      * needs a hard copy if, e.g., the user wants to stack the columns of a
> row-major input
>
> C) Give the user an option to choose the order among: ColMajor, RowMajor,
> and Auto
>   - used by numpy [4], with RowMajor (aka C-like order) as the default
>   - pros: gives full control to the user
>   - cons: the API is a bit more complicated
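>
> To make the column-major / row-major ambiguity concrete, here is a tiny
> self-contained illustration (plain C++, no Eigen) of how the two
> interpretations redistribute the entries of a 2x6 matrix into a 3x4 one: the
> k-th entry of the flattened source goes to the k-th slot of the flattened
> target, but "flattened" means column-by-column in one case and row-by-row in
> the other.
>
>   #include <cstdio>
>
>   int main() {
>     const int R = 2, C = 6;   // source size
>     const int r = 3, c = 4;   // target size (same number of entries)
>     int src[R][C];
>     for (int i = 0; i < R; ++i)
>       for (int j = 0; j < C; ++j)
>         src[i][j] = 10 * i + j;            // entry (i,j) reads as "ij"
>
>     int colmajor[r][c], rowmajor[r][c];
>     for (int k = 0; k < R * C; ++k) {
>       // column-major (Fortran/MatLab style): walk both matrices column by column
>       colmajor[k % r][k / r] = src[k % R][k / R];
>       // row-major (C/numpy default): walk both matrices row by row
>       rowmajor[k / c][k % c] = src[k / C][k % C];
>     }
>
>     for (int j = 0; j < c; ++j) std::printf("%2d ", colmajor[0][j]);
>     std::printf("  <- first row, column-major\n");
>     for (int j = 0; j < c; ++j) std::printf("%2d ", rowmajor[0][j]);
>     std::printf("  <- first row, row-major\n");
>   }
>
> The two first rows come out as "0 11 3 14" and "0 1 2 3" respectively, which
> is exactly the ambiguity we have to expose to the user.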
>
> At this stage, option C) seems to be the only reasonable one. However, we
> still have to specify how to pass this option at compile time, what Auto
> means, and what the default strategy is.
>
> Regarding 'Auto', it is similar to option (B) above. However, as I already
> mentioned, some expressions do not have any natural storage order. We could
> address this issue by limiting the use of 'Auto' to expressions for which
> the storage order is "strongly" defined, where "strong" could mean:
>  - any expression with the DirectAccessBit flag (meaning we are dealing
> with a Matrix, Map, sub-matrix, Ref, etc., but not with a generic expression);
>  - any expression with the LinearAccessBit flag, meaning the expression can
> be efficiently processed as a 1D vector.
>
> Any other situation would raise a static_assert.
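>
> For instance, the restriction could boil down to a flag check along these
> lines (the helper name and the exact condition are only illustrative, not a
> final implementation):
>
>   #include <Eigen/Core>
>
>   // Accept 'Auto' only when the expression has a strongly defined storage
>   // order, approximated here by DirectAccessBit (the LinearAccessBit case
>   // would be handled similarly).
>   template<typename Derived>
>   void reshape_auto_check(const Eigen::DenseBase<Derived>&)
>   {
>     static_assert(int(Eigen::internal::traits<Derived>::Flags) & Eigen::DirectAccessBit,
>                   "Auto order: expression has no well-defined storage order");
>   }
>
>   int main()
>   {
>     Eigen::MatrixXd a(2, 3);
>     reshape_auto_check(a);                    // OK: a Matrix has direct access
>     reshape_auto_check(a.block(0, 0, 2, 2));  // OK: so does a sub-matrix
>     // reshape_auto_check(a + a.transpose()); // no direct access: would trip the assert
>   }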
>
> But what if I really don't care and just want, e.g., a linear view with no
> constraint on the stacking order? Then we could add a fourth option meaning
> 'I don't care', perhaps 'AnyOrder'?
>
>
> For the default behavior, I would propose 'ColMajor', which is perhaps the
> most common and predictable choice given that the default storage order is
> column-major too.
>
>
> Then, for the API, nothing fancy (I use c++11 for brevity):
>
> template<typename RowsType=Index,typename ColType=Index,typename Order=Xxxx>
> DenseBase::reshaped(RowsType rows,ColType cols,Order = Order());
>
> with one variant to output a 1D array/vector:
>
> template<typename Order = Xxxx>
> DenseBase::reshaped(Order = Order());
>
> Note that I used "reshaped" with a "d" on purpose.
>
> The storage order of the resulting expression would match the optional
> order.
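>
> In practice, with some hypothetical 'RowOrder' tag (naming to be settled, see
> below), usage would look like:
>
>   MatrixXd A(4,6);
>   auto B = A.reshaped(2,12);              // default: column-major interpretation
>   auto C = A.reshaped(2,12,RowOrder());   // row-major interpretation
>   VectorXd v = A.reshaped();              // 1D copy, default order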
>
> Then, for the names of the options, we cannot use "RowMajor"/"ColMajor" because
> they are already defined as "static const int" and we need objects with
> different types here. Moreover, col-major/row-major does not extend well to
> multi-dimensional tensors. I also don't really like the reference to Fortran/C
> as in numpy. "Forward"/"Backward" are confusing too. Any ideas?
>
> The rows/cols parameters could also be a mix of compile-time & runtime
> values, like:
>
> A.reshaped(fix<4>,n/2);
>
> And maybe we could even allow a placeholder to automatically compute one of
> the dimensions to match the size of the given matrix. We cannot reuse "Auto"
> here because that would be too confusing:
>
> A.reshaped(5,Auto);
>
> Again, any ideas for a good placeholder name? (numpy uses -1 but we need a
> compile-time identifier)
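>
> Whatever the name, the deduced extent would simply be the total size divided
> by the known extent, with a divisibility check. Something along these lines
> (illustrative helper only):
>
>   #include <cassert>
>   #include <cstdio>
>
>   // deduce the missing dimension so that known_dim * result == total_size
>   long deduce_other_dim(long total_size, long known_dim)
>   {
>     assert(total_size % known_dim == 0 && "incompatible reshape sizes");
>     return total_size / known_dim;
>   }
>
>   int main()
>   {
>     // e.g., a 4x5 matrix (20 entries) reshaped with 5 rows gets 4 columns
>     std::printf("%ld\n", deduce_other_dim(20, 5));   // prints 4
>   }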
>
>
> cheers,
>
> gael
>
> [1] http://eigen.tuxfamily.org/bz/show_bug.cgi?id=437
> [2] https://bitbucket.org/eigen/eigen/pull-requests/41
> [3]
> https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md?fileviewer=file-view-default#markdown-header-operation-reshapeconst-dimensions-new_dims
> [4]
> https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.reshape.html


