Re: [eigen] Signed or unsigned indexing

*To*: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
*Subject*: Re: [eigen] Signed or unsigned indexing
*From*: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
*Date*: Fri, 20 Jan 2017 09:33:20 -0500

It's complicated. The use of signed indexing has a long history specifically in numerical linear algebra, see FORTRAN/BLAS/LAPACK. Also some large C++ shops such as Google have entirely turned away from unsigned indexing:

https://google.github.io/styleguide/cppguide.html#Integer_Types says:

> You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.

I only mention this to say, there are valid points on both sides of this argument. Neither choice will make much more than half of users happy. (I actually don't like the Google C++ style that much, and I wouldn't mind unsigned personally).
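To make the style guide's point concrete, here is a minimal sketch of the two approaches. The function names are hypothetical and exist only for this illustration. With a signed parameter a negative argument stays negative, so an assertion can report it; with an unsigned parameter the same argument is converted modulo 2^N before the function ever sees it.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Signed index (hypothetical helper): a negative argument is still negative
// inside the function, so the assertion can name the actual mistake.
inline double at_signed(const std::vector<double>& v, std::ptrdiff_t i) {
    assert(i >= 0 && "index must not be negative");
    assert(i < static_cast<std::ptrdiff_t>(v.size()) && "index out of range");
    return v[static_cast<std::size_t>(i)];
}

// Unsigned index (hypothetical helper): a single upper-bound check is enough,
// but a caller's -1 arrives here as a huge positive value.
inline bool in_range_unsigned(const std::vector<double>& v, std::size_t i) {
    return i < v.size();
}

// What -1 turns into when converted to std::size_t (well defined, modulo 2^N).
inline std::size_t wrapped_minus_one() {
    return static_cast<std::size_t>(-1);
}
```

Both versions reject the bad index. The difference is whether the information "the caller passed a negative number" survives long enough to be diagnosed.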

2017-01-20 8:25 GMT-05:00 Henrik Mannerström <henrik.mannerstrom@xxxxxxxxx>:

This issue seems to warrant a separate message thread.

I'd like to offer my two cents: As much as I like Eigen I think there is a strict ordering "c++ language" > "stl" > "Eigen". More than once have I developed something and only at some later point brought in Eigen. Code written in std::size_t fashion has then needed refactoring. So, if there were a vote, I'd vote for size_t indexing. I think smooth interoperability with stl is valuable.

Best,
Henrik

On Thu, Jan 19, 2017 at 10:58 PM, Márton Danóczy <marton78@xxxxxxxxx> wrote:

Hi all,

while I would not want to argue with Gael nor with the numerous C++ experts advocating signed integers, today's reality is different. Some libraries, the most prominent being the standard library, use unsigned integers as indices. Therefore, mixing signed and unsigned types is sadly unavoidable when using Eigen, which is a major annoyance (at least for me, working with -pedantic -Werror).

In my opinion, using Eigen with -DEIGEN_DEFAULT_DENSE_INDEX_TYPE=size_t should just work, regardless of the developers' personal preference for signed integers.

Best,
Marton

On 19 January 2017 at 13:00, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

On Thu, Jan 19, 2017 at 11:31 AM, Andrew Fitzgibbon <awf@xxxxxxxxxxxxx> wrote:

I wonder if a rethink of reshape could allow a move to unsigned index types, assuming I understand correctly that Dynamic would be of another type. It’s always been a bit clunky getting “size_t-correctness” right for mixed Eigen/STL code, and compilers complain increasingly nowadays. Perhaps now might be a time to give it a try?

Maybe one day we'll get a "fixed" std-v2 that would be more compatible with libraries that made the right choice of using signed types.

For sparse matrices, I agree that we might try to allow for unsigned types as the StorageIndex type. This should be doable while keeping signed 64-bit integers for the API (rows, cols, nonZeros, etc.).

We might also think about solutions to ease the mix of Eigen/STL code...

gael
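One way to ease mixed Eigen/STL code today, without changing either library, is a pair of checked conversion helpers at the boundary. The sketch below is only an illustration: `Index` is a local stand-in assumed to match Eigen's default index type (std::ptrdiff_t), and `to_index`, `to_size` and `sum_first` are hypothetical names, not part of Eigen or the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: stand-in for the signed index type used on the Eigen side.
using Index = std::ptrdiff_t;

// size_t -> Index, asserting that the value fits (hypothetical helper).
inline Index to_index(std::size_t n) {
    assert(n <= static_cast<std::size_t>(std::numeric_limits<Index>::max()));
    return static_cast<Index>(n);
}

// Index -> size_t, asserting that the value is not negative (hypothetical helper).
inline std::size_t to_size(Index i) {
    assert(i >= 0);
    return static_cast<std::size_t>(i);
}

// Example of a loop that keeps signed arithmetic but indexes an STL container
// without implicit signed/unsigned conversions.
inline double sum_first(const std::vector<double>& v, Index count) {
    assert(count >= 0 && count <= to_index(v.size()));
    double s = 0.0;
    for (Index i = 0; i < count; ++i) {
        s += v[to_size(i)];
    }
    return s;
}
```

The explicit casts live in one place, so code compiled with -Wsign-compare or -Wsign-conversion stays quiet and every conversion is checked in debug builds. C++20's std::ssize covers the size_t-to-signed direction for containers.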

I see the “downcounting” argument at https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/03/msg00099.html but that appears fairly strongly to be a special case where one would anyway want to benchmark, check sizes etc.

Finally, I think we are in a world where sparse arrays with entries in the 2-4 billion range are reasonably common, and one could conceivably be pleased to get the extra bit back…
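For readers who want the two points above spelled out, here is a small sketch. It assumes the "downcounting" argument refers to loops that count an index down to zero (the linked message is not reproduced here), and the function names are hypothetical. The constants show what the extra bit buys for a 32-bit storage index.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// The extra bit: an unsigned 32-bit index reaches roughly 4 billion,
// a signed one roughly 2 billion.
constexpr std::int64_t kMaxSigned32 = std::numeric_limits<std::int32_t>::max();
constexpr std::int64_t kMaxUnsigned32 = std::numeric_limits<std::uint32_t>::max();
static_assert(kMaxUnsigned32 == 2 * kMaxSigned32 + 1,
              "one more bit doubles the index range");

// Counting down with a signed index: the natural condition `i >= 0` works.
inline std::vector<int> reversed_signed(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::ptrdiff_t i = static_cast<std::ptrdiff_t>(v.size()) - 1; i >= 0; --i) {
        out.push_back(v[static_cast<std::size_t>(i)]);
    }
    return out;
}

// Counting down with an unsigned index: `i >= 0` would always be true,
// so the test has to happen before the decrement.
inline std::vector<int> reversed_unsigned(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::size_t i = v.size(); i-- > 0;) {
        out.push_back(v[i]);
    }
    return out;
}
```

Both loops produce the same result, including for an empty vector; the unsigned version just needs the less obvious loop form.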

Thanks again for a great library!

A.

Dr Andrew Fitzgibbon FREng FBCS FIAPR

Partner Scientist

Microsoft HoloLens, Cambridge, UK

From: Gael Guennebaud [mailto:gael.guennebaud@gmail.com]
Sent: 13 January 2017 12:26
To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [eigen] Let's get reshape in shape for 3.4

On Fri, Jan 13, 2017 at 6:14 AM, Jason Newton <nevion@xxxxxxxxx> wrote:

Also, regarding the RowMajor/ColMajor int/type issue - perhaps stuff them in a new namespace or class - storage? Too bad StorageOrder is already used in so many places. Honestly I'm all for you making them types and things working uniformly from there. I have used them myself as integers with the flags bitset, but only for enable_if logic which would be rendered obsolete if you had a collection of C++11 inspired type traits (instead they get repeated on the web a few places). Sorry if I'm not being very detailed, it's been a while since I've needed these, but my point is that it was basically a flaw to use them as ints in the first place, in user code - and so I encourage you to change things so it all works fluidly in the new API without fear of upsetting users. Although perhaps that is a daunting task...

I think you are mixing Eigen::RowMajor with Eigen::RowMajorBit. I agree that the bit flags could be managed differently using individual type traits, but regarding Eigen::RowMajor, it is currently used as a template parameter to Matrix, Array, SparseMatrix, etc.:

Matrix<...., RowMajor|DontAlign>

which is pretty convenient to write compared to having to subclass some default_matrix_traits class to customize the options. With RowMajor|DontAlign, RowMajor could still be an instance of an integral_constant-like type with operator | overloaded.... Actually I've started to think about such an approach for Eigen::Dynamic, so that one can write:

M = N*2+1

and get M==Dynamic if N==Dynamic. Currently we always have to write: M = N==Dynamic ? Dynamic : 2*N+1 which is error prone because it's easy to forget about checking for Dynamic, especially when combining multiple compile-time identifiers.
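A minimal sketch of how such a Dynamic-propagating size could look, using an integral_constant-like wrapper with overloaded operators. Everything here is hypothetical: `Size`, `TwoNPlusOne` and the local `Dynamic` constant are defined only for this illustration (the value -1 mirrors Eigen::Dynamic) and are not Eigen's API.

```cpp
// Local sentinel for "size known only at run time" (mirrors Eigen::Dynamic == -1).
constexpr int Dynamic = -1;

// Hypothetical integral_constant-like wrapper around a compile-time size.
template <int N>
struct Size {
    static constexpr int value = N;
};

// Arithmetic on sizes: if either operand is Dynamic, so is the result.
template <int A, int B>
constexpr Size<((A == Dynamic || B == Dynamic) ? Dynamic : A * B)>
operator*(Size<A>, Size<B>) {
    return {};
}

template <int A, int B>
constexpr Size<((A == Dynamic || B == Dynamic) ? Dynamic : A + B)>
operator+(Size<A>, Size<B>) {
    return {};
}

// M = N*2+1 written once, with no explicit check for Dynamic.
template <int N>
using TwoNPlusOne = decltype(Size<N>{} * Size<2>{} + Size<1>{});

static_assert(TwoNPlusOne<3>::value == 7, "fixed sizes are computed");
static_assert(TwoNPlusOne<Dynamic>::value == Dynamic, "Dynamic propagates");
```

The check for Dynamic is written once inside each operator, so combining several compile-time identifiers cannot forget it.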

gael

-Jason

On Thu, Jan 12, 2017 at 11:56 PM, Jason Newton <nevion@xxxxxxxxx> wrote:

> Hi Gael,

>

> Glad to see all the new api's you're moving in for the new year.

>

> I actually prefer C if C is a superset of B - that is the way it works

> in NumPy - order is overridable in several places, but mainly things

> follow the matrix types you are working with (which would be

> expressions here).

>

> I haven't thought about the details but is there any reason

> A.reshaped(4, n/2) couldn't work via constexprs or something on the 4? I

> imagine even if it did you're trying to cover for C++98 though, but I

> think fix<4> is a fair bit ugly.

>

> As for the placeholder for a solvable dimension - the matlab

> convention is the empty matrix and I welcome that notion (wrapped as a

> type) - how about any of, with no priorities:

> Null, Nil, Empty, Filled, DontCare, Placeholder, CalcSize (this and

> the next are more explicit), or AutoSized

>

>

> -Jason

>

> On Thu, Jan 12, 2017 at 10:35 AM, Gael Guennebaud

> <gael.guennebaud@xxxxxxxxx> wrote:

>>

>> Hi everyone,

>>

>> just after generic indexing/slicing, another long standing missing feature

>> is reshape. So let's make it for 3.4.

>>

>> This is not the first time we discuss it. There is an old bug report entry

>> [1] and an old pull request with various discussions [2]. The Tensor module

>> also supports reshape [3].

>>

>> However, the feature is still not there because we never converged about how

>> to properly handle the ambiguity between col-major / row-major orders, also

>> called Fortran versus C style orders (e.g., in numpy doc [4]).

>>

>> We have several options:

>>

>> A) Interpret the indices in column major only, regardless of the storage

>> order.

>> - used in MATLAB and Armadillo

>> - pros: simple strategy

>> - cons: not very friendly for row-major inputs (needs to transpose twice)

>>

>> B) Follows the storage order of the given expression

>> - used by the Tensor module

>> - pros: easiest implementation

>> - cons:

>> * result depends on storage order (need to be careful in generic code)

>> * not all expressions have a natural storage order (e.g., a+a^T, a*b)

>> * needs a hard copy if, e.g., the user wants to stack columns of a

>> row-major input

>>

>> C) Give the user an option to decide which order to use between: ColMajor,

>> RowMajor, Auto

>> - used by numpy [4] with default to RowMajor (aka C-like order)

>> - pros: give full control to the user

>> - cons: the API is a bit more complicated

>>

>> At this stage, option C) seems to be the only reasonable one. However, we

>> have yet to specify how to pass this option at compile-time, what Auto

>> means, and what is the default strategy.
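To illustrate why the order matters, here is a small self-contained sketch of a copying reshape with an explicit order parameter, in the spirit of option C. It is not Eigen code: `Mat`, `Order` and this free function `reshaped` are hypothetical and defined only for the example, and the matrix is deliberately stored row-major to show that the interpretation order is independent of the storage order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Index = std::ptrdiff_t;

// Hypothetical minimal matrix; element (i, j) lives at data[i * cols + j].
struct Mat {
    Index rows;
    Index cols;
    std::vector<int> data;
    int operator()(Index i, Index j) const {
        return data[static_cast<std::size_t>(i * cols + j)];
    }
};

// Order in which elements are enumerated, regardless of how they are stored.
enum class Order { ColMajor, RowMajor };

// Copying reshape: the k-th element of the input (in the chosen order)
// becomes the k-th element of the output (in the same order).
inline Mat reshaped(const Mat& a, Index rows, Index cols, Order order) {
    assert(rows * cols == a.rows * a.cols);
    Mat out{rows, cols, std::vector<int>(static_cast<std::size_t>(rows * cols))};
    for (Index k = 0; k < rows * cols; ++k) {
        Index si, sj, di, dj;
        if (order == Order::ColMajor) {
            si = k % a.rows; sj = k / a.rows;  // walk input column by column
            di = k % rows;   dj = k / rows;    // fill output column by column
        } else {
            si = k / a.cols; sj = k % a.cols;  // walk input row by row
            di = k / cols;   dj = k % cols;    // fill output row by row
        }
        out.data[static_cast<std::size_t>(di * cols + dj)] = a(si, sj);
    }
    return out;
}
```

For the 2x3 input with rows (1 2 3) and (4 5 6), reshaping to 3x2 gives rows (1 5), (4 3), (2 6) under ColMajor and rows (1 2), (3 4), (5 6) under RowMajor, so the same call produces different matrices depending on the chosen order.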

>>

>> Regarding 'Auto', it is similar to option (B) above. However, as I already

>> mentioned, some expressions do not have any natural storage order. We could

>> address this issue by limiting the use of 'Auto' to expressions for which

>> the storage order is "strongly" defined, where "strong" could mean:

>> - Any expression with the DirectAccessBit flag (it means we are dealing

>> with a Matrix, Map, sub-matrix, Ref, etc. but not with a generic expression)

>> - Any expression with the LinearAccessBit flag: it means the expression can

>> be efficiently processed as a 1D vector.

>>

>> Any other situation would raise a static_assert.

>>

>> But what if I really don't care and just want to, e.g., get a linear view

>> with no constraints of the stacking order? Then we could add a fourth option

>> meaning 'IDontCare', perhaps 'AnyOrder' ?

>>

>>

>> For the default behavior, I would propose 'ColMajor' which is perhaps the

>> most common and predictable choice given that the default storage is column

>> major too.

>>

>>

>> Then, for the API, nothing fancy (I use c++11 for brevity):

>>

>> template<typename RowsType=Index,typename ColType=Index,typename Order=Xxxx>

>> DenseBase::reshaped(RowsType rows,ColType cols,Order = Order());

>>

>> with one variant to output a 1D array/vector:

>>

>> template<typename Order= Xxxx >

>> DenseBase::reshaped(Order = Order());

>>

>> Note that I used "reshaped" with a "d" on purpose.

>>

>> The storage order of the resulting expression would match the optional

>> order.

>>

>> Then for the name of the options we cannot use "RowMajor"/"ColMajor" because

>> they already are defined as "static const int" and we need objects with

>> different types here. Moreover, col-major/row-major does not extend well to

>> multi-dimension tensors. I also don't really like the reference to Fortran/C

>> as in numpy. "Forward"/"Backward" are confusing too. Any ideas?

>>

>> The rows/cols parameters could also be a mix of compile-time & runtime

>> values, like:

>>

>> A.reshaped(fix<4>,n/2);

>>

>> And maybe we could even allow a placeholder to automatically compute one of

>> the dimensions to match the given matrix size. We cannot reuse "Auto" here

>> because that would be too confusing:

>>

>> A.reshaped(5,Auto);

>>

>> Again, any ideas for a good placeholder name? (numpy uses -1 but we need a

>> compile-time identifier)
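A sketch of how a compile-time placeholder for the "solvable" dimension could be dispatched through overloads. The tag type, the name `AutoSize` (just one candidate picked for illustration) and the function `resolve_shape` are all hypothetical and not taken from Eigen; the point is only that a distinct type, unlike numpy's -1, lets the compiler select the right overload.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

using Index = std::ptrdiff_t;

// Hypothetical placeholder: a distinct empty type, so overload resolution
// (not a run-time sentinel value) decides which dimension is deduced.
struct AutoSizeTag {};
constexpr AutoSizeTag AutoSize{};

// Both dimensions given: only check consistency with the total size.
inline std::pair<Index, Index> resolve_shape(Index total, Index rows, Index cols) {
    assert(rows * cols == total);
    return {rows, cols};
}

// Columns deduced from the total size and the given number of rows.
inline std::pair<Index, Index> resolve_shape(Index total, Index rows, AutoSizeTag) {
    assert(rows > 0 && total % rows == 0);
    return {rows, total / rows};
}

// Rows deduced from the total size and the given number of columns.
inline std::pair<Index, Index> resolve_shape(Index total, AutoSizeTag, Index cols) {
    assert(cols > 0 && total % cols == 0);
    return {total / cols, cols};
}
```

With a 20-element input, `resolve_shape(20, 5, AutoSize)` yields the shape (5, 4) and `resolve_shape(20, AutoSize, 2)` yields (10, 2). Passing the placeholder for both dimensions does not compile because no overload matches, which is the kind of error a plain -1 cannot give.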

>>

>>

>> cheers,

>>

>> gael

>>

>> [1] http://eigen.tuxfamily.org/bz/show_bug.cgi?id=437

>> [2] https://bitbucket.org/eigen/eigen/pull-requests/41

>> [3]

>> https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md?fileviewer=file-view-default#markdown-header-operation-reshapeconst-dimensions-new_dims

>> [4]

>> https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.reshape.html
