Re: [eigen] Indexes: why signed instead of unsigned?

On 5/15/10, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/5/15 leon zadorin <leonleon77@xxxxxxxxx>:
>> On 5/14/10, Manoj Rajagopalan <rmanoj@xxxxxxxxx> wrote:
>>>
>>>> There remains the question of signed vs. unsigned. In other words,
>>>> ptrdiff_t vs. size_t. I'm totally unable to decide either way. Help!
>>>>
>>>> Benoit
>>>>
>>>
>>> Would it be a bad idea to add the integer-type as a template parameter and
>>> let the user decide based on his/her "taste"?
>>
>>
>> I like this idea possibly the most.
>>
>> It allows for the most-customizable approach. In some of my progs, I
>> have similar mechanisms where the exact declaration of int resolution
>> and float resolution are being kept outside the logic of the
>> underlying, possibly library-level, mechanisms and algorithms.
>>
>> Whether such declarations are explicit template parameters as in "each
>> template parameter for each resolution" or whether there is a commonly
>> expected "traits types policy" bundled type which is plugged into the
>> vector/matrix/etc. as a single template arg, etc. etc. etc. is also
>> fine by me...
>
> We really don't have to add yet another template parameter, especially
> not to the dense Matrix and Array classes where this feature is almost
> useless (save up to a dozen bytes on a dynamically sized matrix,
> woohoo, and save zero bytes on fixed-size matrices).

OK, I see -- if memory savings are the only concern, then sure: a dozen
bytes is not a deal-breaker.

I was only looking at this from a point of view where if one is to
allow customization of *data* types for matrices et al (double, float,
mpfr types, etc.) either for numeric range capacity or numeric
efficiency in calculations...

... then a similar principle could be justified for specifying the
*metadata* types (subscript ops etc.) -- once again, not only for the
sake of the increased resolution, but also for the purposes of speed
such as better utilization of CPU cache-lines (esp. for the
frequently-used portions of the code).

For example, on a 64bit sys (freebsd-7.2 on x86_64):

sizeof(uint_fast32_t) is still 4 (i.e. != sizeof(uint_fast64_t))

thereby implying to me that if I am running a 64-bit platform and I
*don't* need more than ~4 billion elements in my matrices et al, then using
(u)int_fast32_t may be faster/more efficient (as per the vendor's
evaluation of 'fast') than the implied 64-bit variant...
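
A minimal sketch of how one might check this on a given platform. The
helper name fast32_is_narrower is made up for illustration, and the answer
is platform-dependent: it is true on the FreeBSD box above, but with glibc
on x86_64 Linux uint_fast32_t is 8 bytes wide, so it would be false there.

```cpp
#include <climits>
#include <cstdint>

// Guaranteed by the standard: a "fast" type is at least as wide as its name says.
static_assert(sizeof(std::uint_fast32_t) * CHAR_BIT >= 32, "uint_fast32_t too narrow");
static_assert(sizeof(std::uint_fast64_t) * CHAR_BIT >= 64, "uint_fast64_t too narrow");

// Hypothetical helper (not part of any library): true when the platform
// chooses a narrower type for uint_fast32_t than for uint_fast64_t.
constexpr bool fast32_is_narrower() {
    return sizeof(std::uint_fast32_t) < sizeof(std::uint_fast64_t);
}
```

Whether the narrower type is actually faster in a given loop still has to
be measured; the typedef only records the vendor's general preference.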

Having said this -- if Eigen does not do much frequent
referencing/counting/etc. w.r.t. its integral metadata, and the only
concern is the overall memory impact on RAM while the performance is
completely unaffected, then sure -- what you had stated earlier is fine.

> If and when someone needs this we can always make that part of the
> existing Options template parameter.
>
> For SparseMatrix, the situation may be different as it's a more useful
> feature; anyway, we're not yet close to offering API stability in the
> Sparse module (at the meeting the plan was to move it to unsupported)
> so I'll leave it to Gael to decide what he wants to do there!
>
> Benoit
>
>
>>
>> Keeping in mind that one may also want to have a single program which
>> wants to use two distinct instances of the eigen-related mechanisms:
>> one with large numeric range/resolution and another not; this approach
>> (i.e. template-based definition of int resolution) would also allow
>> for such a finer-level customization.
>>
>> kind regards
>> Leon.
>>
>>> A small, non-eigen, contrived
>>> example:
>>>
>>> template<typename T, typename I=int>
>>> class vector
>>> {
>>> public:
>>>     typedef I idx_type;
>>>     idx_type rows() const;
>>>     idx_type cols() const;
>>>     T const& operator [] (idx_type const& n) const;
>>>     // etc.
>>> };
>>>
>>> Instantiations like vector<double> will default to using int for index-type
>>> as Eigen has all along (for backward compatibility).
>>>
>>> Instantiations like vector<double, ptrdiff_t> will cover user-desired cases.
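
A minimal sketch of the two instantiations described above, filled in
enough to compile. The class name simple_vector and its std::vector-backed
storage are assumptions made for illustration only; this is not Eigen's API.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical container: the index type I is a template parameter that
// defaults to int, as in the contrived example above.
template <typename T, typename I = int>
class simple_vector {
public:
    typedef I idx_type;

    explicit simple_vector(idx_type n) : data_(static_cast<std::size_t>(n)) {}

    idx_type size() const { return static_cast<idx_type>(data_.size()); }

    T& operator[](idx_type n) { return data_[static_cast<std::size_t>(n)]; }
    T const& operator[](idx_type n) const { return data_[static_cast<std::size_t>(n)]; }

private:
    std::vector<T> data_;  // storage is always indexed with std::size_t internally
};

// The default keeps int; an explicit second argument overrides it.
static_assert(std::is_same<simple_vector<double>::idx_type, int>::value,
              "default index type is int");
static_assert(std::is_same<simple_vector<double, std::ptrdiff_t>::idx_type,
                           std::ptrdiff_t>::value,
              "index type can be overridden");
```

Existing code that writes simple_vector<double> is unaffected by the extra
parameter, which is the backward-compatibility argument made above.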
>>>
>>> Since Eigen is header-only and doesn't have to worry about
>>> library-binary-compatibility across platforms and versions, this change could
>>> be a one-size-fits-all solution (assuming there are no caveats that I have
>>> missed). Of course, it is a bigger headache for the library programmers :-)
>>> It will also be a bigger testing issue but these tests can be generated since
>>> templates are being used. The suite will just take longer to run.
>>>
>>> When writing loops with down-counters maybe some kind of static assertion or
>>> warning could be included if an unsigned type is used? This could be achieved
>>> with a traits struct.
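
One way the static assertion could look, as a sketch only. The names
index_traits and sum_reversed are invented here; nothing like this is
claimed to exist in Eigen. With an unsigned index, the condition i >= 0 is
always true and the loop never terminates, so the assertion rejects such
types at compile time.

```cpp
#include <type_traits>

// Hypothetical traits struct: records whether an index type can represent
// the value below zero that a down-counting loop relies on to stop.
template <typename I>
struct index_traits {
    static const bool can_count_down_past_zero = std::is_signed<I>::value;
};

// Sums v[n-1], ..., v[0] using a down-counter.
template <typename I, typename T>
T sum_reversed(const T* v, I n) {
    static_assert(index_traits<I>::can_count_down_past_zero,
                  "down-counting loop with 'i >= 0' requires a signed index type");
    T total = T();
    for (I i = n - 1; i >= 0; --i) {
        total += v[i];
    }
    return total;
}
```

Calling sum_reversed(v, 3u) with an unsigned count fails to compile with
the message above instead of looping forever at run time.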
>>>
>>> The documentation could warn users about the pitfalls of using unsigned types
>>> by consolidating this recent discussion.
>>>
>>> Someone raised a question about large indices. I had a friend in image
>>> processing who dealt with very large vectors, since in a raw image we have
>>> MxN pixels with RGBA channels for each pixel. So it might make sense to allow
>>> for large indices on machines that can support them. Also, we can imagine
>>> dealing with volumetric image data that resides on disk and is paged into RAM
>>> on-demand by a library like STXXL or Global-Arrays and might require large
>>> indices for "global" indexing.
>>>
>>> More generally, large indices can result from linearizations of
>>> multidimensional grids - my simulations involve 3D real-space and its related
>>> 3D reciprocal space and I sometimes work with distributions that are
>>> therefore 6-dimensional. Another example: state-spaces in quantum computing
>>> grow exponentially with number of qubits (tensor-product spaces of dim
>>> 2^{#bits}) and related simulations might quickly require large indices when
>>> the number of bits crosses 31.
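
To make the 31-bit threshold concrete, a small sketch. The function names
state_space_dim and fits_index_type are hypothetical, and the check assumes
the maximum of the index type I itself fits in std::int64_t (so it is not
meant for std::uint64_t).

```cpp
#include <cstdint>
#include <limits>

// Dimension of an n-qubit state space, 2^n, computed in 64 bits.
// Assumes 0 <= qubits < 63 so the shift cannot overflow.
inline std::int64_t state_space_dim(int qubits) {
    return std::int64_t(1) << qubits;
}

// True when a vector of n elements can have its size stored in index type I.
template <typename I>
bool fits_index_type(std::int64_t n) {
    return n <= static_cast<std::int64_t>(std::numeric_limits<I>::max());
}
```

With a 32-bit signed index, 30 qubits (2^30 elements) still fit, while 31
qubits (2^31 elements) exceed the maximum of 2^31 - 1; a 64-bit index
leaves ample headroom.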
>>>
>>> Just my 2 bits.
>>>
>>> Thanks,
>>> Manoj
>>>


