Re: [eigen] Overflow in sum()

Well, I remember now: in our case we actually avoid the overflow issue by quantizing our feature vector elements to only 4 bits(!), so we can add many of them together as uint8s without fear. One use case we have is taking the average of about 50 176-dimensional uint8 vectors, with a uint8 vector as the result. With uint8 SIMD specializations, Eigen might already match our hand-written code for this; I would be very interested to see.
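
For concreteness, here is roughly how that averaging step could look with the cast() approach (just a sketch, assuming the ~50 vectors are stored as the columns of a fixed-row-size matrix; average() is my own helper name, nothing from Eigen):

#include <Eigen/Core>
#include <cstdint>

// Average ~50 uint8 feature vectors (stored as the columns of 'vecs'),
// accumulating in int to avoid uint8 overflow, then casting back down.
Eigen::Matrix<std::uint8_t, 176, 1>
average(const Eigen::Matrix<std::uint8_t, 176, Eigen::Dynamic>& vecs)
{
    Eigen::Matrix<int, 176, 1> sums = vecs.cast<int>().rowwise().sum(); // widen first
    return (sums / int(vecs.cols())).cast<std::uint8_t>();              // integer average
}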

We also need a very efficient L1 distance between two such vectors, with uint as the accumulation type. Is there no MatrixBase::abs() function? It would be cool to write something like

uint distance = (v1 - v2).abs().sum(); // maybe insert cast<uint>() somewhere

and have it compile to use _mm_sad_epu8 and _mm_add_epi16 to compute the sum of absolute differences. This seems like it requires some tricky expression-template specialization. I can send example code, but this is getting a little off-topic; maybe it should be a different thread.
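
For reference, here is a minimal sketch of the kind of hand-written SSE2 loop I mean (assuming both vectors are 176 bytes long and 16-byte aligned; I accumulate with _mm_add_epi64 here, which is the safe choice in general, even though for 176 elements a 16-bit add would also work):

#include <emmintrin.h>
#include <cstdint>

// L1 distance between two 176-element uint8 vectors using the SSE2
// sum-of-absolute-differences instruction.
unsigned l1_distance_176(const std::uint8_t* a, const std::uint8_t* b)
{
    __m128i acc = _mm_setzero_si128();
    for (int i = 0; i < 176; i += 16) {
        __m128i va = _mm_load_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_load_si128(reinterpret_cast<const __m128i*>(b + i));
        // _mm_sad_epu8 gives two partial sums of |a - b|, one per 64-bit half.
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }
    acc = _mm_add_epi64(acc, _mm_srli_si128(acc, 8));      // fold the two halves together
    return static_cast<unsigned>(_mm_cvtsi128_si32(acc));  // total is at most 176*255, fits easily
}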

It would be nice to have a transparent way of controlling the accumulation type and policy, like Benoit's proposal, but maybe it's not sufficiently clear yet what the typical use cases are... If my understanding is correct, the cast() approach should be adequate for my purposes. How does cast() influence SIMD optimizations? And there should probably be a prominent warning to only use scalar types like uint8 if you know what you're doing :)
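
And just to check I understand the NumTraits idea discussed below, here is how I picture it (purely a sketch: the AccumType and ei_saturate names are taken from Benoit's mail, while the wrapper type and the MyNumTraits stand-in are hypothetical; none of this is existing Eigen API):

#include <cstdint>
#include <algorithm>

// Hypothetical scalar type: a wrapper around uint8 whose traits declare a
// wider accumulation type and a saturating conversion back to 8 bits.
struct saturating_uint8
{
    std::uint8_t value;
};

template <typename T> struct MyNumTraits;   // stand-in for Eigen's NumTraits

template <> struct MyNumTraits<saturating_uint8>
{
    typedef std::uint32_t AccumType;        // sums would be carried out in uint32
};

// Saturate the accumulated value to 255 instead of letting it wrap around.
inline saturating_uint8 ei_saturate(std::uint32_t accumulated)
{
    saturating_uint8 r;
    r.value = static_cast<std::uint8_t>(std::min<std::uint32_t>(accumulated, 255u));
    return r;
}

If sum() then accumulated in AccumType and called ei_saturate() at the end, that would match what I need.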

Cheers,
Patrick

On Thu, Apr 2, 2009 at 7:04 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
The idea was also to use it in matrix products, dot products, etc.
But then I have doubts: if all these operations really overflow, then
maybe the user really needs to use a bigger scalar type in the first
place. There isn't much point in using uint8 if you need to cast
every time you do something nontrivial... I'd need to see how this is
used in practice.

Benoit

2009/4/2 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> I have to say I'm not sure I fully understand your proposal. Where
> would you use the additional scalar type and ei_saturate? Only in
> sum()? To tweak the return type of operator+ and so on?
>
>
> On Thu, Apr 2, 2009 at 1:56 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> I agree that for Jens's use case the cast() approach works well; now
>> what do the dense-matrix-of-uint8 guys think? If they need to cast all
>> the time, then I guess this isn't convenient. Putting that into
>> NumTraits would have the advantage of being very extensible without making
>> the rest of Eigen significantly more complex. But we'll do it only if
>> people need it...
>>
>> Benoit
>>
>> 2009/4/2 Jens Mueller <jens.k.mueller@xxxxxx>:
>>> Oh yes. You are right. For me cast<>() is best. And even for the others,
>>> I guess. Just cast the values such that there will be no overflows.
>>>
>>> Regards,
>>> Jens
>>>
>>> Gael Guennebaud wrote:
>>>> what about:
>>>>
>>>> mat.cast<uint>().sum()
>>>>
>>>> (this already works without creating any temporary matrix)
>>>>
>>>> gael.
>>>>
>>>> On Thu, Apr 2, 2009 at 11:03 AM, Jens Mueller <jens.k.mueller@xxxxxx> wrote:
>>>> > The first proposed plugin solution is acceptable for me, since this
>>>> > plugin feature is neat anyway.
>>>> > If this overflow thing really matters for users, then extending
>>>> > NumTraits is reasonable, since people with custom matrices write their
>>>> > own NumTraits declarations anyway. So at the moment there are three
>>>> > users bothered by this problem. Is this considered enough?
>>>> >
>>>> > Regards,
>>>> > Jens
>>>> >
>>>> > Benoit Jacob wrote:
>>>> >> Thanks for your input; here's a proposal.
>>>> >>
>>>> >> First of all, what I really don't want is to add another template
>>>> >> parameter to Matrix, or to add more enums/typedefs in the xpr classes. So
>>>> >> I'm trying to put all the stuff related to this issue in the numeric
>>>> >> type itself. This way, only people who use that feature pay for it.
>>>> >>
>>>> >> So here's the proposal: in NumTraits<T> we just add another typedef
>>>> >> for the accumulation type, and we define a function ei_saturate<T>(T t)
>>>> >> for all possible accumulation types T. For all basic types
>>>> >> (including uint8) we define the accumulation type to be just T, and
>>>> >> ei_saturate(t) returns just t, so the default for integer types is the
>>>> >> clipping behavior. But then we can define (or let the user define)
>>>> >> more types, like a type that is a wrapper around uint8 with the
>>>> >> accumulation type set to uint32 and ei_saturate(t) actually saturating
>>>> >> to 255, etc. The user can then use a matrix with that type as the scalar
>>>> >> type. If for a different matrix he wants a different accumulation type, he
>>>> >> just uses a different scalar type with a different type as
>>>> >> NumTraits<T>::AccumType. In fact, it's even OK to let the AccumType be
>>>> >> a template parameter of the wrapper-around-uint8 scalar type.
>>>> >>
>>>> >> Cheers,
>>>> >> Benoit
>>>> >>
>>>> >> 2009/4/1 Christian Mayer <mail@xxxxxxxxxxxxxxxxx>:
>>>> >> >
>>>> >> > Benoit Jacob schrieb:
>>>> >> >> ah that's yet another different problem...
>>>> >> >>
>>>> >> >> with sparse matrices, the overflow problem arises only with a few
>>>> >> >> operations like sum().
>>>> >> >>
>>>> >> >> But with what you say, dense matrices of uint8, the overflow can
>>>> >> >> happen with a lot more operations: for example, any matrix product
>>>> >> >> will typically overflow! So I guess that a matrix product of uint8
>>>> >> >> matrices should return a matrix with a bigger scalar type...
>>>> >> >
>>>> >> > I'm working in the area of embedded software development, where huge
>>>> >> > amounts of numerical software have to be dealt with in real time. As some
>>>> >> > of that code is quite old, it's mostly written in fixed point, i.e. it
>>>> >> > uses normal integer operations (the variables usually have to be
>>>> >> > multiplied by a factor and have an offset added to get the "real
>>>> >> > world meaning" of each variable).
>>>> >> >
>>>> >> > In this context, lots of programmers are looking out for data types and
>>>> >> > possible overflows just to handle that problem, as each multiplication
>>>> >> > and addition could cause one.
>>>> >> >
>>>> >> > And there are usually two solutions: saturate or clip (i.e. just use the
>>>> >> > modulo value).
>>>> >> >
>>>> >> > => that's too far for Eigen to handle
>>>> >> >
>>>> >> > But it might be a nice solution to offer (as a template parameter) a
>>>> >> > different data type for the result, so that the Eigen user can handle this
>>>> >> > requirement, wherever it could appear, in the way that is best for his
>>>> >> > algorithm. (And if possible we could also offer a saturating and a
>>>> >> > clipping version of the algorithm.)
>>>> >> >
>>>> >> > CU,
>>>> >> > Chris
>>>> >> >