Re: [eigen] (General question) Floating point: why are 'inf' and 'nan' slow?


The list of x86 CPUs that don't have SSE2 (SSE is not enough for
double) includes Pentium 3, Athlon XP, VIA C3, AMD Geode, etc. There's
no way that we could neglect all of them performance-wise. Moreover,
even with SSE2, people may still want to use -mfpmath=387 (and it's
the default with GCC on 32-bit x86), in which case the non-vectorized
part of Eigen computations is affected.
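
A minimal sketch of how one might measure the effect, assuming a 32-bit
x86 build with GCC (for example g++ -O2 -m32 -mfpmath=387, then again
with -mfpmath=sse -msse2). The names kernel, seconds_for and
print_report are hypothetical helpers for illustration, not Eigen API,
and the loop is only a stand-in for the scalar part of a solver:

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <limits>

// Hypothetical kernel: a dependent chain of scalar multiply-adds.
// With a finite x the accumulator converges towards 2*x; with x = INF or
// NAN every iteration operates on a special value.
double kernel(double x, long iters) {
    assert(iters >= 0);
    volatile double acc = x;  // volatile keeps the compiler from folding the loop
    for (long i = 0; i < iters; ++i) {
        acc = acc * 0.5 + x;
    }
    return acc;
}

// Wall-clock time in seconds for one run of the kernel.
double seconds_for(double x, long iters) {
    const auto t0 = std::chrono::steady_clock::now();
    volatile double sink = kernel(x, iters);
    (void)sink;
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Prints one timing per class of input so the runs can be compared by eye.
void print_report(long iters) {
    const double inf = std::numeric_limits<double>::infinity();
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    std::printf("finite: %g s\n", seconds_for(1.0, iters));
    std::printf("inf:    %g s\n", seconds_for(inf, iters));
    std::printf("nan:    %g s\n", seconds_for(qnan, iters));
}
```

Calling print_report(100000000) from main under each set of flags gives
three timings to compare. The ratios depend on the CPU and on the
compiler, so no particular numbers are claimed here.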

It's not a corner case at all. I was wondering whether, when redesigning
the solvers, I could assume it is safe to produce INF and NAN during the
computation, not just as return values at the end. The answer is that
I can't.

Incidentally, looking at LAPACK, they don't do that either (in DGESVX
they even give up the computation when any pivot is exactly zero).
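
A minimal sketch of that "give up on an exactly zero pivot" policy,
assuming a small dense row-major matrix. solve_or_bail is a hypothetical
helper, not Eigen or LAPACK API; its return code only imitates the
spirit of LAPACK's INFO convention (0 on success, i > 0 when pivot i is
exactly zero):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, for illustration only.
// Solves A x = b in place by Gaussian elimination with partial pivoting.
// A is n x n, row-major; on success b holds x and 0 is returned.
// If the pivot in column k is exactly zero, returns k + 1 before any
// division by zero can create INF or NAN values.
int solve_or_bail(std::vector<double>& A, std::vector<double>& b) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);
    for (std::size_t k = 0; k < n; ++k) {
        // Partial pivoting: pick the largest-magnitude entry in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        if (A[p * n + k] == 0.0) return static_cast<int>(k) + 1;  // bail out
        if (p != k) {
            for (std::size_t j = 0; j < n; ++j)
                std::swap(A[k * n + j], A[p * n + j]);
            std::swap(b[k], b[p]);
        }
        // Eliminate column k from the rows below the pivot row.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = A[i * n + k] / A[k * n + k];
            for (std::size_t j = k; j < n; ++j) A[i * n + j] -= f * A[k * n + j];
            b[i] -= f * b[k];
        }
    }
    // Back substitution; every diagonal entry is nonzero at this point.
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= A[k * n + j] * b[j];
        b[k] = s / A[k * n + k];
    }
    return 0;
}
```

The caller checks the return code instead of scanning the result for
INF or NAN, so the slow special values never enter the inner loops. A
pivot that is tiny but nonzero is not caught by this check and can
still overflow.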

Benoit


2009/9/23 Rohit Garg <rpg.314@xxxxxxxxx>:
> On Wed, Sep 23, 2009 at 9:12 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> 2009/9/23 Rohit Garg <rpg.314@xxxxxxxxx>:
>>> On Wed, Sep 23, 2009 at 8:17 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>> Ah, passing:
>>>> -mfpmath=sse -msse2 -DEIGEN_DONT_VECTORIZE
>>>>
>>>> does fix the problem (I pass -DEIGEN_DONT_VECTORIZE because I already
>>>> knew that SIMD instructions like mulps avoid the problem; now I can
>>>> see that scalar instructions like mulss also avoid the
>>>> problem).
>>>>
>>>> So, the problem is a non-issue on SSE2-capable systems (SSE2, because
>>>> SSE doesn't support double).
>>>>
>>>> But what about non-SSE2-capable systems, or simply Linux distros that
>>>> need to build a generic i686 binary package... are they out of luck?
>>>>
>>>> The big design decision that I am facing now is this: floating point
>>>> numbers claim to be able to represent special values such as "inf" and
>>>> "nan"; ideally we would play this game, returning "inf" when that is
>>>> the natural result given the user's input; but if that must be 100x
>>>> slower than normal even before the user has any chance of checking if
>>>> that's happening, then in practice we can't do that and we need to
>>>> explicitly avoid generating "inf" and "nan" values even when that
>>>> would be the natural result given the user's input.
>>>
>>
>>> At any rate, Eigen needs
>>> to focus on the future, and that is x86-64.
>>
>> No, that doesn't work, for 2 separate reasons:
>>
>> Reason 1: What about all the embedded CPUs and low-power CPUs out
>> there? I'm sure that many of them have the same issues.
>>  -- if, as Jitse's link suggests, the problems are inherent in the
>> design of the x87, then all x87-compatible non-SSE-capable CPUs will
>> have the same problem. That's a lot of embedded CPUs.
>>  -- what about the Intel Atom... etc, etc.
>
> Atom has SSE, SSE2, SSE3, SSSE3. In embedded CPUs, the situation is
> much nicer actually. You know the hardware, so such problems can be
> tackled. The only CPUs which don't have any SSE are Pentium 2 and
> older. Are you sure you want to worry about machines running CPUs
> that old?
>>
>> Reason 2: Linux distros aren't going to drop support for i686 (some
>> even still support i586) anytime soon, we can't change that, and
>> that's all the more going to continue with the current trend of
>> "netbooks". Plus, it's legitimate to want to continue using old
>> machines.
>
> Netbooks have SSEx. See here.
>
> http://www.opensubscriber.com/message/discuss-gnuradio@xxxxxxx/11108339.html
>
> We cannot change the fact that 32-bit will be supported for a while,
> but inf, nan, and denormals are a corner case really, and this will
> make Eigen perhaps the only math project out there that fights the
> default behavior of CPUs. If you absolutely must, provide a
> compile-time flag (it should be opt-in, not opt-out), but please,
> please don't break numerics in a way that shocks the hell out of
> everyone and his brother.
>
> I'll go out on a limb and say that the majority of programmers don't
> know about these special fp numbers, so it REALLY is a corner case. It
> is not a biggie.
>
> If Eigen avoids generating them, the user will NEVER know if
> something went wrong with his algorithm/coding/data.
>>
>>> Those who use prepackaged software for 32 bit can still make 2
>>> codepaths, detecting the CPU at runtime. This is how EVERYBODY else
>>> does it, and it has worked out pretty well so far.
>>
>> ...and the x87 code path would be generated how? If we designed Eigen
>> without x87 in mind, making x87 as much as 850x slower than it should
>> be (Jitse's link), then that code path would have to be generated
>> using another library? That doesn't work!
>
> If you *expect* to run into these special numbers more than 1% of the
> time, then you don't need a separate library, you need a new
> algorithm, period. And for less than 1%, we don't need to break
> numerics like this.
>
>>
>>> No, we should return inf and nan wherever needed. The reason is that
>>> inf and nan usually signal errors in data/algorithm. Not returning
>>> them at all will be a BAD idea.
>>
>> Yes, if it's just a matter of returning them, why not. But my dilemma
>> starts in situations where INF may appear in the middle of
>> the computation and one would still have to do the rest of the
>> computation with INF values. That is not reasonable if INF is 850x
>> slower.
>>
>> Benoit
>>
>>
>>
>
>
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>
>
>


