Re: [AD] ftofix optimisation



>The IEEE floating point encoding is pretty standard these days, and this 
>method will work regardless of endianness, so in practice this code is 
>probably very portable. Certainly it's ok for all DOS and Windows versions, 
>but what about Unix? We need some way to check for the specific float 
>encoding in use, which I can't see any mention of in the standard autoconf 
>macros, and would have no idea how to tackle myself. Any ideas, anyone?

In my project, every system that uses IEEE floats has a #define
FLOAT_IEEE_754. The optimised ftofix is then done within an #ifdef
FLOAT_IEEE_754, the unoptimised i
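
For the Unix question, one possible compile-time test is to look at the
parameters in <float.h>. This is only a sketch, not something taken from
the Allegro sources: FLOAT_IEEE_754 is the name mentioned above,
have_ieee_754() is a hypothetical helper, and the test checks only the
format parameters of double, not how the bytes are laid out in memory.

```c
#include <float.h>

/* Define FLOAT_IEEE_754 when double looks like an IEEE 754 binary64.
 * __STDC_IEC_559__ is set by C99 compilers that claim Annex F support;
 * the <float.h> comparison is a fallback for compilers that do not. */
#if defined(__STDC_IEC_559__) || \
    (FLT_RADIX == 2 && DBL_MANT_DIG == 53 && \
     DBL_MAX_EXP == 1024 && DBL_MIN_EXP == -1021)
#define FLOAT_IEEE_754 1
#endif

/* Hypothetical helper: 1 if the optimised path was selected, else 0. */
int have_ieee_754(void)
{
#ifdef FLOAT_IEEE_754
    return 1;
#else
    return 0;
#endif
}
```

A configure script could compile and run a small program like this instead
of relying on a hand-maintained list of platforms.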

>The ftofix() function is prototyped as taking a double, so I think that's 
>probably a good idea (classic C always promotes float arguments to double 
>before passing them on the stack in any case). What sort of performance 
>implications would this change have? You could always just copy the double 
>parameter to a local float before munging it into a fixed,

This conversion would require about as much work as the extra int code
required to handle doubles (about 5 clock cycles depending on architecture).

>but that would 
>lose some precision, which I think can make a difference: for large values, 
>a 16.16 fixed encoding actually has more significant bits than a 23 bit 
>floating point mantissa (not totally sure about that one, though: am I right 
>here?).

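Everything from 256.0 upwards can lose precision if float is used: a float
has 24 significant bits (23 stored plus an implicit leading one), so that is
where its spacing becomes coarser than the 1/65536 step of 16.16 fixed.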
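
A rough way to see where the loss starts on a given machine is to push a
raw 16.16 value through a float and compare the result with the original.
The helper name fixed_survives_float() is hypothetical, and the sketch
assumes an IEEE 754 single-precision float.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the raw 16.16 value survives a trip
 * through float unchanged, 0 if low bits were rounded away. */
int fixed_survives_float(int32_t raw)
{
    /* volatile forces a real float store, so extra x87 precision
     * cannot hide the rounding */
    volatile float f = (float)raw / 65536.0f;

    /* double holds every 32-bit integer exactly, so compare there */
    return (double)f * 65536.0 == (double)raw;
}
```

Calling it with an integer part shifted left by 16 plus 1 (the smallest
fractional step) shows whether that step is still representable at that
magnitude.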

>But to access the full contents of a double would require using 
>either 64 bit integer types (not portable), or pointer arithmetic to read 
>two separate 32 bit values (which raises endianness issues). 

The endianness issue would easily be solved with #ifdefs:

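/* f is the double; a receives its high 32 bits, b its low 32 bits */
#ifdef BIG_ENDIAN
a = ((int *)&f)[0];   /* high word is stored first */
b = ((int *)&f)[1];
#else
a = ((int *)&f)[1];   /* high word is stored second */
b = ((int *)&f)[0];
#endif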
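
Where a 64-bit integer type is available, the same split can be sketched
without the #ifdef. This assumes a C99 compiler with <stdint.h>, a 64-bit
IEEE 754 double, and that doubles and 64-bit integers use the same byte
order; split_double() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split a double into its high and low 32-bit words. */
void split_double(double f, uint32_t *hi, uint32_t *lo)
{
    uint64_t bits;

    /* memcpy copies the raw encoding without casting the pointer */
    memcpy(&bits, &f, sizeof bits);

    *hi = (uint32_t)(bits >> 32);          /* sign, exponent, top 20 mantissa bits */
    *lo = (uint32_t)(bits & 0xFFFFFFFFu);  /* low 32 mantissa bits */
}
```

Because the shift works on the integer value rather than on memory, the
result does not depend on which word is stored first.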

>So I'm not sure 
>about this one: I do think it would be a problem to switch to any method 
>less accurate than the present one, but is it possible to make a double 
>version of this work efficiently enough to be a gain?

I have found out why the original version is too slow: it's because of the
x87's integer rounding (FISTP). By default, the FPU rounds the float to the
closest int, whereas ANSI C requires a float-to-int cast to truncate towards
zero. So the compiler has to work around this by overriding the default
rounding mode for every conversion. This probably takes up more than 90% of
the time. But we do not want the ANSI behaviour, we want the default FPU
one, so we then do some extra work (perhaps 5% of the time) for manual
rounding. If my guesses are correct, we could eliminate 95% of the work in
the current routine either by making the compiler understand that it does
not have to change the FPU's rounding mode (I have tried this but not
succeeded yet), or by writing an assembler routine. The latter requires gcc
and i386, but so does the former. What's your preference?
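
The idea of relying on the FPU's default round-to-nearest mode can also be
sketched in plain C, without a float-to-int cast. This is an illustration
only, not the routine discussed in this thread. It assumes IEEE 754
doubles, arithmetic carried out in double precision (extra x87 precision
can change borderline cases), and the same byte order for doubles and
64-bit integers. The name ftofix_magic and the clamp limits are
assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch, not Allegro's ftofix(). */
int32_t ftofix_magic(double x)
{
    /* 2^52 + 2^51: adding this pushes the integer part of the value into
     * the low mantissa bits, rounded by the FPU's current rounding mode. */
    const double magic = 6755399441055744.0;
    double d;
    uint64_t bits;
    uint32_t low;
    int32_t result;

    /* Assumed clamp to the 16.16 range; NaN is not handled. */
    if (x > 32767.0)
        return 0x7FFFFFFF;
    if (x < -32767.0)
        return -0x7FFFFFFF;

    d = x * 65536.0 + magic;               /* no float-to-int cast involved */
    memcpy(&bits, &d, sizeof bits);        /* read the raw IEEE encoding */
    low = (uint32_t)bits;                  /* low 32 bits hold the rounded value */
    memcpy(&result, &low, sizeof result);  /* reinterpret as two's complement */
    return result;
}
```

Exact halves round to the even neighbour here, since that is what the
default rounding mode does.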

Erik



