Re: [AD] ftofix optimisation


Erik Sandberg <erik.sandberg@xxxxxxxxxx> writes:
> In my project, every system that uses IEEE floats has a #define
> FLOAT_IEEE_754.

That's fine for fixed platforms like DOS and Windows, where we know the
exact configuration in advance, but more of a problem on Linux, where the
configure script has to find that information for us. I can't see any
existing header or autoconf method that will reliably tell us the floating
point encoding format, so we'd have to write a test program to discover
it, and I have no idea how to do that. I suppose we could read lots of
floating point constants as binary patterns and check that they really
are all in IEEE format, but I'd be very nervous that we might get it wrong
and misdetect some other similar format.
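
For what it's worth, the sort of test program I have in mind might look
something like the sketch below. The names float_bits and
looks_like_ieee754 are made up for illustration, and it assumes a compiler
that provides <stdint.h>. It only spot-checks a handful of single
precision constants, so it could still be fooled by a format that happens
to agree on those values, and it has to run on the target machine, which
rules out cross-compiling.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a float. memcpy is used rather than a
 * pointer cast so that the compiler's aliasing rules are respected.
 */
static uint32_t float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Returns 1 if a few constants have the encodings that IEEE 754 single
 * precision gives them, 0 otherwise. This is a spot check, not a proof.
 */
int looks_like_ieee754(void)
{
   if (sizeof(float) != sizeof(uint32_t))
      return 0;

   return float_bits(0.0f)     == 0x00000000u &&
          float_bits(1.0f)     == 0x3F800000u &&
          float_bits(-2.0f)    == 0xC0000000u &&
          float_bits(0.5f)     == 0x3F000000u &&
          float_bits(0.15625f) == 0x3E200000u &&
          float_bits(65536.0f) == 0x47800000u;
}
```

A configure test could wrap this in a main() that returns 0 on success,
and define FLOAT_IEEE_754 only when the program runs and passes.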

But we could always just conditionalise it on being an I386 platform.
That would enable the optimisation for all the current systems, and it
would be trivial to add extra checks for other processors as and when
people start using Allegro on them and confirm that they are also IEEE
compliant. We'd lose the benefit of the optimisation on unknown
processors, but that seems safer than doing it by mistake on the wrong
sort of hardware :-)
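
Roughly, the conditional could be laid out as below. Everything here is a
sketch under assumptions: fixed_sketch, ftofix_sketch and
SKETCH_FLOAT_IEEE_754 are hypothetical names, not Allegro's real ones, and
the fast path is not Erik's routine (which isn't shown in this thread) but
a stand-in using the well known trick of adding a large constant so that
the 16.16 result lands in the low bits of an IEEE double. It assumes the
FPU is in its default round-to-nearest mode and that x * 65536 fits in 32
bits after rounding.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: this is not Allegro's ftofix(). */
typedef int32_t fixed_sketch;

/* Only enable the IEEE-dependent path on processors we have checked.
 * These are the usual i386 macros from gcc, MSVC and Watcom.
 */
#if defined(__i386__) || defined(_M_IX86) || defined(__386__)
   #define SKETCH_FLOAT_IEEE_754
#endif

/* Converts x to 16.16 fixed point, rounding to the nearest value. */
fixed_sketch ftofix_sketch(double x)
{
#ifdef SKETCH_FLOAT_IEEE_754
   /* Adding 1.5 * 2^36 forces the sum into the range [2^36, 2^37), where
    * one unit in the last place of a double is 2^-16. The low 32 bits of
    * the stored double then hold x * 65536 as a two's complement integer.
    * The volatile makes sure the sum really is stored as a double.
    */
   volatile double shifted = x + 103079215104.0;
   double stored = shifted;
   uint64_t bits;
   memcpy(&bits, &stored, sizeof bits);
   return (fixed_sketch)(uint32_t)(bits & 0xFFFFFFFFu);
#else
   /* Portable fallback: scale, round half away from zero, then let the
    * cast truncate.
    */
   double scaled = x * 65536.0;
   return (fixed_sketch)(scaled + (scaled < 0 ? -0.5 : 0.5));
#endif
}
```

Supporting another processor would then just mean adding its macro to the
#if line once somebody has confirmed that it uses IEEE doubles.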

> I have found out why the original version is too slow: It's because of 
> the x87's integer rounding (FISTP). By default, the FPU rounds the float 
> to the closest int, which is against ANSI. So the compiler has to work 
> around this, by overriding the default behaviour.

Urgh, you are right! I've never looked much at the generated code for FPU
routines before, but this is appalling. Surely it would be more sensible
to at least just set the rounding mode once at program startup, rather
than saving the FPU control word, changing it, doing the calculation, and
then restoring it again, once per routine that uses the FPU?
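
To make the overhead concrete, here is a small sketch of the control word
juggling involved. The macro and function names are invented for
illustration. The facts it relies on are that bits 10 and 11 of the x87
control word select the rounding mode, that 00 means round to nearest and
11 means truncate, and that the control word is 0x037F after the FPU is
initialised.

```c
/* Bits 10 and 11 of the x87 control word select the rounding mode. */
#define X87_RC_MASK      0x0C00u
#define X87_RC_NEAREST   0x0000u   /* the default after initialisation */
#define X87_RC_TRUNCATE  0x0C00u   /* what a C float-to-int cast needs */

/* Returns the control word cw with its rounding field replaced by mode. */
unsigned int x87_with_rounding(unsigned int cw, unsigned int mode)
{
   return (cw & ~X87_RC_MASK) | (mode & X87_RC_MASK);
}

/* What the compiler effectively emits around a cast:
 *
 *    saved = current control word                        (fnstcw)
 *    load x87_with_rounding(saved, X87_RC_TRUNCATE)      (fldcw)
 *    convert and store the integer                       (fistp)
 *    load saved again                                    (fldcw)
 *
 * Only the fistp does useful work. The rest is there to get truncation.
 */
```

For example, x87_with_rounding(0x037F, X87_RC_TRUNCATE) gives 0x0F7F,
which is the value that gets loaded before the conversion.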

> If my guesses are correct, we could eliminate 95% of the work in the 
> current routine either by making the compiler understand that it does 
> not have to disable the FPU's rounding (I have tried it but not 
> succeeded yet), or by creating an assembler routine. The latter requires
> gcc and i386, but then so does the former. What's your preference?

An asm version sounds ok to me, but I'm also happy with your manual
conversion method if you prefer that. Whatever you like: you obviously
know more about this than I do. Doing it in asm does restrict the
portability somewhat, but it's trivial to convert gcc inline asm to Watcom
and MSVC formats (I can certainly make a Watcom version, and probably an
MSVC one as well, although I don't understand the MSVC system as well as
the others).
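
As a rough idea of what the gcc version might look like, here is a sketch.
The name ftofix_asm_sketch is hypothetical, and this is only my guess at
the shape of such a routine rather than anything Erik has posted. It
assumes the FPU is still in its default round-to-nearest mode and that the
scaled value fits in 32 bits. On anything that isn't gcc on x86 it falls
back to plain C.

```c
#include <stdint.h>

/* Converts x to 16.16 fixed point using the FPU's own rounding. */
int32_t ftofix_asm_sketch(double x)
{
   double scaled = x * 65536.0;
   int32_t result;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
   /* fistpl stores st(0) as a 32 bit integer using the current rounding
    * mode and pops it. The "t" constraint puts scaled in st(0), and the
    * "st" clobber tells gcc that the asm has popped it.
    */
   __asm__ ("fistpl %0" : "=m" (result) : "t" (scaled) : "st");
#else
   /* Fallback: round half away from zero, then let the cast truncate. */
   result = (int32_t)(scaled + (scaled < 0 ? -0.5 : 0.5));
#endif

   return result;
}
```

The Watcom and MSVC versions would need the same single instruction
written in their own inline asm syntax.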



--
Shawn Hargreaves - shawn@xxxxxxxxxx - http://www.talula.demon.co.uk/
"A binary is barely software: it's more like hardware on a floppy disk."


