Re: [AD] ftofix optimisation
Erik Sandberg <erik.sandberg@xxxxxxxxxx> writes:
> In my project, every system that uses IEEE floats has a #define
> FLOAT_IEEE_754.
That's fine for fixed platforms like DOS and Windows, where we know the
exact configuration in advance, but more of a problem on Linux, where the
configure script finds that information for us. I can't see any existing
header or autoconf method that will reliably tell us the floating point
encoding format, so we'd have to write a test program to discover it, and
I have no idea how to do that. I suppose we could read lots of floating
point constants as binary patterns and check that they really are all in
IEEE format, but I'd be very nervous that we might get it wrong and
misdetect some other similar format.
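
For illustration, the "read constants as binary patterns" idea could look
roughly like the sketch below. The function names float_bits() and
looks_like_ieee754() are made up for this example, the list of constants is
an arbitrary sample, and passing the check only shows that these particular
values are encoded the IEEE 754 single precision way, not that the platform
is fully IEEE compliant.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the raw bytes of a float into a 32-bit integer.
 * Only meaningful when float is 32 bits wide (checked by the caller). */
static uint32_t float_bits(float f)
{
    uint32_t u = 0;
    memcpy(&u, &f, sizeof u < sizeof f ? sizeof u : sizeof f);
    return u;
}

/* Hypothetical check: returns 1 if a handful of constants have the bit
 * patterns that IEEE 754 single precision prescribes, 0 otherwise. */
static int looks_like_ieee754(void)
{
    static const struct { float value; uint32_t bits; } known[] = {
        {     0.0f, 0x00000000u },
        {     1.0f, 0x3F800000u },
        {    -1.0f, 0xBF800000u },
        {     2.0f, 0x40000000u },
        {     0.5f, 0x3F000000u },
        {    -2.5f, 0xC0200000u },
        { 65536.0f, 0x47800000u },
        {     0.1f, 0x3DCCCCCDu },   /* not exactly representable */
    };
    size_t i;

    if (sizeof(float) != sizeof(uint32_t))
        return 0;                    /* wrong width, cannot be IEEE single */

    for (i = 0; i < sizeof known / sizeof known[0]; i++) {
        if (float_bits(known[i].value) != known[i].bits)
            return 0;
    }
    return 1;
}
```

A configure test could wrap this in a main() that returns
!looks_like_ieee754(), though that only works when the test program can
actually be run, which is not the case when cross-compiling.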
But we could always just conditionalise it on being an i386 platform.
That would enable the optimisation for all the current systems, and it
would be trivial to add extra checks for other processors as and when
people start using Allegro on them and confirm that they are also IEEE
compliant. We'd lose the benefit of the optimisation on unknown
processors, but that seems safer than doing it by mistake on the wrong
sort of hardware :-)
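
A minimal sketch of that kind of guard, under a few assumptions: the
macros __i386__, _M_IX86 and __386__ are taken to identify 32-bit x86
builds under gcc, MSVC and Watcom respectively; fix16, ftofix_generic(),
ftofix_ieee() and my_ftofix() are hypothetical names rather than Allegro's
own; and the IEEE-dependent path simply decodes the bits by hand as a
stand-in, which is not necessarily the method being discussed in this
thread.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: these predefined macros identify 32-bit x86 builds. */
#if defined(__i386__) || defined(_M_IX86) || defined(__386__)
#define FLOAT_IEEE_754 1
#endif

typedef int32_t fix16;               /* hypothetical 16.16 fixed point type */

/* Portable path: scale by 65536, round half away from zero, clamp. */
static fix16 ftofix_generic(float x)
{
    double scaled = (double)x * 65536.0;

    if (scaled != scaled) return 0;                  /* NaN */
    if (scaled >= 2147483647.0) return INT32_MAX;
    if (scaled <= -2147483647.0) return -INT32_MAX;
    return (fix16)(scaled + (scaled < 0 ? -0.5 : 0.5));
}

/* IEEE-dependent path: pick the float apart with integer operations only.
 * Correct only if float uses the IEEE 754 single precision layout. */
static fix16 ftofix_ieee(float x)
{
    uint32_t bits, mant;
    int32_t mag;
    int exp, shift;

    memcpy(&bits, &x, sizeof bits);
    exp  = (int)((bits >> 23) & 0xFF);
    mant = bits & 0x7FFFFFu;

    if (exp == 0xFF && mant != 0) return 0;          /* NaN */
    if (exp == 0) return 0;          /* zero and denormals round to zero */

    mant |= 0x800000u;               /* restore the implicit leading 1 */
    shift = exp - 134;               /* x * 65536 == mant * 2^(exp-127-23+16) */

    if (shift > 7)
        mag = INT32_MAX;             /* overflow, including infinity */
    else if (shift >= 0)
        mag = (int32_t)(mant << shift);
    else if (shift < -24)
        mag = 0;                     /* magnitude below half a unit */
    else
        mag = (int32_t)((mant + (1u << (-shift - 1))) >> -shift);

    return (bits >> 31) ? -mag : mag;
}

/* Only use the IEEE-dependent path where the format is known. */
#ifdef FLOAT_IEEE_754
#define my_ftofix(x) ftofix_ieee(x)
#else
#define my_ftofix(x) ftofix_generic(x)
#endif
```

Supporting another processor later would then only mean adding its macro
to the #if line once someone has confirmed the float format there.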
> I have found out why the original version is too slow: It's because of
> the x87's integer rounding (FISTP). By default, the FPU rounds the float
> to the closest int, which is against ANSI (a cast has to truncate
> towards zero). So the compiler has to work around this, by overriding
> the default behaviour.
Urgh, you are right! I've never looked at the generated code for FPU
routines much before, but this is appalling. Surely it would be more
sensible to at least just clear the rounding flag once at program startup,
rather than pushing the FPU flags, changing them, doing the calculation,
and then popping them again, once per routine that uses the FPU?
> If my guesses are correct, we could eliminate 95% of the work in the
> current routine either by making the compiler understand that it does
> not have to disable the FPU's rounding (I have tried it but not
> succeeded yet), or by creating an assembler routine. The latter requires
> gcc and i386, but so does the former. What's your preference?
An asm version sounds ok to me, but I'm also happy with your manual
conversion method if you prefer that. Whatever you like: you obviously
know more about this than I do. Doing it in asm does restrict the
portability somewhat, but it's trivial to convert gcc inline asm to Watcom
and MSVC formats (I can certainly make a Watcom version, and probably MSVC
as well, although I don't understand the MSVC system as well as the others).
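
As a rough idea of what the gcc inline asm version might look like, here
is a sketch. The names fix16 and ftofix_asm() are hypothetical, the asm
path is also enabled on x86-64 purely as an assumption (the x87 is still
available there), and there is no overflow handling. FISTP uses whatever
rounding mode is currently set, so the compiler does not need to touch the
FPU control word; with the default mode, exact ties round to the even
neighbour, which differs slightly from the fallback path.

```c
#include <stdint.h>

typedef int32_t fix16;               /* hypothetical 16.16 fixed point type */

/* Convert a double to 16.16 fixed point, rounding to nearest.
 * Assumes the result fits in 32 bits; out of range input is not handled. */
static fix16 ftofix_asm(double x)
{
    double scaled = x * 65536.0;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t result;

    /* fistpl stores ST(0) to memory as a 32-bit integer using the current
     * rounding mode, then pops the FPU stack. The "t" constraint puts
     * scaled on top of the stack, and the "st" clobber tells gcc that the
     * instruction pops it. */
    __asm__ ("fistpl %0" : "=m" (result) : "t" (scaled) : "st");
    return result;
#else
    /* Fallback for other compilers and processors: plain C rounding. */
    return (fix16)(scaled + (scaled < 0 ? -0.5 : 0.5));
#endif
}
```

The Watcom and MSVC versions would need the same instruction written in
their own inline asm syntax, which is not shown here.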
--
Shawn Hargreaves - shawn@xxxxxxxxxx - http://www.talula.demon.co.uk/
"A binary is barely software: it's more like hardware on a floppy disk."