Re: [AD] Allegro 4.2.0 RC1 timetable
On 2005-06-05, Peter Wang <tjaden@xxxxxxxxxx> wrote:
> On 2005-06-05, Evert Glebbeek <eglebbk@xxxxxxxxxx> wrote:
> > On Saturday 04 June 2005 16:07, Peter Wang wrote:
> > > - upgrades to fixmul, fixdiv
> >
> > Right. We did agree on what the final patch would be, right?
>
> Yes, as per http://www.allegro.cc/forums/view_thread.php?_id=494345
> in case some here didn't see it:
>
> IA32/asm: fixmulasm, fixdivasm
> IA32/C: fixmuli, fixdivf
> otherwise: fixmull, fixdivf
>
> You can change fixdivf to fixdivl for non-IA32 platforms if you figure
> out the problem.
Here's a patch to do that, using the implementations of fixmuli and
fixmull from your benchmark. Apply it when you want.
Peter
Index: include/allegro/inline/fmaths.inl
===================================================================
RCS file: /cvsroot/alleg/allegro/include/allegro/inline/fmaths.inl,v
retrieving revision 1.7
diff -u -r1.7 fmaths.inl
--- include/allegro/inline/fmaths.inl 1 Feb 2005 13:12:04 -0000 1.7
+++ include/allegro/inline/fmaths.inl 5 Jun 2005 07:42:33 -0000
@@ -103,7 +103,40 @@
AL_INLINE(fixed, fixmul, (fixed x, fixed y),
{
- return ftofix(fixtof(x) * fixtof(y));
+ /* In benchmarks conducted circa May 2005 we found that, in the main:
+ * - IA32 machines performed faster with one implementation;
+ * - AMD64 and G4 machines performed faster with another implementation.
+ *
+ * Benchmarks were mainly done with differing versions of gcc.
+ * Results varied with other compilers, optimisation levels, etc.
+ * so this is not optimal, though a tenable compromise.
+ */
+ #if (defined ALLEGRO_I386) || (!defined LONG_LONG)
+
+ fixed sign = (x^y) & 0x80000000;
+ int mask_x = x >> 31;
+ int mask_y = y >> 31;
+ int mask_result = sign >> 31;
+ fixed result;
+
+ x = (x^mask_x) - mask_x;
+ y = (y^mask_y) - mask_y;
+
+ result = ((y >> 8)*(x >> 8) +
+ (((y >> 8)*(x&0xff)) >> 8) +
+ (((x >> 8)*(y&0xff)) >> 8));
+
+ return (result^mask_result) - mask_result;
+
+ #else
+
+ LONG_LONG lx = x;
+ LONG_LONG ly = y;
+ LONG_LONG lres = (lx*ly)>>16;
+ int res = lres;
+ return res;
+
+ #endif
})
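
For anyone who wants to try the two fixmul variants outside the Allegro tree, here is a minimal standalone sketch. It is not the patch itself: the names fixmul_split, fixmul_wide and fixmul_selfcheck are made up for illustration, `fixed` is assumed to be a 32-bit signed 16.16 value, and int64_t stands in for LONG_LONG. It also assumes that right-shifting a negative signed integer is an arithmetic shift and that out-of-range unsigned-to-signed conversion wraps, which holds for gcc but is not guaranteed by the C standard. The split version does its magnitude arithmetic in unsigned to sidestep signed overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed;   /* assumed: 16.16 fixed point in a 32-bit int */

/* 32-bit-only variant: multiply magnitudes in 8-bit-shifted pieces,
 * then restore the sign.  The low-byte * low-byte term is dropped,
 * so the result truncates towards zero.
 */
static inline fixed fixmul_split(fixed x, fixed y)
{
    uint32_t mask_x = (uint32_t)(x >> 31);          /* all ones if x < 0 */
    uint32_t mask_y = (uint32_t)(y >> 31);          /* all ones if y < 0 */
    uint32_t mask_result = (uint32_t)((x ^ y) >> 31);

    uint32_t ux = ((uint32_t)x ^ mask_x) - mask_x;  /* |x| */
    uint32_t uy = ((uint32_t)y ^ mask_y) - mask_y;  /* |y| */

    uint32_t result = (uy >> 8) * (ux >> 8) +
                      (((uy >> 8) * (ux & 0xff)) >> 8) +
                      (((ux >> 8) * (uy & 0xff)) >> 8);

    /* Negate the magnitude if the operand signs differ. */
    return (fixed)((result ^ mask_result) - mask_result);
}

/* 64-bit variant: widen, multiply, shift back down by 16 bits. */
static inline fixed fixmul_wide(fixed x, fixed y)
{
    int64_t lres = ((int64_t)x * (int64_t)y) >> 16;
    return (fixed)lres;
}

/* A few exact cases where both variants must agree. */
static inline void fixmul_selfcheck(void)
{
    fixed two = 2 << 16, three = 3 << 16;
    fixed one_and_half = 0x18000, two_and_half = 0x28000;

    assert(fixmul_split(two, three) == (6 << 16));
    assert(fixmul_wide(two, three) == (6 << 16));
    assert(fixmul_split(one_and_half, two_and_half) == 0x3C000);  /* 3.75 */
    assert(fixmul_wide(one_and_half, two_and_half) == 0x3C000);
}
```

Note that the two variants need not agree in the lowest bit: under the assumptions above the 64-bit version rounds towards negative infinity, while the split version truncates the magnitude, so for example the raw inputs -1 and 0x8000 give -1 and 0 respectively.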