[AD] Whether to force function inlining



Bob:
------------------------------------------
Hi,

Sorry, I *still* can't post on the list :/

I have found that by copy/pasting the fixmul code into my own program 
(renaming it to my_fixmul) and removing the __PRECALCULATE_CONSTANTS 
line, I can get about 50% more speed out of it.

If I compile with -S, I can see that fixmul isn't being inlined at all! 
However, by removing __PRECALCULATE_CONSTANTS, fixmul gets correctly 
inlined.

I am using Mingw, gcc v3.2, compiling with: -W -Wall -O3 
-fomit-frame-pointer -ffast-math

See here for various results:
http://www.allegro.cc/forums/view_thread.php?_id=242978&request=1045415850&;
------------------------------------------
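
One way to reproduce the check described above is to put a stand-in 
routine and a caller in a small file, compile it with the same flags 
plus -S, and look in the generated assembly for a call instruction that 
names my_fixmul. The sketch below is only an assumption about what such 
a test file could look like: the 64-bit multiply is a generic 16.16 
fixed-point multiply, not the code copied from Allegro, and scale() is a 
hypothetical caller.

```c
#include <stdint.h>

typedef int32_t fixed;   /* 16.16 fixed point; stand-in for Allegro's type */

/* Generic 16.16 multiply, used here only as a stand-in. This is NOT
 * Allegro's fixmul. Right-shifting a negative value is implementation-
 * defined in C; gcc performs an arithmetic shift.
 */
static inline fixed my_fixmul(fixed x, fixed y)
{
   return (fixed)(((int64_t)x * (int64_t)y) >> 16);
}

/* Hypothetical caller. If my_fixmul was inlined, the assembly emitted
 * for scale() contains the multiply itself and no call to my_fixmul.
 */
fixed scale(fixed value, fixed factor)
{
   return my_fixmul(value, factor);
}
```

Compiling this with something like "gcc -O3 -fomit-frame-pointer -S" and 
searching the resulting .s file for my_fixmul should show whether a call 
survived.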



Eric:
------------------------------------------
> Sorry, I *still* can't post on the list :/

What's the problem exactly?

> I have found that by copy/pasting the fixmul code into my own program
> (renaming it to my_fixmul) and removing the __PRECALCULATE_CONSTANTS
> line, I can get about 50% more speed out of it.
>
> If I compile with -S, I can see that fixmul isn't being inlined at all!
> However, by removing __PRECALCULATE_CONSTANTS, fixmul gets correctly
> inlined.
>
> I am using Mingw, gcc v3.2, compiling with: -W -Wall -O3
> -fomit-frame-pointer -ffast-math

The attached code should have the 4 calls to fixmul() inlined (the results at 
the end of the files are _before_ the change).
------------------------------------------



Bob:
------------------------------------------
Eric Botcazou wrote:
> Sorry, I *still* can't post on the list :/
> 
> 
> What's the problem exactly?

They removed the postmaster account. Since SF requires your ISP to have 
a postmaster account, it rejects all my emails :/

[snip]
> The attached code should have the 4 calls to fixmul() inlined (the results 
> at  the end of the files are _before_ the change).

Cool. Adding:
  static inline fixed my_fixmul(fixed x, fixed y)
      __attribute__((always_inline));
seems to have resolved the issue. Should we update the AL_INLINE macro 
to use this feature?

Also, __PRECOMPUTE_CONSTANTS still does slow down things slightly.
------------------------------------------



Eric:
------------------------------------------
> Cool. Adding:
>   static inline fixed my_fixmul(fixed x, fixed y)
> __attribute__((always_inline));
> seems to have resolved the issue. Should we update the AL_INLINE macro
> to use this feature?

I think so. Its support was added with gcc 3.1 if I'm not mistaken, probably 
in order to fix the problem you ran into.
 
> Also, __PRECOMPUTE_CONSTANTS still does slow down things slightly.

Weird, I don't see any significant differences in the assembly file, only a 
shift in label index numbers.
------------------------------------------
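
As a rough illustration of the idea discussed above, an inline macro 
could attach always_inline only when the compiler is recent enough. The 
sketch below is an assumption, not the actual AL_INLINE definition: 
MY_FORCE_INLINE, sum4() and the 64-bit multiply body are hypothetical, 
and the gcc 3.1 cutoff is simply the figure quoted in the thread.

```c
#include <stdint.h>

typedef int32_t fixed;   /* 16.16 fixed point; stand-in for Allegro's type */

/* Hypothetical helper: expands to the attribute on gcc >= 3.1 (the
 * version mentioned in the thread) and to nothing elsewhere.
 */
#if defined(__GNUC__) && \
    (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
   #define MY_FORCE_INLINE __attribute__((always_inline))
#else
   #define MY_FORCE_INLINE
#endif

/* The attribute goes on a separate declaration, as in the message. */
static inline fixed my_fixmul(fixed x, fixed y) MY_FORCE_INLINE;

/* Generic 16.16 multiply used as a stand-in; NOT Allegro's fixmul. */
static inline fixed my_fixmul(fixed x, fixed y)
{
   return (fixed)(((int64_t)x * (int64_t)y) >> 16);
}

/* Four calls in one function, similar in spirit to the attached test. */
fixed sum4(fixed a, fixed b, fixed c, fixed d, fixed k)
{
   return my_fixmul(a, k) + my_fixmul(b, k)
        + my_fixmul(c, k) + my_fixmul(d, k);
}
```

Keeping the attribute behind a version check means older compilers still 
see a plain static inline function and fall back to their own heuristics.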



Bob:
------------------------------------------
Eric Botcazou wrote:
> I think so. Its support was added with gcc 3.1 if I'm not mistaken,
> probably in order to fix the problem you ran into.

Ok.

>>Also, __PRECOMPUTE_CONSTANTS still does slow down things slightly.
> 
> 
> Weird, I don't see any significant differences in the assembly file, only a 
> shift in label index numbers.

Hmm, I didn't check the assembly output. I might just have been unlucky 
and got lower results when running it.
------------------------------------------



Eric:
-----------------------------------------
> > I think so. Its support was added with gcc 3.1 if I'm not mistaken,
> > probably in order to fix the problem you ran into.
>
> Ok.

I may have answered too quickly because it appears that fixmul() and fixdiv() 
are correctly inlined by gcc 3.2.x in normal circumstances (for example the 
test program[*], where there is only one call inside a loop). So I don't 
really know if we should force inlining.

[*] Once you have added 'volatile' to the declaration of x,y,z; otherwise 
both gcc 2.95.3 and gcc 3.2.3 optimize away the function calls!
-----------------------------------------


-- 
Eric Botcazou
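
The footnote about 'volatile' can be sketched as follows. This is not 
the test program from the thread; run_loop() and the 64-bit multiply are 
hypothetical stand-ins. Declaring x, y and z volatile forces the compiler 
to reload the operands and store the result on every iteration, so the 
multiply cannot be hoisted out of the loop or dropped.

```c
#include <stdint.h>

typedef int32_t fixed;   /* 16.16 fixed point; stand-in for Allegro's type */

/* Generic 16.16 multiply used as a stand-in; NOT Allegro's fixmul. */
static inline fixed my_fixmul(fixed x, fixed y)
{
   return (fixed)(((int64_t)x * (int64_t)y) >> 16);
}

/* volatile: every access must really happen, so the work in the loop
 * below cannot be optimized away.
 */
static volatile fixed x, y, z;

/* Hypothetical benchmark body: one call inside a loop. */
fixed run_loop(fixed a, fixed b, long iterations)
{
   long i;

   x = a;
   y = b;
   z = 0;

   for (i = 0; i < iterations; i++)
      z = my_fixmul(x, y);

   return z;
}
```

Without the volatile qualifier, an optimizing compiler is free to compute 
the product once, or not at all if the result is unused, which would make 
any timing of the loop meaningless.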



