Re: [AD] Bug in Allegro's color convertors?


> Done. I made some changes to the MMX code to get better timing on the
> i686. I shaved another 2 cycles out of it.

Here are the new results (K6-2/333):

Comparing test profile logs 32to24_MMX.log and 32to24-3_MMX.log

DRAW_MODE_SOLID results:

 putpixel()                       = 103%
 hline()                          = 100%
 vline()                          = 103%
 line()                           = 106%
 rectfill()                       = 103%
 circle()                         = 103%
 circlefill()                     = 102%
 ellipse()                        = 104%
 ellipsefill()                    = 105%
 arc()                            = 104%
 triangle()                       = 101%

Other functions:

 textout()                        = 103%
 vram->vram blit()                = 104%
 aligned vram->vram blit()        = 104%
 blit() from memory               = 105%
 aligned blit() from memory       = 103%
 vram->vram masked_blit()         = N/A
 masked_blit() from memory        = 103%
 draw_sprite()                    = 104%
 draw_rle_sprite()                = 104%
 draw_compiled_sprite()           = 104%
 draw_trans_sprite()              = 103%
 draw_trans_rle_sprite()          = 104%
 draw_lit_sprite()                = 103%
 draw_lit_rle_sprite()            = 103%

Not a single loss anymore. Well done!

> I wasn't able to do anything for the non-MMX code however. It's very tight
> considering the Pentium pairing rules.

Yes, I don't think we can do much better. Did you use a simulator or
anything like that to schedule the instructions? Your code was fully
pairable right out of the box (I usually make mistakes related to AGI
stalls or cache lines, so I need to see the real cycle count).
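
For readers who don't have the converter in front of them, here is a
minimal sketch in plain C of what a 32-to-24 bit conversion does, going by
the log names above. It is not the Allegro routine: the function name
convert_32_to_24, the byte order in memory (B, G, R, then a padding byte)
and the byte-at-a-time loop are all assumptions made for illustration, and
nothing here is scheduled for Pentium pairing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Allegro's actual code.
 * Packs `count` 32-bit pixels into 24-bit pixels by dropping the fourth
 * (padding) byte of each source pixel. Assumed layout in memory:
 *   source:      B, G, R, X   (4 bytes per pixel)
 *   destination: B, G, R      (3 bytes per pixel)
 * `dest` must have room for count * 3 bytes. */
static void convert_32_to_24(const uint8_t *src, uint8_t *dest, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dest[0] = src[0];
        dest[1] = src[1];
        dest[2] = src[2];
        src += 4;   /* skip the unused fourth byte */
        dest += 3;
    }
}
```

The loop reads four bytes and writes three per pixel with almost no
arithmetic in between, so it is mostly memory traffic.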

> What's surprising is the lack of consistency between the various
> functions. For example, circlefill is slower, but hline has the same
> speed. Same for putpixel vs circle. This is probably due to random
> noise in the system (Windows).

I think that's due to cache misses/memory timing, which are clearly the
bottleneck here.

> I wouldn't worry too much about it, especially since this is the worst
> combination of color depths, speedwise.

Agreed. Entire patch applied to trunk and branch.

--
Eric Botcazou
ebotcazou@xxxxxxxxxx


