Re: [AD] Bug in Allegro's color convertors? |
> Done. I made some changes to the MMX code to get better timing on the
> i686. I shaved another 2 cycles out of it.
Here are the new results (K6-2/333):
Comparing test profile logs 32to24_MMX.log and 32to24-3_MMX.log
DRAW_MODE_SOLID results:
putpixel() = 103%
hline() = 100%
vline() = 103%
line() = 106%
rectfill() = 103%
circle() = 103%
circlefill() = 102%
ellipse() = 104%
ellipsefill() = 105%
arc() = 104%
triangle() = 101%
Other functions:
textout() = 103%
vram->vram blit() = 104%
aligned vram->vram blit() = 104%
blit() from memory = 105%
aligned blit() from memory = 103%
vram->vram masked_blit() = N/A
masked_blit() from memory = 103%
draw_sprite() = 104%
draw_rle_sprite() = 104%
draw_compiled_sprite() = 104%
draw_trans_sprite() = 103%
draw_trans_rle_sprite() = 104%
draw_lit_sprite() = 103%
draw_lit_rle_sprite() = 103%
Not a single loss anymore. Well done!
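For anyone reading the figures above: a minimal sketch of how such a percentage could be derived, assuming each log stores a throughput score per function and the comparison reports new/old (so 100% or more means no loss). The helper name `relative_percent` is hypothetical and is not taken from the actual test program.

```c
/* Hypothetical helper, not part of the Allegro test program.
 * Assumes both scores are throughput figures (higher is faster).
 * Returns the new score as a rounded percentage of the old one,
 * or -1 when the old score is unavailable (the "N/A" case). */
int relative_percent(double old_score, double new_score)
{
    if (old_score <= 0.0)
        return -1;                      /* nothing to compare against */
    return (int)(new_score * 100.0 / old_score + 0.5);
}
```

Under that assumption, an old score of 200 and a new score of 206 would be reported as 103%.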
> I wasn't able to do anything for the non-MMX code however. It's very tight
> considering the Pentium pairing rules.
Yes, I don't think we can do much better. Did you use a simulator or
anything like that to schedule the instructions? Your code was fully
pairable right out of the box (I usually make mistakes related to AGI
stalls or cache lines, so I need to see the real cycle count).
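For reference, the operation being tuned is conceptually simple. Below is a plain C sketch of a 32-bit to 24-bit conversion, assuming the 32-bit pixels hold their three color bytes first and an unused fourth byte. It is only an illustration: the function name `convert_32_to_24` is hypothetical, and the real converters are hand-scheduled assembly, not this loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar illustration, not Allegro's actual converter.
 * Copies `count` pixels from a 32-bit buffer (4 bytes per pixel, the
 * 4th byte assumed unused) into a packed 24-bit buffer (3 bytes per
 * pixel). The caller must provide 4*count source bytes and 3*count
 * destination bytes. */
void convert_32_to_24(const uint8_t *src, uint8_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[0] = src[0];    /* first color byte  */
        dst[1] = src[1];    /* second color byte */
        dst[2] = src[2];    /* third color byte  */
        src += 4;           /* skip the unused 4th byte */
        dst += 3;           /* packed destination */
    }
}
```

The 3-byte destination stride means most writes are not 4-byte aligned, which is one plausible reason this depth combination is hard to speed up.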
> What's surprising is the lack of coherency in between the various
> functions. For example, circlefill is slower, but hline has the same
> speed. Same for putpixel vs circle. This is probably due to random
> noise in the system (Windows).
I think that's due to cache misses/memory timing, which are clearly the
bottleneck here.
> I wouldn't worry too much about it, especially since this is the worst
> combination of color depths, speedwise.
Agreed. Entire patch applied to trunk and branch.
--
Eric Botcazou
ebotcazou@xxxxxxxxxx