Re: [AD] minor issues



Some thoughts:

If the line length is not on a 4- or 8-byte boundary, would MMX be slower due to alignment problems?

If memory bitmaps are to be made solid (zero extra bytes, so the pitch is no longer padded to a 4- or 8-byte boundary), then SSE1/2/3 code will be far more difficult. Memory usage is hardly an issue anymore (even gfx cards have more memory on them than the average game uses), so is there any reason to be concerned about using a few extra bytes per line to reach 4- or 8-byte boundaries?
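A minimal sketch of the padding I mean, assuming a hypothetical helper called padded_pitch (it is not an Allegro function): round each line's width in bytes up to the next multiple of the wanted alignment, which has to be a power of two for the bit trick to work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Allegro: round a line's byte width
 * up to the next multiple of `align`, which must be a power of two. */
size_t padded_pitch(size_t width_px, size_t bytes_per_px, size_t align)
{
    size_t raw = width_px * bytes_per_px;

    /* The mask trick below only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);

    return (raw + align - 1) & ~(align - 1);
}
```

For example, a 101 pixel wide 24-bit line is 303 bytes raw, and padded_pitch(101, 3, 8) gives 304, so the cost is one extra byte per line.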

SSE offers some really cool instructions that would allow lots of pixels to be processed simultaneously... let's not design the bitmap structure so that we cannot take advantage of these.

If the tests below are somehow only better on P4 but worse on Athlon/P3/non-P4, then it should not be implemented, as your average user still would not be using a P4.

clear_to_color() is a more useful feature.


aj.



> Thanks! I'll try later.
>

OK, here are my results (output of 3 runs, formatted with the attached
script):

15Bits:  0.282..0.321  ->  0.244..0.257  (115..124%)
16Bits:  0.281..0.319  ->  0.244..0.256  (115..124%)
24Bits:  0.641..0.678  ->  0.443..0.474  (144..143%)
32Bits:  0.721..0.721  ->  0.687..0.699  (104..103%)

Just like in your case, for 32bit the difference is very small (it may
well lie within the timing inaccuracy on my system, with just 3 runs).
For 24bit it is highest, but 24bit is the least useful format.

I'd really like to see an analysis of why the libc implementation is
faster than the MMX asm we use currently. I guess it's just the overhead
of looping over the line pointers. Doesn't matter though, it's faster,
so we should use it.

I also used Linux and a P4; it would be interesting to see other results
(AMD, or P3 or something) before applying.

Then, as Peter said, there's the question of whether we can assume that
lines are contiguous. If we make this change, then we should specify that
is_memory_bitmap(bmp) means that there can be no line gaps (and also
check that no other memory bitmaps are currently created by Allegro).
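To make the assumption concrete, here is a rough sketch and not the actual patch. It uses a hypothetical struct mem_bitmap as a stand-in (Allegro's real BITMAP looks different), checks whether every line starts exactly where the previous one ends, and only then clears with a single memset; otherwise it falls back to one memset per line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a memory bitmap; Allegro's real BITMAP differs. */
struct mem_bitmap {
    int w, h;               /* size in pixels */
    size_t bytes_per_px;    /* bytes used by one pixel */
    unsigned char **line;   /* one pointer per line */
};

/* Returns 1 if every line starts exactly where the previous one ends. */
int lines_are_contiguous(const struct mem_bitmap *bmp)
{
    size_t row = (size_t)bmp->w * bmp->bytes_per_px;
    int y;

    for (y = 0; y + 1 < bmp->h; y++)
        if (bmp->line[y + 1] != bmp->line[y] + row)
            return 0;
    return 1;
}

/* Clear to 0: one memset if there are no line gaps, else one per line. */
void clear_sketch(struct mem_bitmap *bmp)
{
    size_t row;
    int y;

    assert(bmp != NULL && bmp->line != NULL);
    if (bmp->w <= 0 || bmp->h <= 0)
        return;

    row = (size_t)bmp->w * bmp->bytes_per_px;
    if (lines_are_contiguous(bmp)) {
        memset(bmp->line[0], 0, row * (size_t)bmp->h);
    } else {
        for (y = 0; y < bmp->h; y++)
            memset(bmp->line[y], 0, row);
    }
}
```

If the no-gaps guarantee were part of the definition of a memory bitmap, the check and the fallback loop could go away.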

Oh, and do you think something similar can be done for clear_to_color()?
That's what really would bring an improvement to actual programs, since
normally you need to clear to some color, not just 0.
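One possible direction for 32bit, purely as a sketch with a hypothetical fill32_sketch helper (nothing like this is in Allegro, and it assumes the pixels are contiguous): memset only repeats a single byte, so it covers colors whose four bytes are equal; for anything else, write one pixel and keep doubling the filled region with memcpy. Whether this beats a plain loop or the existing asm would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an Allegro function: fill `count` contiguous
 * 32-bit pixels starting at `dst` with `color`. */
void fill32_sketch(uint32_t *dst, size_t count, uint32_t color)
{
    size_t filled, chunk;
    uint8_t b = (uint8_t)(color & 0xFF);

    assert(dst != NULL || count == 0);
    if (count == 0)
        return;

    /* If all four bytes match, a plain memset gives the same result. */
    if (color == (uint32_t)b * 0x01010101u) {
        memset(dst, b, count * sizeof *dst);
        return;
    }

    /* Otherwise seed one pixel, then double the filled region each pass.
     * chunk never exceeds filled, so source and destination never overlap. */
    dst[0] = color;
    for (filled = 1; filled < count; filled += chunk) {
        chunk = filled < count - filled ? filled : count - filled;
        memcpy(dst + filled, dst, chunk * sizeof *dst);
    }
}
```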

--
Elias Pschernig




