Re: [AD] Allegro x86 clear and blit optimizations - update



> When I wrote the 32bpp MMX code, it turned out to be 10% slower than the
>  regular non-MMX one. Perhaps MMX instructions have trouble with getting
>  data from system RAM instead of L2 cache? Just a guess though.

I don't see why the MMX instructions in the clears would need to read the 
cache line at all (apart from the cache line load on a write miss on PPro+), 
except during the initial per-line setup for the writes, and that setup is 
simpler for 32bpp than for the 8- and 16-bit code. Admittedly I haven't tried 
the 32bpp case yet, but the 8-bit and 16-bit MMX clears are (at least on my 
PMMX system) roughly twice as fast as the non-MMX clears, and for that much 
of a gain I would think the MMX optimizations are worthwhile.
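
To illustrate the point about per-line setup, here is a rough sketch in 
portable C rather than the actual Allegro assembly. The name clear_line_32 
is hypothetical, and the 64-bit store only stands in for an MMX movq. It 
assumes the line pointer is at least 4-byte aligned, so the setup is at most 
one leading pixel, and the main loop only writes to the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not Allegro code: fill one scanline of 32bpp
 * pixels, storing 8 bytes (two pixels) at a time, roughly the way a
 * movq-based loop would. Assumes `line` is 4-byte aligned. */
void clear_line_32(uint32_t *line, size_t width, uint32_t color)
{
    size_t i = 0;
    uint64_t pair;

    assert(line != NULL || width == 0);

    /* Per-line setup: at most one leading pixel is needed to reach
     * 8-byte alignment, since each pixel is already 4 bytes wide. */
    if (width > 0 && ((uintptr_t)line & 7u) != 0) {
        line[i++] = color;
    }

    /* The same color in both halves of a 64-bit word. */
    pair = ((uint64_t)color << 32) | color;

    /* Main loop: write-only, nothing is read from the destination. */
    for (; i + 2 <= width; i += 2) {
        memcpy(&line[i], &pair, sizeof pair);
    }

    /* At most one trailing pixel remains. */
    if (i < width) {
        line[i] = color;
    }
}
```

An 8- or 16-bit version would need up to seven or three leading pixels 
respectively before the first aligned store, which is why the 32bpp setup 
is the simplest of the three.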

C


