[AD] 16-bit MMX Clear DONE !



Ok, here it is:
It's a full replacement for iblit16.s

I kept only the segment-prefixed code, as I get absolutely no speed decrease from using it instead of the flat model. Test.exe reports 5910 iterations of clear(320x200x16) per second, which amounts to 721 MB/sec... pretty close to the maximum 100 MHz SDRAM bandwidth (800 MB/sec). That's an 11% increase compared to the non-MMX version (5490 iterations/sec).

I tried coding a version for 32-bit bitmaps, but it ended up being 8% slower than the non-MMX one. That was funny, actually, since test.exe reports only 770 iterations per second for the non-MMX version (that's 187 MB/sec). I was expecting around 3000/sec (pixels containing 2x the data should blit at half the speed, no?).
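For anyone who wants to redo the arithmetic behind the figures above, here is a minimal C sketch. It assumes "MB" means 2^20 bytes, since that is what reproduces the quoted numbers; the function names are made up for illustration and are not part of test.exe or Allegro.

```c
#include <assert.h>

/* Assumption: 1 MB = 2^20 bytes. This reading matches the quoted figures. */
#define BYTES_PER_MB (1024.0 * 1024.0)

/* Memory throughput of a full-bitmap clear, in MB/sec. */
static double clear_mb_per_sec(int width, int height, int bytes_per_pixel,
                               double iterations_per_sec)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (double)width * height * bytes_per_pixel * iterations_per_sec
           / BYTES_PER_MB;
}

/* Iteration rate to expect at another pixel size if the clear is purely
   bandwidth-bound, i.e. bytes per second stay constant. */
static double expected_iterations(double iterations_per_sec,
                                  int from_bytes_per_pixel,
                                  int to_bytes_per_pixel)
{
    assert(from_bytes_per_pixel > 0 && to_bytes_per_pixel > 0);
    return iterations_per_sec * from_bytes_per_pixel / to_bytes_per_pixel;
}
```

Calling clear_mb_per_sec(320, 200, 2, 5910.0) lands between 721 and 722, and expected_iterations(5910.0, 2, 4) gives 2955, which is where the "around 3000" expectation for 32-bit comes from.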

These tests were performed on my Intel Celery 451 MHz (100 MHz SDRAM), under DOS 6.22.


Oh yeah, for some reason in DOS-in-win95 (4.00.1111), the MMX version is 1% slower than the non-MMX one on average. How come? Does Windows reset the FPU every now and then?

Anyway, I wonder how portable the code is (to Linux/XWin/BeOS, that is). Any takers?

Note: I don't think an 8bpp version would help, as it would require too much overhead for alignment and such to make it through the 32-byte copy loop.
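To make the alignment overhead concrete, here is a rough C sketch of how an 8bpp clear would have to be split around a 32-byte loop. It is not taken from the attached code: the 8-byte alignment requirement, the names, and the plain 64-bit stores standing in for MMX movq are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN 8u    /* assumed: destination must be 8-byte aligned */
#define CHUNK 32u   /* bytes written per iteration of the main loop */

typedef struct {
    size_t head;    /* bytes written one at a time until dst is aligned */
    size_t body;    /* bytes handled by the 32-byte loop */
    size_t tail;    /* leftover bytes after the loop */
} clear_split;

/* Work out how a span of len bytes starting at addr is divided up. */
static clear_split split_clear(uintptr_t addr, size_t len)
{
    clear_split s;
    size_t misalign = (size_t)(addr & (ALIGN - 1u));

    s.head = misalign ? ALIGN - misalign : 0;
    if (s.head > len)
        s.head = len;
    s.body = ((len - s.head) / CHUNK) * CHUNK;
    s.tail = len - s.head - s.body;
    return s;
}

/* Fill len bytes at dst with color, using the head/body/tail split. */
static void clear_bytes(uint8_t *dst, size_t len, uint8_t color)
{
    clear_split s = split_clear((uintptr_t)dst, len);
    uint64_t pattern = 0x0101010101010101ULL * color; /* color in every byte */
    size_t i;

    for (i = 0; i < s.head; i++)          /* slow path: reach alignment */
        *dst++ = color;
    for (i = 0; i < s.body; i += CHUNK) { /* fast path: 4 x 8-byte stores */
        memcpy(dst,      &pattern, 8);
        memcpy(dst + 8,  &pattern, 8);
        memcpy(dst + 16, &pattern, 8);
        memcpy(dst + 24, &pattern, 8);
        dst += CHUNK;
    }
    for (i = 0; i < s.tail; i++)          /* slow path: remainder */
        *dst++ = color;
}
```

Under these assumptions, a 320-byte line that starts 3 bytes past an 8-byte boundary splits into 5 + 288 + 27 bytes, so 32 of its 320 bytes go through the byte-at-a-time paths, on top of the extra branching per line.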

Attachment: Iblit16.zip
Description: Zip archive

--

 - Robert J Ohannessian


There is always one more bug.

