RE: [AD] Use MMX to get fast
Blit to/from 16-bpp bitmaps is already implemented with MMX. I didn’t implement it with 32-bpp bitmaps because it was a net speed loss for me, for some inexplicable reason.
I find it odd that the two routines differ in speed at all: memory is the bottleneck, not the CPU. On top of that, the Pentium II has write-combining hardware, which should make either routine just as fast, with no need for alignment checks or read-modify-write operations on memory.
That said, the Allegro blitters do need to do some additional bookkeeping on blits because of sub-bitmaps, video bitmaps, non-linear bitmaps, etc. I would like to special-case plain mem->mem copies and avoid all that overhead.
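To make the idea concrete, here is a minimal sketch of what such a mem->mem special case could look like. It is only an illustration under stated assumptions: `MemBitmap` and `mem_blit` are hypothetical names, not Allegro's actual BITMAP structure or API, and the caller is assumed to have clipped the rectangle and to pass two different bitmaps of the same colour depth.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain memory bitmap; this is NOT Allegro's BITMAP layout. */
typedef struct {
    unsigned char *data;  /* first byte of the top-left pixel */
    int w, h;             /* size in pixels */
    int bytes_per_pixel;  /* 2 for 16 bpp, 4 for 32 bpp */
    size_t pitch;         /* bytes from the start of one row to the next */
} MemBitmap;

/* Copy a w x h rectangle between two linear memory bitmaps.
   Assumptions: same colour depth, rectangle already clipped,
   src and dst do not share pixel memory (memcpy forbids overlap). */
void mem_blit(const MemBitmap *src, MemBitmap *dst,
              int sx, int sy, int dx, int dy, int w, int h)
{
    assert(src->bytes_per_pixel == dst->bytes_per_pixel);
    assert(w >= 0 && h >= 0);
    assert(sx >= 0 && sy >= 0 && sx + w <= src->w && sy + h <= src->h);
    assert(dx >= 0 && dy >= 0 && dx + w <= dst->w && dy + h <= dst->h);

    size_t bpp = (size_t)src->bytes_per_pixel;
    size_t row_bytes = (size_t)w * bpp;

    /* No bank switching, no sub-bitmap or video checks: one memcpy per row. */
    for (int y = 0; y < h; y++) {
        const unsigned char *s =
            src->data + (size_t)(sy + y) * src->pitch + (size_t)sx * bpp;
        unsigned char *d =
            dst->data + (size_t)(dy + y) * dst->pitch + (size_t)dx * bpp;
        memcpy(d, s, row_bytes);
    }
}
```

The whole fast path is a per-row copy, so the inner copy routine (MMX or otherwise) matters less than skipping the per-blit bookkeeping.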
-----Original Message-----
I use a simple MMX function to move images from memory to video, and the transfer is much faster than Allegro's (about half the time). I am using Allegro now because I had a lot of problems accessing new hardware directly; before, I used my own operating system. I don't know how Allegro implements its blits, whether it uses "rep movsd" or card-accelerated commands. I have a card with VESA 3.0, and my routine is much faster than Allegro. I created a routine called "repmovsq" that transfers 8 bytes (a quad word) per loop iteration, just using:
        mov ecx,(num of bytes) / 8
loopq:
        movq MM0,[esi]          ; esi = source index
        movq [edi],MM0          ; edi = destination index
        add esi,8
        add edi,8
        dec ecx
        jnz loopq
        ;
        ; here the complement if (num of bytes) is not a multiple of 8
        ;

        ;
        ; opcodes
        ; movq MM0,[esi] = 0x0F, 0x6F, 0x06
        ; movq [edi],MM0 = 0x0F, 0x7F, 0x07
        ;
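For readers who want to see the same loop together with its "complement" step, here is a rough portable C sketch. It is not the poster's routine: `copy_qwords` is a hypothetical name, and a `uint64_t` temporary stands in for the MM0 register. Unlike the assembly as written, which would wrap around if the count in ecx were zero, the sketch also handles buffers shorter than 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes from src to dst, 8 bytes per iteration, then the tail.
   Hypothetical helper; the two regions must not overlap. */
void copy_qwords(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t qwords = n / 8;            /* mirrors: mov ecx,(num of bytes) / 8 */

    assert(dst != NULL && src != NULL);

    while (qwords-- > 0) {
        uint64_t q;
        memcpy(&q, s, sizeof q);      /* like: movq MM0,[esi] */
        memcpy(d, &q, sizeof q);      /* like: movq [edi],MM0 */
        s += 8;                       /* like: add esi,8 */
        d += 8;                       /* like: add edi,8 */
    }

    /* The "complement": the last n % 8 bytes, one at a time. */
    for (size_t tail = n % 8; tail > 0; tail--)
        *d++ = *s++;
}
```

A real MMX version would also need an `emms` instruction after the loop, before any floating-point code runs, because the MMX registers share state with the x87 FPU.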
My English is not better because I am Brazilian.
Best regards,
Rogerio Uchoas Penchel