Re: [AD] Using memmove in blit()?
aj wrote:
Anyone else care to do some benchmarks,
I'll do a few tests with memmove()
More results, including memmove, memcpy, and SSE2 stuff.
rectfill tests..
allegro's rectfill = 78,221 ticks
x86 32byte unrolling = 100,433 ticks
mmx (ptr arithmetic, no unrolling) = 85,064 ticks
mmx (ptr arithmetic, unrolling*4) = 69,464 ticks
mmx (array indexing, unrolling*4) = 69,609 ticks
sse2 (array indexing, unrolling*2) = 68,758 ticks
sse2 (array indexing, unrolling*4) = 69,187 ticks
sse2 (array indexing, unrolling*2, no-pollute-cache instruction) = 28,150 ticks
sse2 (ptr arithmetic, no unrolling, no-pollute-cache instruction) = 27,657 ticks
allegro's rectfill = 78,603 ticks (again, to confirm the others don't have a cache advantage)
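
For anyone who wants to try the "no-pollute-cache" variant: the benchmark code is not shown in this post, so the following is only a minimal sketch of what such a fill could look like. It assumes the instruction meant is the SSE2 non-temporal store (MOVNTDQ, `_mm_stream_si128`), and it assumes 32-bit pixels with a pitch counted in pixels. The names `fill_row_stream` and `rectfill_stream` are hypothetical and are not Allegro functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: fill `count` 32-bit pixels starting at `dst`. */
void fill_row_stream(uint32_t *dst, size_t count, uint32_t color)
{
    size_t i = 0;
    assert(dst != NULL || count == 0);
#ifdef HAVE_SSE2
    /* Non-temporal stores need a 16-byte aligned address, so write
       plain scalar pixels until the pointer is aligned. */
    while (i < count && ((uintptr_t)(dst + i) & 15) != 0)
        dst[i++] = color;

    /* Assumption: "no-pollute-cache instruction" means MOVNTDQ.
       Each store writes 4 pixels and bypasses the cache. */
    {
        __m128i v = _mm_set1_epi32((int)color);
        for (; i + 4 <= count; i += 4)
            _mm_stream_si128((__m128i *)(dst + i), v);
    }
    /* Make the streamed data visible before any later normal stores. */
    _mm_sfence();
#endif
    /* Leftover pixels (or the whole row when SSE2 is unavailable). */
    for (; i < count; i++)
        dst[i] = color;
}

/* Hypothetical rectfill with inclusive corners, no clipping.
   `pitch` is the distance between rows, in pixels. */
void rectfill_stream(uint32_t *pixels, size_t pitch,
                     int x1, int y1, int x2, int y2, uint32_t color)
{
    int y;
    assert(x1 >= 0 && y1 >= 0 && x1 <= x2 && y1 <= y2);
    for (y = y1; y <= y2; y++)
        fill_row_stream(pixels + (size_t)y * pitch + (size_t)x1,
                        (size_t)(x2 - x1 + 1), color);
}
```

Streaming stores tend to pay off only when the destination will not be read again soon. If the filled area is drawn over or read back right away, ordinary stores may well be faster.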
clear_bitmap tests..
allegro's clear_bitmap = 76,574 ticks
memset(..,0,..) = 75,017 ticks
ZeroMemory = 75,668 ticks
sse2 (no unrolling, no-pollute-cache instruction) = 28,063 ticks
sse2, ptr arithmetic (unrolling*4, no-pollute-cache instruction) = 27,853 ticks
blit tests..
allegro's blit = 49,091 ticks
memmove = 48,674 ticks
memcpy = 49,209 ticks
SSE2, (no unrolling, no-pollute-cache instruction) = 27,978 ticks
SSE2, (unrolling*4, no-pollute-cache instruction) = 27,920 ticks
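
Since the subject of the thread is memmove in blit(), here is a rough sketch of a row-by-row copy built on memmove(). It is not Allegro's blit code; `blit_rows` and its parameters are hypothetical, with pitches and widths given in bytes. memmove() copes with overlap inside a single row, and the row order is chosen so that an overlapping copy within one bitmap does not overwrite source rows before they are read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `rows` rows of `row_bytes` bytes each.
   `src` and `dst` point at the first byte of the first row to copy,
   and the pitches are the byte distances between consecutive rows.
   The overlap handling assumes equal pitches when both pointers lie
   in the same bitmap. */
void blit_rows(const uint8_t *src, size_t src_pitch,
               uint8_t *dst, size_t dst_pitch,
               size_t row_bytes, size_t rows)
{
    size_t y;

    if (rows == 0 || row_bytes == 0)
        return;

    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* Destination starts at or before the source: go top-down. */
        for (y = 0; y < rows; y++)
            memmove(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
    } else {
        /* Destination starts after the source: go bottom-up so that
           source rows are read before they get overwritten. */
        for (y = rows; y-- > 0; )
            memmove(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
    }
}
```

When source and destination are known to be different bitmaps, memcpy() would do the same job. The numbers above put the two within about 1% of each other, so the safer call seems to cost next to nothing in this test.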