Re: [AD] Color convertors
> No I haven't, I just shifted things around. This breaks pairing on the
> Pentiums (and K6) though :/ I get ~5% more speed on my P3 (thanks
> to the hardware instruction reordering).
Ok, but then why reorder everything and rewrite the comments? These
things are diff killers.
> Agreed. However, removing the new code will break BeOS and possibly QNX
> (depending on if Angelo changed the code there or not).
Adding it broke the Windows port!
> Did you patch the 32->24 code? If not, I'll do it tonight.
I didn't touch anything.
> In case you're wondering, it's done by reducing the amount of memory
> moves. The non-MMX code had 3 reads (for 24->32 bit) or 3 writes
> (32->24). I've reduced it to 1 read/write per 2 pixels (on average). This
> meant adding more code, so the loops take a few cycles more, but in the
> end it was worth it.
Quite right. I faced this trade-off when writing the two routines, but this
solution was nearly impossible to implement with only the legacy 386
registers while keeping the 4-pixel blocks, the pairing and relatively
short code.
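
For readers following the thread, here is a rough C sketch of the trade-off
being discussed for the 24->32 case. It is not the Allegro assembly code, only
an illustration under stated assumptions: a little-endian CPU (as on x86),
pixels packed as 3 bytes each, and hypothetical function names
(conv24to32_bytes, conv24to32_words, load32). The first routine does three
byte reads per pixel; the second handles a 4-pixel block with three 32-bit
reads and pays for it with extra shift/mask work.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: one 32-bit read from a possibly unaligned address. */
static uint32_t load32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Straightforward version: 3 byte reads per pixel. */
void conv24to32_bytes(const uint8_t *src, uint32_t *dst, size_t npixels)
{
    size_t i;
    for (i = 0; i < npixels; i++) {
        dst[i] = (uint32_t)src[3 * i]
               | ((uint32_t)src[3 * i + 1] << 8)
               | ((uint32_t)src[3 * i + 2] << 16);
    }
}

/* Block version: 4 pixels (12 bytes) per iteration using 3 word reads.
 * Assumes little-endian byte order. npixels must be a multiple of 4. */
void conv24to32_words(const uint8_t *src, uint32_t *dst, size_t npixels)
{
    size_t i;
    for (i = 0; i + 4 <= npixels; i += 4) {
        uint32_t a = load32(src);      /* bytes 0..3  */
        uint32_t b = load32(src + 4);  /* bytes 4..7  */
        uint32_t c = load32(src + 8);  /* bytes 8..11 */

        dst[i]     = a & 0x00FFFFFFu;                       /* bytes 0,1,2   */
        dst[i + 1] = ((a >> 24) | (b << 8))  & 0x00FFFFFFu; /* bytes 3,4,5   */
        dst[i + 2] = ((b >> 16) | (c << 16)) & 0x00FFFFFFu; /* bytes 6,7,8   */
        dst[i + 3] = c >> 8;                                /* bytes 9,10,11 */

        src += 12;
    }
}
```

Both routines should produce identical output on a little-endian machine. The
block version trades memory accesses for register arithmetic, which is the
kind of exchange that pays off when memory is slow relative to the CPU clock.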
> The improvement will be more substantial on systems with a high cpu
> clock/memory clock ratio (like the Pentium 4 :)
I'm a little surprised: don't you plan to add SSE2 routines for the
Pentium 4, after adding SSE routines for the Pentium 3 of course? ;-)
---
Eric Botcazou
ebotcazou@xxxxxxxxxx