Re: [AD] Color convertors



> No, I haven't; I just shifted things around. This breaks pairing on the
> Pentiums (and K6), though :/  I get ~5% more speed on my P3, however
> (thanks to the hardware instruction reordering).

OK, but then why reorder everything and rewrite the comments? These
things are diff killers.

> Agreed. However, removing the new code will break BeOS and possibly QNX
> (depending on whether Angelo changed the code there or not).

Adding it broke the Windows port!

> Did you patch the 32->24 code? If not, I'll do it tonight.

I didn't touch anything.

> In case you're wondering, it's done by reducing the amount of memory
> moves. The non-MMX code had 3 reads (for 24->32 bit) or 3 writes
> (32->24). I've reduced it to 1 read/write per 2 pixels (on average). This
> meant adding more code, so the loops take a few cycles more, but in the
> end it was worth it.

Quite right. I faced this trade-off when writing the two routines, but this
solution was nearly impossible to implement with only the legacy 386
registers while keeping the 4-pixel blocks, the pairing and relatively
short code.
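
To illustrate the trade-off discussed above, here is a minimal C sketch of
the 24->32 direction. It is not the Allegro assembly code: the function name
convert_24_to_32, the helper load32 and the B,G,R byte order are assumptions
made for this example, and a little-endian host is assumed. Each block of
4 pixels (12 bytes) is fetched with three 32-bit reads, and the four output
pixels are rebuilt with shifts and masks, which costs extra instructions per
loop iteration in exchange for fewer memory accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: one 32-bit read from a possibly unaligned address.
 * The result is in host byte order; this sketch assumes little-endian. */
static uint32_t load32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical routine: expand packed 24-bit pixels (bytes B,G,R) into
 * 32-bit pixels of the form 0x00RRGGBB. */
static void convert_24_to_32(const uint8_t *src, uint32_t *dst, size_t npixels)
{
    assert(src != NULL && dst != NULL);

    /* Main loop: 4 pixels = 12 source bytes = three 32-bit reads. */
    while (npixels >= 4) {
        uint32_t w0 = load32(src);      /* B0 G0 R0 B1 */
        uint32_t w1 = load32(src + 4);  /* G1 R1 B2 G2 */
        uint32_t w2 = load32(src + 8);  /* R2 B3 G3 R3 */

        dst[0] = w0 & 0x00FFFFFFu;
        dst[1] = (w0 >> 24) | ((w1 & 0x0000FFFFu) << 8);
        dst[2] = (w1 >> 16) | ((w2 & 0x000000FFu) << 16);
        dst[3] = w2 >> 8;

        src += 12;
        dst += 4;
        npixels -= 4;
    }

    /* Tail: remaining 1 to 3 pixels, one byte at a time. */
    while (npixels > 0) {
        *dst++ = (uint32_t)src[0]
               | ((uint32_t)src[1] << 8)
               | ((uint32_t)src[2] << 16);
        src += 3;
        npixels--;
    }
}
```

The sketch uses more temporaries than the handful of general-purpose
registers available on a 386, which is the constraint mentioned above; a
compiler or hand-written assembly would have to juggle them accordingly.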

> The improvement will be more substantial on systems with a high cpu
> clock/memory clock ratio (like the Pentium 4 :)

I'm a little surprised: don't you plan to add SSE2 routines for the
Pentium 4, after adding SSE routines for the Pentium 3 of course? ;-)

---
Eric Botcazou
ebotcazou@xxxxxxxxxx


