Re: [AD] About Elias' bug


> If the test surrounds a few instructions then I bet that the
> loop is left in place.

Yes; it turns out it's the loop optimizer's fault: it does a very bad job of
allocating the x86 registers, likely because there are so few of them.

for (y=0; y<h; y++) {
    s = bmp_read_line(src, s_y+y) + s_x*ssize;
    d = bmp_write_line(dest, d_y+y) + d_x*dsize;
    if (_color_conv & COLORCONV_DITHER_HI) {
        for (x=0; x<w; x++) {
            bmp_select(src);
            c = bmp_read##sbits(s);

            g = (c >> 1);
            b = getb##sbits(c);

            bmp_select(dest);
            bmp_write##dbits(d, makecol##dbits##_dither(r, g, b, x, y));

            s += ssize;
            d += dsize;
        }
    }
    else {
        for (x=0; x<w; x++) {
            bmp_select(src);
            c = bmp_read##sbits(s);

            r = getr##sbits(c);
            g = getg##sbits(c);
            b = getb##sbits(c);

            bmp_select(dest);
            bmp_write##dbits(d, makecol##dbits(r, g, b));

            s += ssize;
            d += dsize;
        }
    }
}

The COLORCONV_DITHER_HI bit is never set, so the 'else' branch is always
executed.
The unmodified code runs at 43.4 blits/second.
If I remove the test and keep only the 'else' branch, the code runs at 34
blits/second.
Even more amazing, if I only replace 'g = (c >> 1)' with 'g = c', the code
runs at 34 blits/second too.
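
For readers who want to reproduce this kind of figure, here is a minimal sketch of how a blits/second number could be measured. It is not the benchmark used for the numbers above; blit_fn, blits_per_second, copy_job and copy_blit are hypothetical names, and the memcpy-based callback only stands in for a real color-converting blit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical callback type: one call performs one complete blit. */
typedef void (*blit_fn)(void *ctx);

/* Calls blit() `iterations` times and returns blits per second of CPU
 * time, or 0.0 if the clock did not advance during the run. */
double blits_per_second(blit_fn blit, void *ctx, long iterations)
{
    clock_t start, end;
    double elapsed;
    long i;

    assert(blit != NULL && iterations > 0);

    start = clock();
    for (i = 0; i < iterations; i++)
        blit(ctx);
    end = clock();

    elapsed = (double)(end - start) / CLOCKS_PER_SEC;
    return elapsed > 0.0 ? (double)iterations / elapsed : 0.0;
}

/* Placeholder workload (an assumption, not Allegro code): a plain copy. */
struct copy_job {
    const unsigned char *src;
    unsigned char *dst;
    size_t len;
};

void copy_blit(void *ctx)
{
    struct copy_job *job = ctx;
    memcpy(job->dst, job->src, job->len);
}
```

A caller would fill in a copy_job (or an equivalent context for the real blit) and pass it with a large enough iteration count that the elapsed time is well above the resolution of clock().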

It looks like the optimizer thinks it absolutely needs a register to perform
'c >> 1', so it reserves %ebx for it in the 'if' branch. As a side effect,
%ebx is also available in the 'else' branch, where every shift operation is
then performed in registers (%edx for r, %eax for g and %ebx for b).
Without the 'c >> 1' expression, %ebx is not reserved in the 'if' branch and
hence not available in the 'else' branch. As a result, the shift operations
for b are performed directly on the stack...

The best solution would of course be to write the asm code by hand, but I'm a
little fed up with writing asm color conversion code :-)
I'll try instead to gradually change the current C code.
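
As an illustration only, one kind of C-level change that lowers the number of values live at once is to convert the pixel without separate r, g and b temporaries. The sketch below is not the change made to Allegro. It assumes a fixed 15-bit 5-5-5 source layout with red in the high bits and a 16-bit 5-6-5 destination whose extra green bit is left at zero, whereas the real getr/makecol macros may use different shifts. The function names are hypothetical. Whether a given compiler generates better code for the packed form has to be measured.

```c
#include <stdint.h>

/* Component-wise version: three temporaries (r, g, b) are live at once. */
uint16_t conv15to16_components(uint16_t c)
{
    unsigned r = (c >> 10) & 0x1F;  /* red:   bits 10..14 of the source */
    unsigned g = (c >> 5) & 0x1F;   /* green: bits 5..9 of the source   */
    unsigned b = c & 0x1F;          /* blue:  bits 0..4 of the source   */

    /* Green lands in bits 6..10, leaving bit 5 of the result at zero. */
    return (uint16_t)((r << 11) | (g << 6) | b);
}

/* Packed version: red and green move together with one mask and one
 * shift, so no per-component temporaries are needed. */
uint16_t conv15to16_packed(uint16_t c)
{
    return (uint16_t)(((c & 0x7FE0u) << 1) | (c & 0x001Fu));
}
```

Both functions return the same value for every input under the stated layout, so the packed form could replace the component-wise one inside the inner loop without changing the output.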

Never trust compilers :-)

--
Eric Botcazou
ebotcazou@xxxxxxxxxx


