[Fwd: Re: [AD] improvements on alpha blending...]




Sorry, I forgot the [AD] list doesn't work with 'reply' :)

-------- Original Message --------
Subject: Re: [AD] improvements on alpha blending...
Date: Tue, 31 Jul 2001 21:53:48 -0400
From: Bob <ohannessian@xxxxxxxxxx>
To: Javier González <xaviergonz@xxxxxxxxxx>
References: <OE50AKSwdHPDcQ61W1V00006abf@xxxxxxxxxx>

Javier González wrote:

> Hi, I've been thinking: why not speed up some of Allegro's alpha
> blending modes?


I think the best way to optimize the blenders would be to make them line
based instead of pixel based. This would cut down on the significant
per-pixel call overhead. Blender functions would take two line pointers
and blend them together. We could provide a dummy 'per-pixel' blender for
users who write custom per-pixel blenders, if backwards compatibility is
an issue.
Finally, this would allow us to use MMX (or another special instruction
set) to work on several pixels at once.
I propose to write this patch.
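
To make the idea concrete, a line-based trans blender in plain C could look
roughly like the sketch below (hypothetical name and signature, not an actual
Allegro interface; it assumes 32-bit 0x00RRGGBB pixels and an alpha factor in
the range 0-256). The inner loop is where an MMX version could process
several pixels at once.

#include <stdint.h>

/* Blend 'len' source pixels onto the destination line at the given
   alpha (0 = keep destination, 256 = copy source). */
static void blend_trans_line32(uint32_t *dst, const uint32_t *src,
                               int len, unsigned int alpha)
{
   int i;
   for (i = 0; i < len; i++) {
      uint32_t s = src[i];
      uint32_t d = dst[i];
      /* Red/blue and green are blended in separate lanes so the
         intermediate products never overflow 32 bits. */
      uint32_t rb = ((((s & 0x00FF00FF) * alpha)
                    + ((d & 0x00FF00FF) * (256 - alpha))) >> 8) & 0x00FF00FF;
      uint32_t g  = ((((s & 0x0000FF00) * alpha)
                    + ((d & 0x0000FF00) * (256 - alpha))) >> 8) & 0x0000FF00;
      dst[i] = rb | g;
   }
}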


> This would work as follows.
>
> Supposing the image has more than 20000 pixels (200 * 100 pixels, for
> example), we could speed things up by building, on each draw call, a 64 KB
> table (65536 bytes) holding the weighted average of every pair of values
> from 0 to 255 for the requested opacity.


Are you talking about doing this for all blender types, or just the
trans blender?
Wouldn't that need 3 lookups + bit shifts per pixel? Also, the table
would need to be recalculated at every call to set_blender(). There's a
risk of cache misses too, but I'd be interested in knowing how much of a
speed-up that would give.


> Since for an image bigger than 20000 pixels, say 320x200, that is
> 64000 pixels * 3 components (RGB) = 192000 calculations, whereas with
> this method we would only do 65536 calculations (no matter how big the
> image is).


Not quite: there's still the lookup to do, and that involves a bit of
computation as well. I agree it would still be less work than the actual
blender, though.

> We should only have to do
> (where opacity is the given opacity from 0 to 255):
>
> for (int src = 0; src < 256; src++) {
>   for (int dst = 0; dst < 256; dst++) {
>     average[src][dst] = (src * (255 / opacity))
>                       + (dst * (255 / (255 - opacity)));
>   }
> }


This is bad due to rounding errors, and the fact that you're dividing by
0 (when opacity is 0 or 255). AFAIK, for the trans blender you'd need to
use:

average[src][dst] = (src * opacity + dst * (256 - opacity)) >> 8;
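
As a concrete sketch (hypothetical names; opacity is assumed to be in the
range 0-255), building the full 64 KB table with that formula would look
something like this. It has to be redone whenever set_blender() changes the
opacity, which is the per-call cost mentioned above: 65536 multiply-adds,
independent of image size.

#include <stdint.h>

/* 256x256 byte table: one blended result for every (src, dst) pair. */
static uint8_t average[256][256];

/* Rebuild the table for a given opacity (0-255). */
static void build_trans_table(unsigned int opacity)
{
   int src, dst;
   for (src = 0; src < 256; src++) {
      for (dst = 0; dst < 256; dst++) {
         average[src][dst] =
            (uint8_t)((src * opacity + dst * (256 - opacity)) >> 8);
      }
   }
}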


>
> where that could be optimized (and we could get rid of the
> divide-by-0 errors) to


On the contrary, optimizing the opacity = 0 case is trivial: simply make
the blender return on entry. Conversely, the 255 (256?) case should be
converted to a masked_blit().
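
A minimal sketch of those two special cases, assuming a hypothetical
draw_trans_sprite-style wrapper that takes the opacity directly (not
Allegro's actual interface):

#include <allegro.h>

/* Opacity 0 draws nothing, full opacity degenerates into masked_blit(),
   and only the values in between need the real blender. */
void draw_trans_sprite_lut(BITMAP *bmp, BITMAP *sprite,
                           int x, int y, int opacity)
{
   if (opacity <= 0)
      return;                          /* fully transparent: no-op */

   if (opacity >= 255) {
      masked_blit(sprite, bmp, 0, 0, x, y, sprite->w, sprite->h);
      return;                          /* fully opaque */
   }

   /* ... intermediate opacities go through the table-based blender ... */
}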


> And then with that table, say we want to calculate the blend of
> pixels (255, 30, 10) and (20, 30, 150); we would only have to do:
>
> r = average[255][20];
> g = average[30][30];
> b = average[10][150];
>
> This gives an awesome speedup on big images
> (even 250% on 320x200 images, and more on bigger ones).
>
> Then when the drawing is over, deallocate this buffer.
> What do you think?


We can keep it global. AFAIK, it can be shared between various calls
to draw_trans_sprite(), and should remain valid until set_blender() is
called again.
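
For reference, a per-pixel lookup with that shared table in 32-bit mode
would look roughly like this hypothetical helper (reusing the average[][]
table from the earlier sketch); this is where the three lookups plus shifts
and masks per pixel come from:

#include <stdint.h>

extern uint8_t average[256][256];   /* the shared table rebuilt by set_blender() */

static uint32_t lookup_blend32(uint32_t s, uint32_t d)
{
   /* One table lookup per component, plus the shifts and masks needed
      to extract and repack the channels. */
   uint32_t r = average[(s >> 16) & 0xFF][(d >> 16) & 0xFF];
   uint32_t g = average[(s >> 8)  & 0xFF][(d >> 8)  & 0xFF];
   uint32_t b = average[ s        & 0xFF][ d        & 0xFF];
   return (r << 16) | (g << 8) | b;
}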

As for optimizing the blenders, you should take a look at my page (URL
below, programming section). I haven't done the trans blender, but I did
do the color add blender in 16bpp, and it's 3x faster than Allegro's on
my machine (4.5x if I remove the multiplication). Keep in mind that I
wrote it in plain C :D
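
For context, the basic operation such a 16bpp (5:6:5) color add blender
performs is sketched below in plain C. This is only an illustration of the
saturated add, not the code from that page, and it leaves out the
source-scaling multiplication mentioned above.

#include <stdint.h>

/* Saturated additive blend of two 5:6:5 pixels: each channel is
   extracted, added, clamped to its maximum, then repacked. */
static uint16_t add_blend16(uint16_t s, uint16_t d)
{
   unsigned int r = ((s >> 11) & 0x1F) + ((d >> 11) & 0x1F);
   unsigned int g = ((s >> 5)  & 0x3F) + ((d >> 5)  & 0x3F);
   unsigned int b = ( s        & 0x1F) + ( d        & 0x1F);
   if (r > 0x1F) r = 0x1F;
   if (g > 0x3F) g = 0x3F;
   if (b > 0x1F) b = 0x1F;
   return (uint16_t)((r << 11) | (g << 5) | b);
}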


--
- Robert J Ohannessian
"Microsoft code is probably O(n^20)" (my CS prof)
http://pages.infinit.net/voidstar/




