Re: [AD] improvements on alpha blending...



> Javier González wrote:
>
> > Hi, I've been thinking:
> > why not speed up some of Allegro's alpha blending modes?
>
>
> I think the best way to optimize the blenders would be to make them line
> based instead of pixel based. This would cut down on the significant
> overhead there is. Blender functions would take two line pointers and
> blend them together. We could provide a dummy 'per-pixel' blender for
> users who write custom per-pixel blenders if backwards compatibility is
> an issue.
> Finally, this would allow us to use MMX (or other special instruction
> set) to work on several pixels at once.
> I propose to write this patch.

Yes, this could be applied to the method I propose as well (if I have
understood it correctly).
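
To make the line-based idea concrete, here is a minimal sketch in plain C. It assumes 32-bit 0x00RRGGBB pixels; the names blend_channel and blend_line_trans32 are hypothetical and not part of Allegro. The point is only that the per-pixel function call disappears: one call handles a whole line.

```c
#include <stddef.h>
#include <stdint.h>

/* Weighted average of one 8-bit channel, alpha in 0..255, rounded to nearest.
   (Hypothetical helper, not an Allegro function.) */
static unsigned blend_channel(unsigned s, unsigned d, unsigned alpha)
{
    return (s * alpha + d * (255 - alpha) + 127) / 255;
}

/* Blend n source pixels into dst in place. Assumes 0x00RRGGBB layout. */
static void blend_line_trans32(const uint32_t *src, uint32_t *dst,
                               size_t n, unsigned alpha)
{
    size_t i;
    for (i = 0; i < n; i++) {
        uint32_t s = src[i], d = dst[i];
        uint32_t r = blend_channel((s >> 16) & 0xFF, (d >> 16) & 0xFF, alpha);
        uint32_t g = blend_channel((s >> 8) & 0xFF, (d >> 8) & 0xFF, alpha);
        uint32_t b = blend_channel(s & 0xFF, d & 0xFF, alpha);
        dst[i] = (r << 16) | (g << 8) | b;
    }
}
```

A custom per-pixel blender could still be supported by wrapping it in a loop with this same signature, which is roughly the 'dummy' blender suggested above.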

> > this would be as follows
> >
> > supposing the image has more than 20000 pixels (200 * 100 pixels for
> > example)
> > we could speed things up by building, on each draw call, a 64 KB table
> > (65536 bytes) holding the weighted average of every pair of values from
> > 0 to 255 at the requested opacity
>
>
> Are you talking about doing this for all blender types, or just the
> trans blender?
Well, the trans blender at first, and then any other blenders that can
support it.

> Wouldn't that need 3 lookups + bit shifts per pixel? Also, the table
> would need recalculation at every call to set_blender. There's a risk of
> cache misses too, but I'd be interested in knowing how much of a speed
> up that would give.

Basically yes, but I think 3 lookups + 3 bit shifts per pixel is always
faster than 6 multiplications, 3 subtractions, 3 bit shifts and 6 divisions
using decimals per pixel :)
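
A minimal sketch of the table approach, assuming 32-bit 0x00RRGGBB pixels and an opacity in 0..255. The names blend_table, build_blend_table and blend_pixel_lut are made up for illustration; this is not Allegro's code.

```c
#include <stdint.h>

/* 256 x 256 entries of one byte each = 65536 bytes. */
static uint8_t blend_table[256][256];

/* Fill the table for one opacity: entry [s][d] is the blend of source
   value s over destination value d. Done once, not once per pixel. */
static void build_blend_table(unsigned alpha)
{
    unsigned s, d;
    for (s = 0; s < 256; s++)
        for (d = 0; d < 256; d++)
            blend_table[s][d] =
                (uint8_t)((s * alpha + d * (255 - alpha) + 127) / 255);
}

/* Per pixel: three lookups plus the shifts to split and rejoin channels. */
static uint32_t blend_pixel_lut(uint32_t src, uint32_t dst)
{
    uint32_t r = blend_table[(src >> 16) & 0xFF][(dst >> 16) & 0xFF];
    uint32_t g = blend_table[(src >> 8) & 0xFF][(dst >> 8) & 0xFF];
    uint32_t b = blend_table[src & 0xFF][dst & 0xFF];
    return (r << 16) | (g << 8) | b;
}
```

Whether this actually beats the arithmetic version depends on how well the 64 KB table stays in cache, so it would need measuring.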

> > since for an image bigger than 20000 pixels, say 320x200 = 64000 pixels
> > * 3 components (RGB), that is 192000 calculations, whereas with this
> > method we would only do 65536 calculations (no matter how big the image
> > is)
>
>
> Not quite: there's still the lookup to do, and that involves a bit of
> computation too. I agree it would be less than the actual blender too.

Yes, and the table is only built when set_trans_blender is called (just
once, if you keep blitting with the same trans blender).


> On the contrary, optimizing the opacity = 0 case is trivial: simply make
> the blender return on entry. Conversely, the 255 (256?) blender should
> be converted to a masked_blit().
you are right
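
A rough sketch of those two shortcuts, assuming 32-bit pixels and 0x00FF00FF (bright pink) as the transparent mask color. draw_trans_line32 is a hypothetical name, not an Allegro function.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed mask (transparent) color for this sketch: bright pink. */
#define SKETCH_MASK_COLOR 0x00FF00FFu

/* Draw one line of a translucent sprite, with fast paths for the two
   extreme opacities. */
static void draw_trans_line32(const uint32_t *src, uint32_t *dst,
                              size_t n, unsigned alpha)
{
    size_t i;

    if (alpha == 0)
        return;                      /* fully transparent: nothing to draw */

    for (i = 0; i < n; i++) {
        uint32_t s = src[i], d = dst[i], r, g, b;
        if (s == SKETCH_MASK_COLOR)
            continue;                /* masked pixel: leave destination */
        if (alpha == 255) {
            dst[i] = s;              /* fully opaque: plain masked copy */
            continue;
        }
        r = (((s >> 16) & 0xFF) * alpha + ((d >> 16) & 0xFF) * (255 - alpha) + 127) / 255;
        g = (((s >> 8) & 0xFF) * alpha + ((d >> 8) & 0xFF) * (255 - alpha) + 127) / 255;
        b = ((s & 0xFF) * alpha + (d & 0xFF) * (255 - alpha) + 127) / 255;
        dst[i] = (r << 16) | (g << 8) | b;
    }
}
```

In a real implementation the alpha == 255 test would be hoisted out of the loop, for example by dispatching to the existing masked blit code.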



> > and then with that table, say we wanna calculate the blending of
> > pixels 255, 30, 10 and 20, 30, 150
> > we would only do
> > r = average[255][20];
> > g = average[30][30];
> > b = average[10][150];
> >
> > giving an awesome speedup to big images
> > (even a 250% on 320x200 images and even more on bigger images)
> >
> > then when the drawing is over, deallocate this buffer
> > what do you think
>
>
> We can keep it global. AFAIK, it can be shared in between various calls
> to draw_trans_sprite, and should remain valid unless a call to
> set_blender() was made.
Certainly. I was thinking of draw_trans_sprite, but it is true that it can
be done even better, once per set_trans_blender call.
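
One way to keep the table global and rebuild it only when the opacity changes could look like the sketch below. set_table_alpha is a hypothetical stand-in for whatever hook set_trans_blender would call, and rebuild_count is there only to show how often the table is recomputed.

```c
#include <stdint.h>

static uint8_t blend_table[256][256];   /* shared by all draw calls */
static int table_alpha = -1;            /* opacity the table was built for; -1 = none yet */
static unsigned rebuild_count = 0;      /* for illustration only */

/* Hypothetical hook: call whenever the trans blender opacity is set.
   The 65536 entries are recomputed only if the opacity really changed. */
static void set_table_alpha(unsigned alpha)
{
    unsigned s, d;

    if ((int)alpha == table_alpha)
        return;                          /* table still valid, reuse it */

    for (s = 0; s < 256; s++)
        for (d = 0; d < 256; d++)
            blend_table[s][d] =
                (uint8_t)((s * alpha + d * (255 - alpha) + 127) / 255);

    table_alpha = (int)alpha;
    rebuild_count++;
}
```

With this, any number of draw_trans_sprite-style calls in between share the same table, and the cost of filling it is paid once per opacity change.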


> As for optimizing the blenders, you should take a look at my page (url
> below, programming section). I haven't done the trans blender, but I did
> do the color add blender in 16bpp, and it's 3x faster than Allegro's on
> my machine, and 4.5 times if I remove the multiplication. Keep in mind
> that I wrote it in plain C :D

Then should we go ahead and try optimizing the blenders, if nobody
complains? ;)


