Re: [AD] _aligned_malloc()



Elias Pschernig wrote:
On Sat, 2005-07-23 at 16:06 +1000, AJ wrote:

A quick search through the Allegro source code turns up lots of malloc() calls.

I have mentioned this before: I'd like to replace them with _aligned_malloc(), as this should yield better performance on newer (heavily alignment-dependent) CPUs.

As I don't think anyone is going to agree to a simple find-and-replace, how about replacing them with a macro or a function call? It could contain build-specific code that lets me use _aligned_malloc(), and fall back to plain malloc() for all other builds.


Yes, since it will quite probably need to be different on different
platforms. I don't think my libc here has an _aligned_malloc function.
And from what I remember, there's already a proposal among the A5
documents which has such a malloc layer, so definitely a good idea.

Do you have any benchmarks which show the improvement?


I have no evidence of it yet, but I have read plenty of documents that suggest it. Most of the SSE(2) stuff requires aligned data, otherwise a performance hit should be expected. I don't think I'd have much trouble finding that text in the CPU software design guides published by AMD and Intel, which should be enough to convince you of its necessity.



Here is a snippet from http://bmagic.sourceforge.net/bmsse2opt.html:
"SIMD enabled compilers usually provide functions for aligned memory allocation. For Intel C++ use the _mm_malloc/_mm_free intrinsics to allocate and free aligned blocks of memory. In MSVC v.7 _aligned_malloc/_aligned_free serves the same purpose. Don't make a mistake by calling general purpose free to deallocate pointer created by _mm_malloc. It will cause unpredictable behavior."


Maybe we could do something like what was done for the asm code on
allegro.cc, where it became clear that the C version is faster on newer
CPUs. If we get something similar with malloc, then we should make the
change.
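
For reference, the malloc layer discussed in this thread could look roughly like the sketch below. It is only an illustration: the names al_malloc, al_free, al_is_aligned and AL_ALIGNMENT are hypothetical and not part of Allegro, the 16-byte alignment is an assumption based on what SSE loads and stores want, and the posix_memalign branch is my own guess at a non-MSVC equivalent. Builds that match neither branch fall back to plain malloc(), as proposed above.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign on POSIX systems */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(_MSC_VER)
#include <malloc.h>              /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical names for illustration; not part of Allegro. */
#define AL_ALIGNMENT 16          /* assumed: SSE wants 16-byte alignment */

/* Allocate size bytes, aligned where the platform offers a way to do so. */
static void *al_malloc(size_t size)
{
#if defined(_MSC_VER)
    return _aligned_malloc(size, AL_ALIGNMENT);
#elif defined(__unix__) || defined(__APPLE__)
    void *p = NULL;
    if (posix_memalign(&p, AL_ALIGNMENT, size) != 0)
        return NULL;
    return p;
#else
    return malloc(size);         /* fallback: no alignment guarantee */
#endif
}

/* Release memory from al_malloc with the matching deallocator. */
static void al_free(void *p)
{
#if defined(_MSC_VER)
    _aligned_free(p);            /* must not be mixed with plain free() */
#else
    free(p);                     /* correct for posix_memalign and malloc */
#endif
}

/* Return nonzero if p sits on an AL_ALIGNMENT boundary. */
static int al_is_aligned(const void *p)
{
    return ((uintptr_t)p % AL_ALIGNMENT) == 0;
}

/* Small self-check: allocate, verify the pointer, release. Returns 1. */
static int al_demo(void)
{
    void *p = al_malloc(100);
    assert(p != NULL);
#if defined(_MSC_VER) || defined(__unix__) || defined(__APPLE__)
    assert(al_is_aligned(p));
#endif
    al_free(p);
    return 1;
}
```

Keeping allocation and deallocation behind one pair of functions is what avoids the mistake the quoted snippet warns about: callers never choose between free() and _aligned_free() themselves.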





