2010/8/18 joel falcou <joel.falcou@xxxxxx>:
On 18/08/10 17:21, Benoit Jacob wrote:
I was going to ask about that. Do you think this makes for a significant performance improvement for real-world dynamic-size matrices (of size at least 32x32)?
For us, it is useful for preventing false sharing on the same row/col in OpenMP scenarios.
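To make the false-sharing point concrete, here is a minimal sketch of row padding, not NT2's or Eigen's actual storage scheme. The names PaddedMatrix, padded_stride, fill_rows and is_line_aligned are hypothetical, the 64-byte cache line is an assumption, and the code assumes C++17 on a platform that provides std::aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed cache-line size in bytes; real hardware may differ.
constexpr std::size_t kCacheLine = 64;

// Round a row length (in floats) up so every row spans whole cache lines.
constexpr std::size_t padded_stride(std::size_t cols) {
    constexpr std::size_t per_line = kCacheLine / sizeof(float);
    return (cols + per_line - 1) / per_line * per_line;
}

// True if p sits on a cache-line boundary.
bool is_line_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLine == 0;
}

// Hypothetical row-major float matrix whose rows each start on a cache line.
// Assumes rows > 0 and cols > 0.
struct PaddedMatrix {
    std::size_t rows, cols, stride;
    float* data;

    PaddedMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), stride(padded_stride(c)),
          data(static_cast<float*>(
              std::aligned_alloc(kCacheLine, r * stride * sizeof(float)))) {
        assert(data != nullptr);
    }
    ~PaddedMatrix() { std::free(data); }
    PaddedMatrix(const PaddedMatrix&) = delete;
    PaddedMatrix& operator=(const PaddedMatrix&) = delete;

    float* row(std::size_t i) { return data + i * stride; }
};

// Each thread writes whole rows. Because rows never share a cache line,
// two threads never write to the same line.
void fill_rows(PaddedMatrix& m, float value) {
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(m.rows); ++i) {
        float* r = m.row(static_cast<std::size_t>(i));
        for (std::size_t j = 0; j < m.cols; ++j) r[j] = value;
    }
}
```

Without -fopenmp (or equivalent) the pragma is ignored and the loop runs serially. The padding costs memory for narrow matrices, which is part of why the benefit depends on the access pattern.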
hm, are you saying that you are aligning each row/col?
if yes, I wonder, because rows/cols are not too relevant to matrix products and other blocked algorithms.
Next question: can't cache line size vary even within a particular architecture? Although that probably doesn't matter since I guess it'll always be bigger than any alignment _requirement_ (it'll be at least 64 bytes, right? which is bigger than any SIMD instructions require, right?)
It varies, yes. The NT2 installation detects it and puts it into an NT2_CONFIG_ALIGNMENT macro for the code to use.
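As an illustration of the configure-time approach, here is a hedged sketch of how a detected alignment could be consumed. It does not show how NT2 itself uses NT2_CONFIG_ALIGNMENT. MY_CONFIG_ALIGNMENT, kAlignment, kSimdAlignment and AlignedBlock are hypothetical names, and the 64-byte fallback and the 32-byte SIMD bound are assumptions.

```cpp
#include <cstddef>

// Hypothetical macro: a build system would pass -DMY_CONFIG_ALIGNMENT=<bytes>
// after probing the target machine. 64 is only an assumed fallback.
#ifndef MY_CONFIG_ALIGNMENT
#define MY_CONFIG_ALIGNMENT 64
#endif

constexpr std::size_t kAlignment = MY_CONFIG_ALIGNMENT;

// Strictest SIMD alignment considered in this sketch: 32 bytes (AVX).
constexpr std::size_t kSimdAlignment = 32;

// Reject configurations that could not serve as both a cache-line
// alignment and a SIMD alignment.
static_assert((kAlignment & (kAlignment - 1)) == 0,
              "alignment must be a power of two");
static_assert(kAlignment >= kSimdAlignment,
              "configured alignment must cover the SIMD requirement");

// Any type aligned this way satisfies both constraints at once.
struct alignas(kAlignment) AlignedBlock {
    float values[16];
};
```

The two static_asserts encode the observation from the question above: as long as the configured value is at least as large as the SIMD requirement, a single alignment constant serves both purposes.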
For us too, new SIMD platform == new compilation, but we support the use case that consists of compiling the same code N times for N different SIMD configs, and switching between these N paths at runtime. This requires Eigen data structures to have the same ABI across different SIMD configs (e.g. SSE / no SSE), within a given CPU arch (e.g. x86).
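A minimal sketch of that use case, under stated assumptions: it is not Eigen's code, and Vec4, add_scalar, add_sse, AddFn and select_add are hypothetical names. In a real build each path would live in its own translation unit compiled with different flags; here both are in one file, and the SSE path is only compiled when the compiler defines __SSE__.

```cpp
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// The layout is fixed at 16-byte alignment whether or not SSE is enabled,
// so every code path agrees on the ABI of this type.
struct alignas(16) Vec4 {
    float v[4];
};

// Scalar path: always available.
void add_scalar(const Vec4& a, const Vec4& b, Vec4& out) {
    for (std::size_t i = 0; i < 4; ++i) out.v[i] = a.v[i] + b.v[i];
}

#if defined(__SSE__)
// SSE path: the aligned loads and stores are valid because Vec4 is always
// 16-byte aligned.
void add_sse(const Vec4& a, const Vec4& b, Vec4& out) {
    _mm_store_ps(out.v, _mm_add_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
}
#endif

using AddFn = void (*)(const Vec4&, const Vec4&, Vec4&);

// Hypothetical selector; use_simd stands in for a runtime CPU feature check.
AddFn select_add(bool use_simd) {
#if defined(__SSE__)
    if (use_simd) return add_sse;
#endif
    (void)use_simd;  // unused when no SIMD path was compiled
    return add_scalar;
}
```

Because the alignment of Vec4 does not depend on __SSE__, an object created by code built without SSE can be passed safely to the SSE kernel, which is the property the ABI requirement is meant to guarantee.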
Ah yes then.
...and for the record, the other big reason: we want to be friendly to Linux distros with respect to binary libraries using Eigen. If libfoo uses Eigen, and you're a distro packager for x86, you want to ship only one package for libfoo. You decide whether to build it with SSE, AVX, or nothing, and you're free to do so because the ABI is independent of that choice. App developers/packagers can then link against libfoo regardless of how libfoo was built and regardless of their own SIMD settings. This is basically a requirement in environments, like Linux distros, which rely heavily on system-wide shared libraries (rather than app bundles).
Benoit