Re: [eigen] Parallel matrix multiplication causes heap allocation



> On 19 Dec 2016, at 18:31, Jeff Hammond <jeff.science@xxxxxxxxx> wrote:

> More than just that, OpenMP runtimes are nontrivial beasts to control and any multithreaded performance data that does not include a complete list of compiler and runtime versions, affinity information, complete processor details, and OS+distro version should be viewed with skepticism.
> 
> For example, most OpenMP runtimes do not set affinity by default, and I've seen this reduce performance by ~2x in DGEMM, and once affinity is enabled, breadth- vs depth-first placement makes a large difference in some cases.

For my tests with MKL, I used the TBB-threaded version of MKL, which gives consistent results.

I never managed to get consistent results with OpenMP, even with KMP_AFFINITY set to compact or scatter.
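
As a minimal sketch of how the affinity-related settings discussed above could be recorded next to benchmark numbers, the snippet below collects a few environment variables into a short report. The helper names (env_or_unset, describe_threading_env) are hypothetical and not part of Eigen or MKL, and the list of variables is only an assumption about what is relevant; it does not query the OpenMP or TBB runtime itself, so it only shows what was requested, not what the runtime actually did.

```cpp
#include <cstdlib>
#include <initializer_list>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: value of an environment variable, or "(unset)".
std::string env_or_unset(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("(unset)");
}

// Hypothetical helper: one line per setting that may influence thread
// count or placement. The variable list is an assumption, not exhaustive.
std::string describe_threading_env() {
    std::ostringstream out;
    // May report 0 if the value cannot be determined on this platform.
    out << "hardware threads: " << std::thread::hardware_concurrency() << '\n';
    for (const char* name : {"OMP_NUM_THREADS", "OMP_PROC_BIND", "OMP_PLACES",
                             "KMP_AFFINITY", "MKL_NUM_THREADS"}) {
        out << name << '=' << env_or_unset(name) << '\n';
    }
    return out.str();
}
```

Printing the result of describe_threading_env() at the start of a benchmark run keeps the requested settings together with the timings; compiler, runtime, and OS versions would still have to be noted separately.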


