Re: [eigen] Parallel matrix multiplication causes heap allocation





On Dec 18, 2016, at 9:12 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:



On Sun, Dec 18, 2016 at 7:28 AM, François Fayard <fayard@xxxxxxxxxxxxx> wrote:
Hi Rene,

I recently skimmed through the matrix multiplication code. To be cache friendly, Eigen performs many smaller matrix multiplications, and those smaller matrices are copied and rearranged in memory to speed up the multiplication. So malloc is expected to happen in matrix multiplication.

Yes, I confirm.

As far as I know, other BLAS libraries such as OpenBLAS don't perform such copies. Is there any way to get rid of them in Eigen?

Nope, OpenBLAS does the same, and I believe MKL does too. I'm convinced that for large enough matrices it is impossible to reach such high performance without those copies/repacking.
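
To make the copies/repacking discussed above concrete, here is a minimal sketch of a blocked multiplication that first copies each block of B into a contiguous temporary. It is an illustration only, not Eigen's or OpenBLAS's actual kernel: the function name `gemm_packed`, the row-major layout, and the block sizes `kc` and `nc` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Eigen or any BLAS library).
// Computes C += A * B for row-major matrices:
// A is m x k, B is k x n, C is m x n.
void gemm_packed(const std::vector<double>& A, const std::vector<double>& B,
                 std::vector<double>& C, std::size_t m, std::size_t k,
                 std::size_t n, std::size_t kc = 64, std::size_t nc = 64) {
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    // Temporary packing buffer: this is the kind of allocation the thread is about.
    std::vector<double> packed(kc * nc);
    for (std::size_t j0 = 0; j0 < n; j0 += nc) {
        const std::size_t nb = std::min(nc, n - j0);
        for (std::size_t p0 = 0; p0 < k; p0 += kc) {
            const std::size_t kb = std::min(kc, k - p0);
            // Pack the kb x nb block of B into contiguous memory.
            for (std::size_t p = 0; p < kb; ++p)
                for (std::size_t j = 0; j < nb; ++j)
                    packed[p * nb + j] = B[(p0 + p) * n + (j0 + j)];
            // Multiply the matching column panel of A by the packed block.
            for (std::size_t i = 0; i < m; ++i)
                for (std::size_t p = 0; p < kb; ++p) {
                    const double a = A[i * k + (p0 + p)];
                    for (std::size_t j = 0; j < nb; ++j)
                        C[i * n + (j0 + j)] += a * packed[p * nb + j];
                }
        }
    }
}
```

The packed block is read sequentially in the inner loop regardless of how wide B is, which is the cache benefit the copy buys. Real implementations also pack A and use vectorized micro-kernels, which this sketch leaves out.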


All good GEMM implementations copy into temporaries, but they should not have to dynamically allocate them on every call.

Some BLAS libraries allocate the buffer only once, during the first call. GotoBLAS used to do this. I don't know if OpenBLAS still does this, or in which cases. 

At one time, BLIS (cousin of OpenBLAS) used a static buffer that was part of the data segment, so malloc was not necessary to get the temporary buffers. I think it can now allocate dynamically instead, but only once, during the first call.
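
The "allocate once, during the first call" idea could be sketched roughly as follows. This is not how GotoBLAS, OpenBLAS, BLIS, or Eigen actually implement it; `packing_buffer` is a hypothetical helper, and using a `thread_local` vector is an assumption chosen so that each thread of a parallel multiplication gets its own scratch space.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a real BLAS or Eigen function).
// Returns a per-thread scratch buffer holding at least `count` doubles.
// The heap is touched only when the buffer has to grow, which is
// typically on the first call from each thread.
double* packing_buffer(std::size_t count) {
    thread_local std::vector<double> buffer;
    if (buffer.size() < count) {
        buffer.resize(count);  // the only place an allocation can happen
    }
    return buffer.data();
}
```

Later calls that ask for the same size or less get the same pointer back without calling malloc. A static array in the data segment, as described for older BLIS, avoids even the first allocation at the cost of a fixed maximum size.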

Jeff

gael
 

François



On 18 Dec 2016, at 01:06, Rene Ahlsdorf <ahlsdorf@xxxxxxxxxxxxxxxxxx> wrote:

Dear Eigen team,

First of all, thank you for all your effort to create such a great math library. I really love using it.

I've got a question about your parallelization routines. I want to compute a parallel (OpenMP-based) matrix multiplication (result: 500 x 250 matrix) without allocating any new memory in the process. So I have called "Eigen::internal::set_is_malloc_allowed(false)" to check that nothing goes wrong. However, my program crashes with the error message
"Assertion failed: (is_malloc_allowed() && "heap allocation is forbidden (EIGEN_RUNTIME_NO_MALLOC is defined and g_is_malloc_allowed is false)"), function check_that_malloc_is_allowed, file /Users/xxx//libs/eigen/Eigen/src/Core/util/Memory.h, line 143.". Is this behaviour intended? Should an allocation happen before doing parallel calculations? Or am I doing something wrong?

Thanks in advance.

Regards,
René Ahlsdorf

Eigen Version: 3.3.1 (commit f562a193118d)


Attached: Screenshot showing the last function calls 
<Screenshot 2016-12-18 01.01.42.png>


