Re: [eigen] Multi-threading in array operation



Hi Ghislain. Here are some tips from a neighbor.

MatrixXd A = MatrixXd::Zero(100,100);
MatrixXd B = MatrixXd::Ones(100,100);

Those operations will most likely be bound by memory allocation (the time to do a malloc plus the time for the “first touch” of the array). Most likely, you won’t gain anything from multithreading.

MatrixXd C = A.array() + B.array(); // element-wise addition

Those operations are memory bound. On most computers, you won’t gain anything from multithreading. Suppose that the arrays are so big that they do not fit into the last level of cache (L3). On a laptop:
- Bandwidth: 25 GB/s, so (assuming you are using doubles, and reading both A and B) memory can feed 25 / (2 * 8) ≈ 1.6 billion elements per second.
- Computing power: one core at 2 GHz can issue 2 billion additions per second (1 per cycle).
So, you’ll be bandwidth limited, and multi-threading will be useless. And I did not even count the access to C.
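The estimate above can be written down as a small compile-time check. This is only a sketch of the arithmetic: the figures (25 GB/s, 2 GHz, 1 addition per cycle, 8-byte doubles, two input arrays) are the ones quoted above, while the function and constant names are made up for illustration.

```cpp
// Elements per second that memory can deliver when every output element
// requires reading `arrays_read` inputs of `bytes_per_element` bytes each.
// Writes to C are ignored, as in the estimate above.
constexpr double elements_per_second_from_memory(double bandwidth_bytes_per_s,
                                                 int arrays_read,
                                                 int bytes_per_element) {
    return bandwidth_bytes_per_s / (arrays_read * bytes_per_element);
}

// The loop is bandwidth bound when memory feeds elements more slowly
// than a single core can add them.
constexpr bool is_bandwidth_bound(double memory_rate, double compute_rate) {
    return memory_rate < compute_rate;
}

// Laptop figures from the discussion (assumed, not measured).
constexpr double kLaptopBandwidth = 25e9;  // bytes per second
constexpr double kAddsPerSecond   = 2e9;   // 2 GHz, 1 addition per cycle

constexpr double kMemoryRate =
    elements_per_second_from_memory(kLaptopBandwidth, 2, 8);

static_assert(is_bandwidth_bound(kMemoryRate, kAddsPerSecond),
              "with these figures, one core already outruns memory");
```

With these numbers a single core is already waiting on memory, so adding more threads on the same memory bus should not help much.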

On a dual-socket workstation, multithreading will be useful, but only with careful thread pinning. One thing which can be useful with large arrays is to use streaming stores to C, which write the result without first pulling C into the cache and so save the bandwidth that reading C would otherwise cost.

MatrixXd D = A.array() / B.array(); // element-wise division

Here, you might benefit from multithreading as division is a slow operation.
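As a rough illustration of what parallelizing the division by hand could look like, here is a minimal sketch using an OpenMP loop over plain buffers. It is not how Eigen does it internally; `elementwise_divide` is a hypothetical helper, and `std::vector` stands in for the matrix storage. With Eigen you could run the same kind of loop over `A.data()`, `B.data()` and `D.data()`. Without `-fopenmp` (or your compiler's equivalent) the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns d with d[i] = a[i] / b[i].
// Each thread handles one contiguous block of indices (static schedule).
std::vector<double> elementwise_divide(const std::vector<double>& a,
                                       const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> d(a.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(a.size());

    // Signed loop index for compatibility with older OpenMP versions.
    #pragma omp parallel for schedule(static)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        d[i] = a[i] / b[i];
    }
    return d;
}
```

Whether this actually beats the single-threaded version depends on the array size and the machine, so it is worth timing both before rewriting a lot of code.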

François Fayard
Founder & Consultant - Inside Loop
Applied Mathematics & High Performance Computing
Tel: +33 (0)6 01 44 06 93
Web: www.insideloop.io

On Apr 24, 2017, at 5:54 PM, Ghislain Durif <gd.dev@xxxxxxxxxxxxxxx> wrote:

Hi,

I am an enthusiastic user of Eigen and I have a question regarding multi-threading support in array operations.

The dedicated webpage (https://eigen.tuxfamily.org/dox/TopicMultiThreading.html) states that, currently, the following algorithms can make use of multi-threading:

   general dense matrix - matrix products
   PartialPivLU
   row-major-sparse * dense vector/matrix products
   ConjugateGradient with Lower|Upper as the UpLo template parameter.
   BiCGSTAB with a row-major sparse matrix format.
   LeastSquaresConjugateGradient

Are there any plans to extend multi-threading support to array operations in Eigen? Basically, matrix element-wise operations like:



It would be great if these were parallelized. I have a lot of these element-wise operations in my code, and it would be cumbersome to rewrite all of them with OpenMP.

If it is not planned, I will probably work on implementing such a feature in the near future. I just don't want to spend too much time on it if the work is already done or nearly done.

Thanks in advance,
Best regards,

Ghislain

--------------------------
Research engineer THOTH TEAM
INRIA Grenoble Alpes (France)


