Re: [eigen] Parallelization of SparseMatrix * dense_matrix

Dear Rasmus,

Am 13.09.21 um 17:52 schrieb Rasmus Munk Larsen:
> Hi Peter,
> 
> I would recommend that you hold off on this change for a bit. We are planning on moving the non-blocking threadpool from unsupported into core in the near future and use that as the basis for multi-threading without depending on
> OpenMP.

Thanks for the hint.
In that case I'll pause my experiments with the parallelization at this point.

I also agree with Edward that some control over the parallelization could be helpful.
In our application, at least, this part will already be executed from within parallel threads.
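
For what it's worth, Eigen's own OpenMP-based products can already be throttled globally, which is how I would currently keep the nested parallelism under control; a minimal sketch (the value 1 is only illustrative):

    #include <Eigen/Core>

    int main()
    {
      Eigen::initParallel();   // once, before any application threads are spawned
      Eigen::setNbThreads(1);  // keep Eigen's internal OpenMP loops single-threaded
      // ... application-level parallel work that calls Eigen goes here ...
      return 0;
    }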

Still, I'm puzzled why my for_each loop doesn't get parallelized,
but I can set that problem aside. It's not that urgent now, as I was just performing
a feasibility study.
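
For reference, this is the kind of self-contained check (independent of Eigen) I would use to see whether std::for_each with std::execution::par is parallelized at all on a given toolchain; a sketch only, and with GCC/libstdc++ I assume it needs to be linked against TBB (e.g. -ltbb):

    #include <algorithm>
    #include <execution>
    #include <iostream>
    #include <mutex>
    #include <numeric>
    #include <set>
    #include <thread>
    #include <vector>

    int main()
    {
      std::vector<long> idx(1 << 20);
      std::iota(idx.begin(), idx.end(), 0L);

      std::set<std::thread::id> ids;  // which worker threads the lambda ran on
      std::mutex m;

      std::for_each(std::execution::par, idx.begin(), idx.end(), [&](long)
      {
        std::lock_guard<std::mutex> lock(m);
        ids.insert(std::this_thread::get_id());
      });

      std::cout << "distinct threads used: " << ids.size() << "\n";
      return 0;
    }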

Best regards,
Peter


> 
> Rasmus
> 
> On Mon, Sep 13, 2021 at 5:21 AM Peter <list@xxxxxxxxxxxxxxxxx <mailto:list@xxxxxxxxxxxxxxxxx>> wrote:
> 
>     Dear all,
> 
>     I'm currently playing with the sparse matrix implementation within Eigen,
>     namely Eigen::SparseMatrix<double, Eigen::ColMajor> combined with a self-adjoint view.
> 
>     In my application I need to multiply a symmetric sparse matrix by a dense matrix,
>     where the dimensions of the matrices are up to a few tens of thousands.
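> 
>     A minimal, self-contained sketch of the operation in question (dimensions and names are only illustrative):
> 
>       #include <Eigen/Dense>
>       #include <Eigen/Sparse>
> 
>       int main()
>       {
>         const Eigen::Index n = 20000, m = 64;
> 
>         Eigen::SparseMatrix<double, Eigen::ColMajor> A(n, n);
>         A.setIdentity();  // placeholder for the real symmetric matrix (only one triangle is needed)
> 
>         Eigen::MatrixXd B = Eigen::MatrixXd::Random(n, m);
> 
>         // the product that should end up in sparse_selfadjoint_time_dense_product()
>         Eigen::MatrixXd C = A.selfadjointView<Eigen::Lower>() * B;
>         return 0;
>       }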
> 
>     I tried to parallelize the dot product in SparseSelfAdjointView.h by copying an OpenMP directive
>     from other parts of Eigen, and I also tried to parallelize the outer loop directly
>     with std::execution::par; see below.
> 
>     In summary, I do not see any effect from the parallelization.
>     Before digging deeper and building a minimal working example:
>     has anyone already achieved this?
> 
>     Could one actually call the corresponding MKL routine directly, or are the internal storage schemes different?
> 
>     Best regards
>     Peter
> 
>     P.S.: This is what I tried, as a diff against the current master branch:
> 
>     :~/Eigen/eigen/Eigen/src/SparseCore$ git diff SparseSelfAdjointView.h
>     diff --git a/Eigen/src/SparseCore/SparseSelfAdjointView.h b/Eigen/src/SparseCore/SparseSelfAdjointView.h
>     index 0302ef3a4..91e7d495b 100644
>     --- a/Eigen/src/SparseCore/SparseSelfAdjointView.h
>     +++ b/Eigen/src/SparseCore/SparseSelfAdjointView.h
>     @@ -10,6 +10,11 @@
>      #ifndef EIGEN_SPARSE_SELFADJOINTVIEW_H
>      #define EIGEN_SPARSE_SELFADJOINTVIEW_H
> 
>     +#include <iostream> /// only for debugging
>     +#include <boost/range/irange.hpp>
>     +
>     +#include <execution>
>     +
>      #include "./InternalHeaderCheck.h"
> 
>      namespace Eigen {
>     @@ -295,8 +300,19 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>        SparseLhsTypeNested lhs_nested(lhs);
>        LhsEval lhsEval(lhs_nested);
> 
>     +  Eigen::initParallel();
>     +  Index threads = Eigen::nbThreads();
>     +
>     +  std::cout << "\ndot threads: "<< threads << " rhs-cols: " << rhs.cols() << std::endl;
>     +
>     +  // #pragma omp parallel for
>     +  // #pragma omp parallel for schedule(dynamic,(rhs.cols()+threads*4-1)/(threads*4)) num_threads(threads)
>     +  // for (Index k=0; k<rhs.cols(); ++k)
>     +
>     +  auto r = boost::irange(rhs.cols());
>     +
>     +  std::for_each(std::execution::par, r.begin(), r.end(),  [&](Index k)
>        // work on one column at once
>     -  for (Index k=0; k<rhs.cols(); ++k)
>        {
>          for (Index j=0; j<lhs.outerSize(); ++j)
>          {
>     @@ -330,6 +346,7 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>              res.coeffRef(j,k) += alpha * i.value() * rhs.coeff(j,k);
>          }
>        }
>     +  );
>      }
> 
> 
> 


