Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen
From: Edward Lam <edward@xxxxxxxxxx>
Date: Thu, 5 Apr 2018 09:31:57 -0400
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sidefx.com; s=google; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=JrxCuO3z7QEQ4hStGygOXjzvMZwwUmtjVVcxwvGJAXk=; b=IZQjgEelhI3PGWPGkjZsaNgJQdu3IU/1E33GvyUBwMgDmt+CbA7H7MvzgbBmsCFNUg BW9ZIddFeQiQ3qmLOXWTN3dGI6On1kV05B6nSaySo/0mW0LXZFr7WDLAmXzZx8t8KKw8 nwfYztRDuwodG7H4WIlOvmPRruR5/ZcEJ7/d2AR8db6nO48xnGkuT0UTDMl+UmCfy3wd +O93+R0VAJK55YEicNGNNAIJU+oyF2XLhIMnnbZx+78tQvjLMvUbqVVNUDnNgSIFZoDG jQYd6MMdld4i5fuCn3YpOqbDupcwUZDiJ6vaaFuIufhJk4oxlWkHtMV2pCReqDG9Pckd f/XA==

Would it be useful to incorporate lambda's into the interface to avoid begin/endpairs? So from the user side, we would write code like this instead:


1) Analyze and run

    SimplicialLDLT<MklSparseMatrix<double>> llt(A);
    int it = 0;
    for (int it = 0; ...; ++it) {
        if (it == 0)
            llt.matrixL().sparseAnalyzeAndRun(100, [&] { llt.solve(b); });
        else
            llt.solve(b);
        // ...
    }

2) Analyze only

    SimplicialLDLT<MklSparseMatrix<double>> llt(A);
    llt.matrixL().sparseAnalyze(100, [&] { llt.solve(b); });
    // and solve as usual

For more complicated algorithms, one can always outline the lambda and pass itinto the analysis.


Cheers,
-Edward

On 4/5/2018 8:01 AM, Gael Guennebaud wrote:

Thank you for opening this discussion on the public mailing list.
So let's discuss about the public API, which currently is not very convenient asalready noticed by others. Issues are:
(i1) - Storing MKL's handle in SparseMatrix breaks ABI and does not sounds verygeneric.
      - We need a way to control:
(i2)   - which operations are going to be analyzed/optimized,
(i3)   - and specify the 'expected_calls' parameter.
In order to discuss these issues, let's consider the following typical pattern:(e.g., non-linear optimization, eigenvalues, ...)
SimplicialLDLT<SparseMatrix<double> > llt(A);

while(...) {
     ...
x = llt.solve(b);
...
}
Here the triangular L factor is going to be used for triangular andtransposed-triangular solves dozens to hundreds of time but only the user ofSimplicialLDLT knowns that, not SimplicialLDLT, nor SparseMatrix. Moreover, theuser does not own the SparseMatrix that we want to analyze/optimize for. Otherpatterns are likely easier to handle, so let's focus on it for now.
Regarding (i1), I would suggest to introduce a new type, say MklSparseMatrix<>that would enhance SparseMatrix<> through inheritance. Then for (i2) and (i3) wecould imagine something like:
MklSparseMatrix::beginAnalysis(Index expected_calls) const {
// turn *this to compressed mode
// create handle
// store expected_calls
// enable recording mode
}
MklSparseMatrix::endAnalysis() const {
// disable recording mode
// [optional] call mkl_sparse_optimize
}

All states in MklSparseMatrix would be mutable.
Between a pair of beginAnalysis/endAnalysis each call to a supported operationwould trigger calls to mkl_sparse_set_*_hint()/mkl_sparse_optimize.Optionally, we could even add a "dryrun" mode for which no operation would beperformed, only calls to mkl_sparse_set_*_hint() and then mkl_sparse_optimizewould be called in endAnalysis(). This way mkl_sparse_optimize() would be calledonly once.
And that's it. Our example would look-like:


SimplicialLDLT<MklSparseMatrix<double> > llt(A);
int it=0;
while(...) {
     ...
     if(it==0) llt.matrixL().beginAnalysis(100);
x = llt.solve(b);
if(it==0) llt.matrixL().endAnalysis();
...
++it;
}

or using a "dry-run" mode:

SimplicialLDLT<MklSparseMatrix<double> > llt(A);

llt.matrixL().beginAnalysis(100, DryRun);
x = llt.solve(b); // permutation and division by the diagonal matrix D wouldstill be performed, but calls to actual triangular solves would be by-passed
llt.matrixL().endAnalysis();

while(...) {
     ...
     x = llt.solve(b);
...
}
If someone directly deal with the factor L, then we could follow the samepattern or copy the SparseMatrix factor L to a MklSparseMatrix:
SimplicialLLT<SparseMatrix<double> > llt(A);

MklSparseMatrix L(llt.matrixL());
L.beginAnalysis(100,DryRun);
y = L.triangularView<Lower>() * x;
L.endAnalysis();
while(...) {
     ...
y = L.triangularView<Lower>() * x;
...
}
This design in quite general and expendable to any sparse-optimizers, evenbuilt-in ones in the future.
In contrast to the current proposal, only selected operations would be passed toMKL (need to use a MklSparseMatrix + begin/end recording phase).
What do you think?


gael
On Tue, Apr 3, 2018 at 11:39 PM, Zhukova, Maria <maria.zhukova@xxxxxxxxx<mailto:maria.zhukova@xxxxxxxxx>> wrote:
    Hello Eigen community,

    My name is Maria Zhukova and I’m a software development engineer at Intel ®
    MKL Sparse team.

    My team is interested in contributing into Eigen, so I’ve investigated our
    possibilities and so far this is what I have:
    Eigen support different operations for sparse matrices stored in CSR and CSC
    format which can be implemented on a basis of IE SpBLAS kernels (please,
    refer to
    https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines
    <https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines>
    for the general idea of interfaces)
    , basically we want to implement calls to our IE SpBLAS into next
    operations:____

                     SparseMatrix + SparseMatrix (mkl_sparse_?_add)
                     SparseMatrix * DenseVector  (mkl_sparse_?_mv)____

                     SparseMatrix * DenseMatrix   (mkl_sparse_?_mm)____

                     SparseMatrix * SparseMatrix  (mkl_sparse_spmm),
    and Triangular solve (mkl_sparse_?_trsv).____

    I’ve already started with implementation of sparse_time_dense_impl_mkl
    kernel which is based on mkl_sparse_?_mv (included in patch).____

    This is how it will look like for user:
    *#include <Eigen/SpBLASSupport> *<-- *NEW:* IE SpBLAS include module ____

    void main () {
       SparseMatrix<double, RowMajor> A;
      Matrix<double, Dynamic, 1> x, y;

       A.makeCompressed(); /* Convert matrix A into CSR/CSC format */
    *A.createSparseHandle();*/* *NEW*: is used to create handle required for all
    IE SpBLAS routines */____

    // support of IE SpBLAS is here
    y = beta*y + alpha*A*x; /* call to mkl_sparse_?_mv with operation =
    SPARSE_OPERATION_NON_TRANSPOSE */
    y = beta*y + alpha*A.transpose()*x; /* call to mkl_sparse_?_mv with
    operation = SPARSE_OPERATION_TRANSPOSE */
    y = beta*y + alpha*A.adjoint()*x; /* call to mkl_sparse_?_mv with operation
    = SPARSE_OPERATION_CONJUGATE_TRANSPOSE */____

    *A.destroySparseHandle();* /* *NEW*: is used to delete created handle */
    }____

    __ __

    I’ve attached a draft patch including all necessary changes and would like
    to hear your feedback.
    Please, let me know if you have any questions and comments.____

    __ __

    Best regards,
    Maria____

    __ __

    __ __

    __ __

Follow-Ups:
- Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen
  - From: Christoph Hertzberg

References:
- [eigen] Intel (R) MKL IE SpBLAS support in Eigen
  - From: Zhukova, Maria
- Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen
  - From: Gael Guennebaud

Messages sorted by: [ date | thread ]
Prev by Date: Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen
Next by Date: [eigen] Bug in compilation with latest cmake / no gcc
Previous by thread: Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen
Next by thread: Re: [eigen] Intel (R) MKL IE SpBLAS support in Eigen

Mail converted by MHonArc 2.6.19+

http://listengine.tuxfamily.org/