Thank you for opening this discussion on the public mailing list.
So let's discuss about the public API, which currently is not very
convenient as already noticed by others. Issues are:
(i1) - Storing MKL's handle in SparseMatrix breaks ABI and does not
sounds very generic.
- We need a way to control:
(i2) - which operations are going to be analyzed/optimized,
(i3) - and specify the 'expected_calls' parameter.
In order to discuss these issues, let's consider the following typical
pattern: (e.g., non-linear optimization, eigenvalues, ...)
SimplicialLDLT<SparseMatrix<double> > llt(A);
while(...) {
...
x = llt.solve(b);
...
}
Here the triangular L factor is going to be used for triangular and
transposed-triangular solves dozens to hundreds of time but only the
user of SimplicialLDLT knowns that, not SimplicialLDLT, nor
SparseMatrix. Moreover, the user does not own the SparseMatrix that we
want to analyze/optimize for. Other patterns are likely easier to
handle, so let's focus on it for now.
Regarding (i1), I would suggest to introduce a new type, say
MklSparseMatrix<> that would enhance SparseMatrix<> through
inheritance. Then for (i2) and (i3) we could imagine something like:
MklSparseMatrix::beginAnalysis(Index expected_calls) const {
// turn *this to compressed mode
// create handle
// store expected_calls
// enable recording mode
}
MklSparseMatrix::endAnalysis() const {
// disable recording mode
// [optional] call mkl_sparse_optimize
}
All states in MklSparseMatrix would be mutable.
Between a pair of beginAnalysis/endAnalysis each call to a supported
operation would trigger calls to
mkl_sparse_set_*_hint()/mkl_sparse_optimize.
Optionally, we could even add a "dryrun" mode for which no operation
would be performed, only calls to mkl_sparse_set_*_hint() and then
mkl_sparse_optimize would be called in endAnalysis(). This way
mkl_sparse_optimize() would be called only once.
And that's it. Our example would look-like:
SimplicialLDLT<MklSparseMatrix<double> > llt(A);
int it=0;
while(...) {
...
if(it==0) llt.matrixL().beginAnalysis(100);
x = llt.solve(b);
if(it==0) llt.matrixL().endAnalysis();
...
++it;
}
or using a "dry-run" mode:
SimplicialLDLT<MklSparseMatrix<double> > llt(A);
llt.matrixL().beginAnalysis(100, DryRun);
x = llt.solve(b); // permutation and division by the diagonal matrix D
would still be performed, but calls to actual triangular solves would
be by-passed
llt.matrixL().endAnalysis();
while(...) {
...
x = llt.solve(b);
...
}
If someone directly deal with the factor L, then we could follow the
same pattern or copy the SparseMatrix factor L to a MklSparseMatrix:
SimplicialLLT<SparseMatrix<double> > llt(A);
MklSparseMatrix L(llt.matrixL());
L.beginAnalysis(100,DryRun);
y = L.triangularView<Lower>() * x;
L.endAnalysis();
while(...) {
...
y = L.triangularView<Lower>() * x;
...
}
This design in quite general and expendable to any sparse-optimizers,
even built-in ones in the future.
In contrast to the current proposal, only selected operations would be
passed to MKL (need to use a MklSparseMatrix + begin/end recording
phase).
What do you think?
gael
On Tue, Apr 3, 2018 at 11:39 PM, Zhukova, Maria
<maria.zhukova@xxxxxxxxx <mailto:maria.zhukova@xxxxxxxxx>> wrote:
Hello Eigen community,
My name is Maria Zhukova and I’m a software development engineer
at Intel ®
MKL Sparse team.
My team is interested in contributing into Eigen, so I’ve
investigated our
possibilities and so far this is what I have:
Eigen support different operations for sparse matrices stored in
CSR and CSC
format which can be implemented on a basis of IE SpBLAS kernels
(please,
refer to
https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines
<https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines>
for the general idea of interfaces)
, basically we want to implement calls to our IE SpBLAS into next
operations:____
SparseMatrix + SparseMatrix (mkl_sparse_?_add)
SparseMatrix * DenseVector (mkl_sparse_?_mv)____
SparseMatrix * DenseMatrix (mkl_sparse_?_mm)____
SparseMatrix * SparseMatrix (mkl_sparse_spmm),
and Triangular solve (mkl_sparse_?_trsv).____
I’ve already started with implementation of
sparse_time_dense_impl_mkl
kernel which is based on mkl_sparse_?_mv (included in patch).____
This is how it will look like for user:
*#include <Eigen/SpBLASSupport> *<-- *NEW:* IE SpBLAS include
module ____
void main () {
SparseMatrix<double, RowMajor> A;
Matrix<double, Dynamic, 1> x, y;
A.makeCompressed(); /* Convert matrix A into CSR/CSC format */
*A.createSparseHandle();*/* *NEW*: is used to create handle
required for all
IE SpBLAS routines */____
// support of IE SpBLAS is here
y = beta*y + alpha*A*x; /* call to mkl_sparse_?_mv with operation =
SPARSE_OPERATION_NON_TRANSPOSE */
y = beta*y + alpha*A.transpose()*x; /* call to mkl_sparse_?_mv with
operation = SPARSE_OPERATION_TRANSPOSE */
y = beta*y + alpha*A.adjoint()*x; /* call to mkl_sparse_?_mv with
operation
= SPARSE_OPERATION_CONJUGATE_TRANSPOSE */____
*A.destroySparseHandle();* /* *NEW*: is used to delete created
handle */
}____
__ __
I’ve attached a draft patch including all necessary changes and
would like
to hear your feedback.
Please, let me know if you have any questions and comments.____
__ __
Best regards,
Maria____
__ __
__ __
__ __