RE: [eigen] Developer Contribution Style and Technique



Hi Gael,

In terms of OpenMP support for sparse products, I was thinking of just what you referred to: splitting up the data ahead of time for repeated products. The way I envisioned it working was to generate a proxy/container matrix of sorts, which would contain several sparse submatrices, so that products could be split into one thread per submatrix. I'd want to do this for SparseVectors as well. One way to do this, which I think would still allow both the full matrix and the submatrices to take part in Eigen expressions, would be to first build the big matrix as a normal SparseMatrix, and then create sparse submatrices whose data are contiguous sections of the full matrix's data, rather than giving each submatrix its own independent data. Essentially, instead of using Eigen blocks as submatrix proxies, we could use true submatrix objects with pre-determined memory. In expressions containing the big matrix, the big matrix can split the operations up between the submatrices, and the submatrices would still be able to participate as full actors in expressions.

To define the submatrices, I think one would only need to add a constructor which takes a big matrix and a set of submatrix boundaries as arguments, and which is handed a block of existing memory rather than allocating its own. The big matrix seems more complex, as expressions involving it would have to coordinate the distribution of work and data to the submatrices, as well as the creation and partitioning of any resulting objects, but this should be doable within the framework of (OpenMP-parallel) loops. That doesn't seem too bad either, though you know the inner workings of all this better than I do. Perhaps this could be implemented just using Eigen plug-ins, or am I being naive?
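
A minimal sketch of the shared-storage idea above, written in plain C++ rather than against Eigen's internals. Everything here is hypothetical and for illustration only: `CsrMatrix`, `CsrRowBlock`, `splitRows` and `parallelProduct` are not Eigen types or functions, and the compressed-row layout is an assumption. Each submatrix is a non-owning view of a contiguous row range of the big matrix, and the product is split into one loop iteration per submatrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compressed-row storage standing in for the "big" sparse matrix.
struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<std::size_t> rowStart;  // size rows + 1
    std::vector<std::size_t> colIndex;  // size nnz
    std::vector<double> values;         // size nnz
};

// Non-owning view of rows [rowBegin, rowEnd) of a parent matrix.
// It reuses the parent's arrays instead of holding its own data.
struct CsrRowBlock {
    const CsrMatrix* parent;
    std::size_t rowBegin, rowEnd;

    // Computes y[r] = (A * x)[r] for the rows covered by this block only.
    void multiply(const std::vector<double>& x, std::vector<double>& y) const {
        for (std::size_t r = rowBegin; r < rowEnd; ++r) {
            double sum = 0.0;
            for (std::size_t k = parent->rowStart[r]; k < parent->rowStart[r + 1]; ++k)
                sum += parent->values[k] * x[parent->colIndex[k]];
            y[r] = sum;
        }
    }
};

// Builds one view per pair of consecutive boundaries.
// Assumes boundaries are ascending, start at 0 and end at a.rows.
std::vector<CsrRowBlock> splitRows(const CsrMatrix& a,
                                   const std::vector<std::size_t>& boundaries) {
    std::vector<CsrRowBlock> blocks;
    for (std::size_t i = 0; i + 1 < boundaries.size(); ++i)
        blocks.push_back({&a, boundaries[i], boundaries[i + 1]});
    return blocks;
}

// The "big matrix" product: one loop iteration per submatrix.
// Blocks write disjoint ranges of y, so no synchronization is needed.
// Without OpenMP enabled the pragma is ignored and the loop runs serially.
std::vector<double> parallelProduct(const std::vector<CsrRowBlock>& blocks,
                                    const std::vector<double>& x,
                                    std::size_t rows) {
    std::vector<double> y(rows, 0.0);
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < static_cast<long>(blocks.size()); ++b)
        blocks[static_cast<std::size_t>(b)].multiply(x, y);
    return y;
}
```

Splitting by rows keeps the output ranges disjoint, which is why each block can run on its own thread without locking. A column-wise split would instead need a reduction over partial results.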

For the external linear solver support, I have a few in mind: MUMPS and the main Harwell Subroutine Library (HSL) solvers with premade C-language support (MA27, MA57, MA86). I think they'd be good additions because they're open-source, free for non-commercial use, quite robust, and easy to build. They're also very commonly used as linear solver back-ends in optimization packages, e.g. IPOPT, and could easily be used as back-end solvers for any optimization modules developed by Eigen contributors. Furthermore, I've dealt with their developers repeatedly in the past, especially the HSL people at Cambridge, and they're very helpful and friendly, which makes for much easier collaboration. I'm quite sure they'd love to help a powerful open-source C++ linear algebra library like Eigen to support their solvers (which are currently mostly used with C and especially Fortran).

Those solvers are all direct solvers though, and thus there is an inherent (memory-based) limit on the size of matrix they can support, given their reliance on explicit factorization methods (this is the main weakness of CHOLMOD as well). As such, I would also be inclined to contribute support for some iterative (Krylov) solver libraries, as Krylov methods don't require explicit construction of the matrix in question, but only the ability to evaluate matrix-vector products. I'd contribute support for external solvers, rather than use the pre-existing solvers in Eigen, because the external libraries also offer a broad range of preconditioners, which are essential for any effective Krylov method and are often much harder to write than the solver itself. Some libraries I would consider trying include Hypre, ITSOL/pARMS, Lis, and HIPS. Hypre and ITSOL/pARMS in particular seem a good start, as they explicitly separate preconditioning from iterative solution, so adding support for them would give Eigen users a large library of preconditioners to play with even if they didn't want to use those solvers. I'd probably start with ITSOL/pARMS support, as it is much easier to install than Hypre. I do focus on easy installation, as the end-users of my bioinformatics packages are often biologists who are not especially computer-savvy, and thus are disinclined to install something complex like PETSc. The ITSOL/pARMS interface is also simpler than Hypre's, and like the HSL group, the authors of ITSOL/pARMS (Saad's group in Minnesota) are very friendly.
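
To make the point about Krylov methods concrete, here is a minimal sketch of a matrix-free conjugate gradient solver in plain C++. It is not the interface of any of the libraries named above: `LinearOperator`, `conjugateGradient` and `applyLaplacian1D` are hypothetical names, the operator is assumed to be symmetric positive definite, and preconditioning is deliberately left out to keep the example short. The solver only ever calls the operator on a vector and never sees a stored matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
// Hypothetical operator type: computes y = A * x without exposing A itself.
using LinearOperator = std::function<void(const Vec& x, Vec& y)>;

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Unpreconditioned conjugate gradient for a symmetric positive definite
// operator. x holds the initial guess on entry and the approximate solution
// on exit. Stops when ||r|| <= tol * ||b|| or after maxIter iterations.
// Returns the number of iterations performed.
std::size_t conjugateGradient(const LinearOperator& applyA, const Vec& b, Vec& x,
                              double tol, std::size_t maxIter) {
    const std::size_t n = b.size();
    Vec r(n), p(n), Ap(n);
    applyA(x, Ap);
    for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ap[i];
    p = r;
    double rr = dot(r, r);
    const double stop = tol * tol * dot(b, b);
    std::size_t it = 0;
    while (it < maxIter && rr > stop) {
        applyA(p, Ap);  // the only place the "matrix" is used
        const double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rrNew = dot(r, r);
        const double beta = rrNew / rr;
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rrNew;
        ++it;
    }
    return it;
}

// Example operator: the 1D Laplacian tridiag(-1, 2, -1), applied on the fly.
void applyLaplacian1D(const Vec& x, Vec& y) {
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        double v = 2.0 * x[i];
        if (i > 0) v -= x[i - 1];
        if (i + 1 < n) v -= x[i + 1];
        y[i] = v;
    }
}
```

Calling `conjugateGradient(applyLaplacian1D, b, x, 1e-10, 200)` with a zero initial `x` solves the system without the matrix ever being assembled. A preconditioner would enter as a second operator applied to the residual in each iteration.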

Of course, if the Eigen community has a preference as to which solvers I implement first, I'll happily focus on those initially, but in the end I'm going to implement most of those I've listed above, simply because I need them for my own projects using Eigen. I do a lot of statistical modelling and machine learning in my research, which depend on powerful large-scale non-linear optimization methods, and thus large-scale linear solvers. While many solvers are capable of handling systems of linear equations derived from linear algebraic problems, when using them for KKT-like equations from non-linear optimization, experience has shown that there is great (unpredictable) variability in their efficiency, and as such, anyone doing non-linear optimization should have a selection of linear solvers on hand.

Have a good Thursday,

-- Chris

________________________________________
From: Gael Guennebaud [gael.guennebaud@xxxxxxxxx]
Sent: Thursday, September 20, 2012 12:29 AM
To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Developer Contribution Style and Technique

Hi,

this page links to several resources for developers:

http://eigen.tuxfamily.org/index.php?title=Category:Developer

In particular, there are a few coding rules there:

http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Eigen_hacking

Eigen's repository is not open for writing, so the best is to open an
entry in our bug tracker for each feature you start working on, and
upload patches there that we can review and push once they are ready.

This page will guide you for producing the patches using mercurial:

http://eigen.tuxfamily.org/index.php?title=Mercurial.


Regarding openmp support for sparse products, I have some local
changes doing so. However, on modern computers with NUMA, to achieve
good speedup, the data should ideally be split and copied into the
right memory location prior to the matrix products. This of course
only makes sense when a given matrix is used several times, so
typically for iterative solvers. It is not clear to me yet how to
design such a behavior though.
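
One possible shape for this, as a rough sketch only and not the local changes mentioned above: copy each row block into its own storage inside a parallel loop, then reuse those copies for every later product. The sketch assumes a first-touch page placement policy (the default on Linux), threads pinned to cores, and a static schedule that typically maps the same block to the same thread in both loops. `LocalRowBlock`, `distributeRows` and `multiplyBlocks` are hypothetical names, not Eigen API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owning copy of a contiguous row range, stored in
// compressed-row form with offsets local to the block.
struct LocalRowBlock {
    std::size_t rowBegin = 0;
    std::vector<std::size_t> rowStart;  // size = rows in block + 1
    std::vector<std::size_t> colIndex;
    std::vector<double> values;
};

// Copies the rows between consecutive boundaries into per-block storage.
// Each block is allocated and filled inside the parallel loop, so under
// first-touch placement its pages tend to land near the filling thread.
// Assumes boundaries has at least two ascending entries within rowStart.
std::vector<LocalRowBlock> distributeRows(const std::vector<std::size_t>& rowStart,
                                          const std::vector<std::size_t>& colIndex,
                                          const std::vector<double>& values,
                                          const std::vector<std::size_t>& boundaries) {
    std::vector<LocalRowBlock> blocks(boundaries.size() - 1);
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < static_cast<long>(blocks.size()); ++b) {
        const std::size_t r0 = boundaries[static_cast<std::size_t>(b)];
        const std::size_t r1 = boundaries[static_cast<std::size_t>(b) + 1];
        const std::size_t k0 = rowStart[r0], k1 = rowStart[r1];
        LocalRowBlock& blk = blocks[static_cast<std::size_t>(b)];
        blk.rowBegin = r0;
        blk.rowStart.resize(r1 - r0 + 1);
        for (std::size_t r = r0; r <= r1; ++r) blk.rowStart[r - r0] = rowStart[r] - k0;
        blk.colIndex.assign(colIndex.begin() + static_cast<std::ptrdiff_t>(k0),
                            colIndex.begin() + static_cast<std::ptrdiff_t>(k1));
        blk.values.assign(values.begin() + static_cast<std::ptrdiff_t>(k0),
                          values.begin() + static_cast<std::ptrdiff_t>(k1));
    }
    return blocks;
}

// y = A * x using the distributed copies; intended to be called many times,
// which is what amortizes the one-off cost of the copy.
void multiplyBlocks(const std::vector<LocalRowBlock>& blocks,
                    const std::vector<double>& x, std::vector<double>& y) {
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < static_cast<long>(blocks.size()); ++b) {
        const LocalRowBlock& blk = blocks[static_cast<std::size_t>(b)];
        const std::size_t rows = blk.rowStart.size() - 1;
        for (std::size_t r = 0; r < rows; ++r) {
            double sum = 0.0;
            for (std::size_t k = blk.rowStart[r]; k < blk.rowStart[r + 1]; ++k)
                sum += blk.values[k] * x[blk.colIndex[k]];
            y[blk.rowBegin + r] = sum;
        }
    }
}
```

The copy doubles the memory used by the matrix, so it pays off only when the same matrix is multiplied many times, as in an iterative solver.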


Regarding "a broader range of interfaces to external linear solver
packages", could you be more specific about the libs you have in mind?

cheers,
Gael



On Wed, Sep 19, 2012 at 9:21 PM, Cowing-Zitron, Christopher
<ccowingzitron@xxxxxxxx> wrote:
> Hello,
>
> My name is Chris Cowing-Zitron. I work in bioinformatics at the University
> of California, San Diego. I'd like to contribute to the development of
> Eigen; a few of my ideas are adding openmp support for sparse products,
> adding a broader range of interfaces to external linear solver packages, and
> possibly adding basic data transfer support from Fortran via the map class.
> I do have two concerns though: Eigen uses such powerful C++ metaprogramming
> methods that I'm worried I might inadvertently mess something up (though of
> course I wouldn't submit any code until I was sure it compiled correctly),
> and also I don't want anyone to have to clean up my code to fit Eigen design
> and style standards. Is there any resource that I've missed that describes
> the standards for submission to Eigen? Or perhaps an archived email from the
> lists that covers this question? Thanks for your help, and thanks for Eigen.
>
> -- Chris




