Re: [eigen] Including Spectra within Eigen 3.4

On Thu, Feb 23, 2017 at 12:40 AM, Bill Greene <w.h.greene@xxxxxxxxx> wrote:

At the risk of complicating the discussion or asking something naive,
let me step back and ask a question about the currrent, dense Eigen eigensolver
API. Why are there separate classes for standard and generalized problems?
Couldn't that decision be made at run time based on having a constructor
(or compute function) with either one or two matrix arguments? This
is, of course,
the MATLAB approach with the eig function.
If so, it seems like that would simply the API somewhat.

This is because the API are different:

- alphas(), betas() for the generalized case

- pseudoEigen*() for the generic-real case.

Also the generic case could be extended with matrix operators (e.g. power, product, etc.) whereas those does not make sense in the generalized case. Moreover, the storage requirement are not the same, a few pointers in the dynamic case, but waste of memory in the fixed-size case.

What's more annoying IMO is that we have two versions of the generic case one for real only and one for both real and complexes... It also seems that many users do not realize that SelfAdjoint is a synonym of symmetric for real matrices and they thus miss the SelfAdjointEigenSolver.

Regarding the "smallest"/"largest" option, isn't this also something that can be
a runtime option rather than a template parameter? That is the way
that option is
handled in ArpackGeneralizedSelfAdjointEigenSolver.

This is mostly a matter of compilation time. ARPACK is a precompiled binary library, so all paths are pre-compiled only once. For a header only library you probably want to compile only the paths that ou are using. For instance, following my proposal, if you adjust it to pass the selection rule at runtime:

SymEigsSolver<SparseMatrix<double> > eig;

eig.setSelectionRule(Largest,100);

eig.compute(A);

then you'll have to compile both the regular and shift-invert modes using the SparseSymMatProd and SparseGenRealShiftSolve operator whereas you only need the regular+ SparseSymMatProd path. Compiling a sparse linear solver for nothing is a huge waste of compilation time and increases your binary. This could still be possible if we can tolerate this drawback for "standard" users, and power users that care about these details and have more knowledge on the underlying algorithms could specify the second (and optionally the third) optional template arguments:

SymEigsSolver<SparseMatrix<double>, RegularMode> eig;

then we can deduce that using SparseSymMatProd is a good default, and only instantiate one path.

As this does not simplify our task, the main question is whether there are use cases for which the selection rule is not known at compile-time?

gael

I'm assuming that if an option can be deferred to runtime without degrading
performance, it will likely simplify the API.

Bill Greene

On Wed, Feb 22, 2017 at 1:37 PM, Rasmus Larsen <rmlarsen@xxxxxxxxxx> wrote:
> This sounds great!! I've been meaning to contribute a rewrite of my own
> PROPACK library that can be viewed as an optimized version of ARPACK for the
> sparse SVD problem. The two would complement each other nicely, and make
> Eigen pretty powerful for sparse spectral problems. One of these days, I'll
> actually get around to it...
>
> On Wed, Feb 22, 2017 at 10:31 AM, Gael Guennebaud
> <gael.guennebaud@xxxxxxxxxx> wrote:
>>
>>
>> Hi all,
>>
>> the last major item I'd like to include for the next 3.4 release is
>> Spectra (http://yixuan.cos.name/spectra) that is a set of iterative solvers
>> for large (sparse or dense) eigenvalue problems based on the implicitly
>> restarted Arnoldi's method. It is essentially a re-write from scratch of the
>> ARPACK library on top of Eigen. Congrats to its author, Yixuan Qiu, for this
>> amazing work.
>>
>> I've already contacted Yixuan about his opinion on the integration of
>> Spectra within Eigen, and he approved such a move. The general idea is to
>> make Spectra a new module in Eigen (Eigen/Spectra) and Yixuan would remain
>> the main maintainer of this module.
>>
>>
>> Since Eigen guarantees API backward compatibility, the main work for now
>> will be the stabilize the public API. This is were ANY OF YOU CAN HELP.
>>
>>
>> Let's first have a brief look at the current API. The user has to provide
>> the following compile-time information:
>>
>> - Choose the *right* solver class among the following six ones:
>> SymEigsSolver, GenEigsSolver, SymEigsShiftSolver, GenEigsRealShiftSolver,
>> GenEigsComplexShiftSolver, SymGEigsSolver.
>>
>> - Pick the *right* operation among: DenseCholesky,
>> DenseGenComplexShiftSolve, DenseGenMatProd, DenseGenRealShiftSolve,
>> DenseSymMatProd, DenseSymShiftSolve, SparseCholesky, SparseGenMatProd,
>> SparseGenRealShiftSolve, SparseRegularInverse, SparseSymMatProd,
>> SparseSymShiftSolve
>>
>> - Tell whether you want the *largest* or *smallest* eigenvalues.
>>
>> This already makes 6*12*2=144 configurations neglecting the numerous
>> variants of "largest" and "smallest".
>>
>> Of course, among them only a few really make sense. For instance,
>> requesting for the *smallest* eigenvalues is always going to be slow (please
>> correct me if I'm wrong). If you want the smallest ones then you must use a
>> shift-and-invert solver with the associated "ShiftSolve" operation and
>> request for the *largest*... yes you read it right, you need to request for
>> the *largest* even though you're going to get the smallest.
>>
>>
>> So I think that the high-level public API should be simplified so that the
>> user has to provide the minimal amount of information without risk of
>> miss-configurations.
>>
>>
>> To initiate discussions, let me share my current reflexion on this matter.
>> There might be several shortcomings, so please correct me if I'm overseeing
>> some usages or technical difficulties.
>>
>> In a normal usage, the user should only has to provide the following 3
>> compile-time information:
>>
>> 1 - Problem kind: self-adjoint, generic, self-adjoint-generalized,
>> generalized
>>
>> This makes four different kinds of problems with different underlying
>> algorithms and different inputs/outputs. So, as in the Eigen/Eigenvalues
>> module, this can be handled through 4 classes. Currently the last one
>> (generalized) is not supported yet, so this makes only 3 solver classes:
>>
>> SymEigsSolver, EigsSolver, SymGEigsSolver
>>
>>
>> 2 - The matrix abstraction: do I work on Matrix<>, on SparseMatrix<> or on
>> custom ones? This question is the same as for the iterative linear solvers
>> (e.g., ConjugateGradient), and for consistency we could pass it as the first
>> template argument of the solver class. (I'm neglecting the multiple
>> lowest/largest variants)
>>
>>
>> 3 - Do I want the largest or smallest ones? This information has to be
>> provided at compile-time so that the right operation and mode can be
>> automatically instantiated for the user (regular for the largest,
>> invert-shift for the smallest).
>>
>>
>> And that's all. Other parameters (number of eigenvalues, basis size,
>> initial shift, ...) can be specified at runtime. Say I want the 100 smallest
>> eigenvalues/vectors of a sparse symmetric real matrices, then all I've to
>> write would be:
>>
>> SymEigsSolver <SparseMatrix<double>, Smallest> eig(A,100);
>>
>> This will imply the usage of the build-in SparseSymShiftSolve operation
>> based on SimplicialLDLT.
>>
>>
>> So far so good, but what if I want to use my own operator instead of the
>> default one instantiated from the previous information? Offering this
>> flexibility if important for custom/matrix-free solving or to choose the
>> underlying linear solver in the shift-invert mode. This could be
>> accomplished through a third template parameters with a default value
>> automatically determined from the previous ones as in the LenvergMarquart<>
>> solver.
>>
>> Unfortunately, by doing so we end up with similar redundancies and
>> possible conflicts as in the current design. This is because the operation
>> implies the matrix-kind, the selection rule (largest or smallest), and
>> whether we have to use a regular or shift-invert mode.
>>
>> Perhaps it could be possible to play with C++11 variadic templates to
>> allow for the two modes, e.g.:
>>
>> SymEigsSolver <SparseMatrix<double>, Smallest > eig1(A,100);
>>
>> SymEigsSolver<MyCholmodOperator<double> > eig2(A,100); // implies lowest
>> and shift-invert mode
>>
>> Of course, MyCholmodOperator should provide some compile-time-information
>> to tell whether it is applying A^1 or A.
>>
>> Using variadic template for that seems over-complicated tough.
>>
>> Moreover, it is also unclear to me whether the different definitions of
>> smallest/largest (i.e., signed, magnitude, real part, imaginary part)
>> impacts the algorithmic choices? Can this detail be decided at runtime? Does
>> it influence the choice of the mode and operation?
>>
>> So in the end, some redundancy is perhaps not that bad:
>>
>> SymEigsSolver <SparseMatrix<double>, Smallest, MyCholmodOperator<...> >
>> eig3(A,100);
>>
>> because only power users will start playing this third template parameter.
>> With this proposal, it is still not possible to explicitly choose between
>> the regular and shift-invert modes. If that's important, then we could
>> extend the second parameter, e.g.:
>>
>> SymEigsSolver <SparseMatrix<double>, Smallest|RegularMode,
>> MyCholmodOperator<...> > eig3(A,100);
>>
>> SymEigsSolver <SparseMatrix<double>, Largest|ShiftInvertMode,
>> MyCholmodOperator<...> > eig3(A,100);
>>
>> Then we get all the flexibility of the current design, but with much less
>> risk to shoot his own foot (IMO).
>>
>> In contrast to the current design, I would also make sure that Smallest
>> and Largest always refer to the output eigenvalues returned by the
>> eigenvalues() method (which are not necessarily the eigenvalues of the input
>> matrix A, e.g., if you silently pass an invert operation while asking for
>> Largest|Regular).
>>
>> cheers,
>> gael.
>
>