Re: [eigen] Including Spectra within Eigen 3.4


I have done more thinking about this and also built some prototype software to
better understand the API issues (and also ARPACK).

It seems to me that how the user specifies shift-invert mode is more important
than how they specify smallest/largest, etc. I say this because it should
certainly be possible to calculate the smallest eigenvalues without
shift-invert, even though in many cases this is non-optimal. The performance
of that combination appears to be better than I expected, so it is a
reasonable alternative to factoring a large matrix.

In my prototype, the user selects shift-invert mode at runtime simply by
invoking a constructor (or compute function) that includes a shift. Possibly
this is too "indirect" or obscure. The alternative, of course, is some kind
of flag (possibly a template parameter) indicating this. The smallest/largest
decision is based on an independent runtime parameter.

The eigensolver classes in my prototype have three template parameters:
MatrixType, a symmetry/adjointness flag, and an operator class implementing
the required operations on both A and B. There are four constructor/compute
methods to implement the combinations of standard/generalized, with or
without a shift.

As I say, I built this on top of ARPACK (it is more or less an enhanced
version of ArpackGeneralizedSelfAdjointEigenSolver), so possibly some of the
API implications are not appropriate for a header-only implementation. But it
does have one positive feature (in my opinion): it is quite easy to use for
the typical eigenproblems I believe most users solve.

Bill

On Sat, Feb 25, 2017 at 7:19 AM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
>
>
> On Fri, Feb 24, 2017 at 7:49 PM, Bill Greene <w.h.greene@xxxxxxxxx> wrote:
>>
>> >If that's important, then we could extend the second parameter, e.g.:
>> >SymEigsSolver <SparseMatrix<double>, Smallest|RegularMode,
>> > MyCholmodOperator<...> > eig3(A,100);
>> >SymEigsSolver <SparseMatrix<double>, Largest|ShiftInvertMode,
>> > MyCholmodOperator<...> > eig3(A,100);
>>
>> Not sure I'm following here. Are you suggesting the second parameter
>> could be a bitwise OR representing both options?
>
>
> bit-wise, yes.
>
>>
>>
>> I think by default Smallest should imply ShiftInvertMode and Largest
>> should imply RegularMode. But I am not sure if that is what you're
>> suggesting.
>
>
> I do agree with you. I was only suggesting that in case there exist use
> cases for which someone wants to do weird combinations (as is currently
> possible), it is still possible to extend my initial proposal to offer this
> flexibility. That's all.
>
>>
>> One other issue that I didn't see mentioned concerns the *second*
>> operator to deal with the M-matrix that is
>> needed for the generalized eigenproblem. Make that a defaulted fourth
>> template parameter?
>
>
> right.
>
>
>
> gael
>
>>
>>
>> >As this does not simplify our task, the main question is whether there
>> > are use cases for which the selection rule is not known at compile-time?
>>
>> I personally can't think of one.
>>
>> Bill
>>
>>
>> On Thu, Feb 23, 2017 at 3:09 AM, Gael Guennebaud
>> <gael.guennebaud@xxxxxxxxx> wrote:
>> >
>> >
>> > On Thu, Feb 23, 2017 at 12:40 AM, Bill Greene <w.h.greene@xxxxxxxxx>
>> > wrote:
>> >>
>> >>
>> >> At the risk of complicating the discussion or asking something naive,
>> >> let me step back and ask a question about the current, dense Eigen
>> >> eigensolver API. Why are there separate classes for standard and
>> >> generalized problems? Couldn't that decision be made at run time based
>> >> on having a constructor (or compute function) with either one or two
>> >> matrix arguments? This is, of course, the MATLAB approach with the eig
>> >> function. If so, it seems like that would simplify the API somewhat.
>> >
>> >
>> > This is because the APIs are different:
>> >
>> > - alphas(), betas() for the generalized case
>> > - pseudoEigen*() for the generic-real case.
>> >
>> > Also, the generic case could be extended with matrix operators (e.g.,
>> > power, product, etc.) whereas those do not make sense in the generalized
>> > case. Moreover, the storage requirements are not the same: a few
>> > pointers in the dynamic case, but a waste of memory in the fixed-size
>> > case.
>> >
>> > What's more annoying IMO is that we have two versions of the generic
>> > case: one for real only and one for both real and complex matrices... It
>> > also seems that many users do not realize that SelfAdjoint is a synonym
>> > of symmetric for real matrices, and they thus miss the
>> > SelfAdjointEigenSolver.
>> >
>> >>
>> >> Regarding the "smallest"/"largest" option, isn't this also something
>> >> that
>> >> can be
>> >> a runtime option rather than a template parameter?  That is the way
>> >> that option is
>> >> handled in ArpackGeneralizedSelfAdjointEigenSolver.
>> >
>> >
>> > This is mostly a matter of compilation time. ARPACK is a precompiled
>> > binary library, so all paths are pre-compiled only once. For a
>> > header-only library you probably want to compile only the paths that you
>> > are using. For instance, following my proposal, if you adjust it to pass
>> > the selection rule at runtime:
>> >
>> > SymEigsSolver<SparseMatrix<double> > eig;
>> >
>> > eig.setSelectionRule(Largest,100);
>> > eig.compute(A);
>> >
>> > then you'll have to compile both the regular and shift-invert modes
>> > using the SparseSymMatProd and SparseGenRealShiftSolve operators,
>> > whereas you only need the regular + SparseSymMatProd path. Compiling a
>> > sparse linear solver for nothing is a huge waste of compilation time and
>> > increases your binary size. This could still be possible if we can
>> > tolerate this drawback for "standard" users, and power users that care
>> > about these details and have more knowledge of the underlying algorithms
>> > could specify the second (and optionally the third) optional template
>> > arguments:
>> >
>> >  SymEigsSolver<SparseMatrix<double>, RegularMode> eig;
>> >
>> > then we can deduce that using SparseSymMatProd is a good default, and
>> > only
>> > instantiate one path.
>> >
>> > As this does not simplify our task, the main question is whether there
>> > are
>> > use cases for which the selection rule is not known at compile-time?
>> >
>> > gael
>> >
>> >
>> >>
>> >>
>> >> I'm assuming that if an option can be deferred to runtime without
>> >> degrading
>> >> performance, it will likely simplify the API.
>> >>
>> >> Bill Greene
>> >>
>> >> On Wed, Feb 22, 2017 at 1:37 PM, Rasmus Larsen <rmlarsen@xxxxxxxxxx>
>> >> wrote:
>> >> > This sounds great!! I've been meaning to contribute a rewrite of my
>> >> > own
>> >> > PROPACK library that can be viewed as an optimized version of ARPACK
>> >> > for
>> >> > the
>> >> > sparse SVD problem. The two would complement each other nicely, and
>> >> > make
>> >> > Eigen pretty powerful for sparse spectral problems. One of these
>> >> > days,
>> >> > I'll
>> >> > actually get around to it...
>> >> >
>> >> > On Wed, Feb 22, 2017 at 10:31 AM, Gael Guennebaud
>> >> > <gael.guennebaud@xxxxxxxxx> wrote:
>> >> >>
>> >> >>
>> >> >> Hi all,
>> >> >>
>> >> >> the last major item I'd like to include for the next 3.4 release is
>> >> >> Spectra (http://yixuan.cos.name/spectra), a set of iterative solvers
>> >> >> for large (sparse or dense) eigenvalue problems based on the
>> >> >> implicitly restarted Arnoldi method. It is essentially a rewrite
>> >> >> from scratch of the ARPACK library on top of Eigen. Congrats to its
>> >> >> author, Yixuan Qiu, for this amazing work.
>> >> >>
>> >> >> I've already contacted Yixuan about his opinion on the integration
>> >> >> of
>> >> >> Spectra within Eigen, and he approved such a move. The general idea
>> >> >> is
>> >> >> to
>> >> >> make Spectra a new module in Eigen (Eigen/Spectra) and Yixuan would
>> >> >> remain
>> >> >> the main maintainer of this module.
>> >> >>
>> >> >>
>> >> >> Since Eigen guarantees API backward compatibility, the main work
>> >> >> for now will be to stabilize the public API. This is where ANY OF
>> >> >> YOU CAN HELP.
>> >> >>
>> >> >>
>> >> >> Let's first have a brief look at the current API. The user has to
>> >> >> provide
>> >> >> the following compile-time information:
>> >> >>
>> >> >> - Choose the *right* solver class among the following six ones:
>> >> >> SymEigsSolver, GenEigsSolver, SymEigsShiftSolver,
>> >> >> GenEigsRealShiftSolver,
>> >> >> GenEigsComplexShiftSolver, SymGEigsSolver.
>> >> >>
>> >> >> - Pick the *right* operation among: DenseCholesky,
>> >> >> DenseGenComplexShiftSolve, DenseGenMatProd, DenseGenRealShiftSolve,
>> >> >> DenseSymMatProd, DenseSymShiftSolve, SparseCholesky,
>> >> >> SparseGenMatProd,
>> >> >> SparseGenRealShiftSolve, SparseRegularInverse, SparseSymMatProd,
>> >> >> SparseSymShiftSolve
>> >> >>
>> >> >> - Tell whether you want the *largest* or *smallest* eigenvalues.
>> >> >>
>> >> >> This already makes 6*12*2=144 configurations neglecting the numerous
>> >> >> variants of "largest" and "smallest".
>> >> >>
>> >> >> Of course, among them only a few really make sense. For instance,
>> >> >> requesting the *smallest* eigenvalues is always going to be slow
>> >> >> (please correct me if I'm wrong). If you want the smallest ones then
>> >> >> you must use a shift-and-invert solver with the associated
>> >> >> "ShiftSolve" operation and request the *largest*... yes, you read
>> >> >> that right, you need to request the *largest* even though you're
>> >> >> going to get the smallest.
>> >> >>
>> >> >>
>> >> >> So I think that the high-level public API should be simplified so
>> >> >> that the user has to provide the minimal amount of information,
>> >> >> without risk of misconfiguration.
>> >> >>
>> >> >>
>> >> >> To initiate discussions, let me share my current reflections on
>> >> >> this matter. There might be several shortcomings, so please correct
>> >> >> me if I'm overlooking some usages or technical difficulties.
>> >> >>
>> >> >> In normal usage, the user should only have to provide the
>> >> >> following three pieces of compile-time information:
>> >> >>
>> >> >> 1 - Problem kind: self-adjoint, generic, self-adjoint-generalized,
>> >> >> generalized
>> >> >>
>> >> >> This makes four different kinds of problems with different
>> >> >> underlying
>> >> >> algorithms and different inputs/outputs. So, as in the
>> >> >> Eigen/Eigenvalues
>> >> >> module, this can be handled through 4 classes. Currently the last
>> >> >> one
>> >> >> (generalized) is not supported yet, so this makes only 3 solver
>> >> >> classes:
>> >> >>
>> >> >> SymEigsSolver, EigsSolver, SymGEigsSolver
>> >> >>
>> >> >>
>> >> >> 2 - The matrix abstraction: do I work on Matrix<>, on SparseMatrix<>
>> >> >> or
>> >> >> on
>> >> >> custom ones? This question is the same as for the iterative linear
>> >> >> solvers
>> >> >> (e.g., ConjugateGradient), and for consistency we could pass it as
>> >> >> the
>> >> >> first
>> >> >> template argument of the solver class. (I'm neglecting the multiple
>> >> >> lowest/largest variants)
>> >> >>
>> >> >>
>> >> >> 3 - Do I want the largest or smallest ones? This information has
>> >> >> to be provided at compile-time so that the right operation and mode
>> >> >> can be automatically instantiated for the user (regular for the
>> >> >> largest, shift-invert for the smallest).
>> >> >>
>> >> >>
>> >> >> And that's all. Other parameters (number of eigenvalues, basis
>> >> >> size, initial shift, ...) can be specified at runtime. Say I want
>> >> >> the 100 smallest eigenvalues/vectors of a sparse symmetric real
>> >> >> matrix; then all I have to write would be:
>> >> >>
>> >> >> SymEigsSolver <SparseMatrix<double>, Smallest> eig(A,100);
>> >> >>
>> >> >> This will imply the usage of the built-in SparseSymShiftSolve
>> >> >> operation based on SimplicialLDLT.
>> >> >>
>> >> >>
>> >> >> So far so good, but what if I want to use my own operator instead
>> >> >> of the default one instantiated from the previous information?
>> >> >> Offering this flexibility is important for custom/matrix-free
>> >> >> solving, or to choose the underlying linear solver in the
>> >> >> shift-invert mode. This could be accomplished through a third
>> >> >> template parameter with a default value automatically determined
>> >> >> from the previous ones, as in the LevenbergMarquardt<> solver.
>> >> >>
>> >> >> Unfortunately, by doing so we end up with similar redundancies and
>> >> >> possible conflicts as in the current design. This is because the
>> >> >> operation implies the matrix kind, the selection rule (largest or
>> >> >> smallest), and whether we have to use a regular or shift-invert
>> >> >> mode.
>> >> >>
>> >> >> Perhaps it could be possible to play with C++11 variadic templates
>> >> >> to
>> >> >> allow for the two modes, e.g.:
>> >> >>
>> >> >> SymEigsSolver <SparseMatrix<double>, Smallest > eig1(A,100);
>> >> >>
>> >> >> SymEigsSolver<MyCholmodOperator<double> > eig2(A,100); // implies
>> >> >> lowest
>> >> >> and shift-invert mode
>> >> >>
>> >> >> Of course, MyCholmodOperator should provide some compile-time
>> >> >> information to tell whether it is applying A^-1 or A.
>> >> >>
>> >> >> Using variadic templates for that seems over-complicated, though.
>> >> >>
>> >> >> Moreover, it is also unclear to me whether the different
>> >> >> definitions of smallest/largest (i.e., signed, magnitude, real part,
>> >> >> imaginary part) impact the algorithmic choices. Can this detail be
>> >> >> decided at runtime? Does it influence the choice of the mode and
>> >> >> operation?
>> >> >>
>> >> >> So in the end, some redundancy is perhaps not that bad:
>> >> >>
>> >> >> SymEigsSolver <SparseMatrix<double>, Smallest,
>> >> >> MyCholmodOperator<...> >
>> >> >> eig3(A,100);
>> >> >>
>> >> >> because only power users will start playing with this third
>> >> >> template parameter.
>> >> >> With this proposal, it is still not possible to explicitly choose
>> >> >> between
>> >> >> the regular and shift-invert modes. If that's important, then we
>> >> >> could
>> >> >> extend the second parameter, e.g.:
>> >> >>
>> >> >> SymEigsSolver <SparseMatrix<double>, Smallest|RegularMode,
>> >> >> MyCholmodOperator<...> > eig3(A,100);
>> >> >>
>> >> >> SymEigsSolver <SparseMatrix<double>, Largest|ShiftInvertMode,
>> >> >> MyCholmodOperator<...> > eig3(A,100);
>> >> >>
>> >> >> Then we get all the flexibility of the current design, but with
>> >> >> much less risk of shooting yourself in the foot (IMO).
>> >> >>
>> >> >> In contrast to the current design, I would also make sure that
>> >> >> Smallest and Largest always refer to the output eigenvalues returned
>> >> >> by the eigenvalues() method (which are not necessarily the
>> >> >> eigenvalues of the input matrix A, e.g., if you silently pass an
>> >> >> invert operation while asking for Largest|RegularMode).
>> >> >>
>> >> >> cheers,
>> >> >> gael.
>> >> >
>> >> >
>> >>
>> >>
>> >
>>
>>
>


