Re: [eigen] Including Spectra within Eigen 3.4



I made these changes to your main.cpp (to tighten the convergence tolerance):

    typedef oia::maths::Tsvd<double> Tsvd;
    Tsvd tsvd;
    Tsvd::Options opts;
    opts.tolerance = 1e-12;
    opts.maximum_number_of_iterations = 1000;
    tsvd.set_options(opts);

and get singular values equivalent to the Eigen and Matlab values.

On Fri, Mar 31, 2017 at 4:55 AM, Oleg Shirokobrod
<oleg.shirokobrod@xxxxxxxxx> wrote:
> Hello all,
>
> I am sending this message to the eigen list because Spectra is going to be a part
> of the Eigen library. I am interested in truncated SVD. So, Yixuan, I ported
> your svds.cpp code from RSpectra to C++ template code. This code is
> attached. I have added a perform_tprod function to the four MatOp classes, made
> MatProd a template argument of the SVDOp classes in order to avoid using virtual
> functions, and refactored the truncated SVD functions. When I compared the results
> of this implementation with Matlab's svds function, it gave the same
> results for various matrices, except for matrices with degenerate singular
> values. My test function is in the attachment. The generated matrix A is
> actually a sparse matrix, but I treat it as a dense one. All its
> singular values are doubly degenerate. First I ran the Eigen JacobiSVD
> decomposition. Here are the first singular values given by Eigen's JacobiSVD:
> 41.9962
> 41.9962
> 41.9848
> 41.9848
> 41.9659
> 41.9659
> 41.9396
> 41.9396
> 41.9059
> 41.9059
> 41.8649
> 41.8649
> 41.8169
> 41.8169
>
> Matlab's svds function gives the first 7 singular values:
> 41.9962
> 41.9962
> 41.9848
> 41.9848
> 41.9659
> 41.9659
> 41.9396
>
> The refactored Spectra code gives only the distinct singular values:
> 41.9962
> 41.9848
> 41.9659
> 41.9396
> 41.9059
> 41.8649
> 41.8169
>
> Why are the results different?
>
> Thanks,
>
> Oleg Shirokobrod
>
>
> On Mon, Mar 27, 2017 at 12:08 AM, Bill Greene <w.h.greene@xxxxxxxxx> wrote:
>>
>> I have done more thinking about this and also built some prototype
>> software to
>> better understand the API issues (and also ARPACK).
>>
>> It seems to me that how the user specifies shift-invert mode is more
>> important
>> than how they specify smallest/largest, etc. I say this because it
>> certainly
>> should be possible to calculate the smallest eigenvalues without
>> shift-invert
>> even though in many cases this is non-optimal. The performance of this
>> combination appears to be better than I expected, so it is a reasonable
>> alternative to factoring a large matrix.
>>
>> In my prototype, the user selects shift-invert mode at runtime simply
>> by invoking
>> a constructor (or compute function) that includes a shift. Possibly this
>> is
>> too "indirect" or obscure. The alternative, of course, is some kind of
>> flag
>> (possibly a template parameter) indicating this. The smallest/largest
>> decision is
>> based on an independent runtime parameter.
>>
>> The eigensolver classes in my prototype have three template
>> parameters: MatrixType, a symmetry/adjointness flag, and an operator
>> class to implement the required operations on both A and B. There are
>> 4 constructor/compute methods to implement the combinations of
>> standard/generalized, with or without a shift.
>>
>> As I say, I built this on top of ARPACK (it is more or less like an
>> enhanced version of
>> ArpackGeneralizedSelfAdjointEigenSolver) so possibly some of the API
>> implications
>> are not appropriate for a header-only implementation. But it does have one
>> positive feature (in my opinion): it is quite easy to use for the
>> typical eigenproblems
>> I believe most users solve.
>>
>> Bill
>>
>> On Sat, Feb 25, 2017 at 7:19 AM, Gael Guennebaud
>> <gael.guennebaud@xxxxxxxxx> wrote:
>> >
>> >
>> > On Fri, Feb 24, 2017 at 7:49 PM, Bill Greene <w.h.greene@xxxxxxxxx>
>> > wrote:
>> >>
>> >> >If that's important, then we could extend the second parameter, e.g.:
>> >> >SymEigsSolver <SparseMatrix<double>, Smallest|RegularMode,
>> >> > MyCholmodOperator<...> > eig3(A,100);
>> >> >SymEigsSolver <SparseMatrix<double>, Largest|ShiftInvertMode,
>> >> > MyCholmodOperator<...> > eig3(A,100);
>> >>
>> >> Not sure I'm following here. Are you suggesting the second parameter
>> >> could be a bit-wise or to represent both
>> >> options?
>> >
>> >
>> > bit-wise, yes.
>> >
>> >>
>> >>
>> >> I think by default Smallest should imply ShiftInvertMode and Largest
>> >> should imply RegularMode. But am not
>> >> sure if that is what you're suggesting?
>> >
>> >
>> > I do agree with you. I was only suggesting that, in case there exist
>> > use cases for which someone wants to do weird combinations (as is
>> > currently possible), it is still possible to extend my initial
>> > proposal to offer this flexibility. That's all.
>> >
>> >>
>> >> One other issue that I didn't see mentioned concerns the *second*
>> >> operator to deal with the M-matrix that is
>> >> needed for the generalized eigenproblem. Make that a defaulted fourth
>> >> template parameter?
>> >
>> >
>> > right.
>> >
>> >
>> >
>> > gael
>> >
>> >>
>> >>
>> >> >As this does not simplify our task, the main question is whether there
>> >> > are use cases for which the selection rule is not known >at
>> >> > compile-time?
>> >>
>> >> I personally can't think of one.
>> >>
>> >> Bill
>> >>
>> >>
>> >> On Thu, Feb 23, 2017 at 3:09 AM, Gael Guennebaud
>> >> <gael.guennebaud@xxxxxxxxx> wrote:
>> >> >
>> >> >
>> >> > On Thu, Feb 23, 2017 at 12:40 AM, Bill Greene <w.h.greene@xxxxxxxxx>
>> >> > wrote:
>> >> >>
>> >> >>
>> >> At the risk of complicating the discussion or asking something
>> >> naive, let me step back and ask a question about the current, dense
>> >> Eigen eigensolver API. Why are there separate classes for standard
>> >> and generalized problems? Couldn't that decision be made at run time
>> >> based on having a constructor (or compute function) with either one
>> >> or two matrix arguments? This is, of course, the MATLAB approach
>> >> with the eig function. If so, it seems like that would simplify the
>> >> API somewhat.
>> >> >
>> >> >
>> >> > This is because the APIs are different:
>> >> >
>> >> > - alphas(), betas() for the generalized case
>> >> > - pseudoEigen*() for the generic-real case.
>> >> >
>> >> > Also, the generic case could be extended with matrix operators
>> >> > (e.g., power, product, etc.) whereas those do not make sense in the
>> >> > generalized case. Moreover, the storage requirements are not the
>> >> > same: a few pointers in the dynamic case, but a waste of memory in
>> >> > the fixed-size case.
>> >> >
>> >> > What's more annoying IMO is that we have two versions of the
>> >> > generic case, one for real matrices only and one for both real and
>> >> > complex ones... It also seems that many users do not realize that
>> >> > SelfAdjoint is a synonym of symmetric for real matrices, and they
>> >> > thus miss the SelfAdjointEigenSolver.
>> >> >
>> >> >>
>> >> >> Regarding the "smallest"/"largest" option, isn't this also something
>> >> >> that
>> >> >> can be
>> >> >> a runtime option rather than a template parameter?  That is the way
>> >> >> that option is
>> >> >> handled in ArpackGeneralizedSelfAdjointEigenSolver.
>> >> >
>> >> >
>> >> > This is mostly a matter of compilation time. ARPACK is a
>> >> > precompiled binary library, so all paths are pre-compiled only
>> >> > once. For a header-only library you probably want to compile only
>> >> > the paths that you are using. For instance, following my proposal,
>> >> > if you adjust it to pass the selection rule at runtime:
>> >> >
>> >> > SymEigsSolver<SparseMatrix<double> > eig;
>> >> >
>> >> > eig.setSelectionRule(Largest,100);
>> >> > eig.compute(A);
>> >> >
>> >> > then you'll have to compile both the regular and shift-invert
>> >> > modes, using the SparseSymMatProd and SparseGenRealShiftSolve
>> >> > operators, whereas you only need the regular + SparseSymMatProd
>> >> > path. Compiling a sparse linear solver for nothing is a huge waste
>> >> > of compilation time and increases your binary. This could still be
>> >> > possible if we can tolerate this drawback for "standard" users, and
>> >> > power users that care about these details and have more knowledge
>> >> > of the underlying algorithms could specify the second (and
>> >> > optionally the third) optional template arguments:
>> >> >
>> >> >  SymEigsSolver<SparseMatrix<double>, RegularMode> eig;
>> >> >
>> >> > then we can deduce that using SparseSymMatProd is a good default, and
>> >> > only
>> >> > instantiate one path.
>> >> >
>> >> > As this does not simplify our task, the main question is whether
>> >> > there
>> >> > are
>> >> > use cases for which the selection rule is not known at compile-time?
>> >> >
>> >> > gael
>> >> >
>> >> >
>> >> >>
>> >> >>
>> >> >> I'm assuming that if an option can be deferred to runtime without
>> >> >> degrading
>> >> >> performance, it will likely simplify the API.
>> >> >>
>> >> >> Bill Greene
>> >> >>
>> >> >> On Wed, Feb 22, 2017 at 1:37 PM, Rasmus Larsen <rmlarsen@xxxxxxxxxx>
>> >> >> wrote:
>> >> >> > This sounds great!! I've been meaning to contribute a rewrite of
>> >> >> > my
>> >> >> > own
>> >> >> > PROPACK library that can be viewed as an optimized version of
>> >> >> > ARPACK
>> >> >> > for
>> >> >> > the
>> >> >> > sparse SVD problem. The two would complement each other nicely,
>> >> >> > and
>> >> >> > make
>> >> >> > Eigen pretty powerful for sparse spectral problems. One of these
>> >> >> > days,
>> >> >> > I'll
>> >> >> > actually get around to it...
>> >> >> >
>> >> >> > On Wed, Feb 22, 2017 at 10:31 AM, Gael Guennebaud
>> >> >> > <gael.guennebaud@xxxxxxxxx> wrote:
>> >> >> >>
>> >> >> >>
>> >> >> >> Hi all,
>> >> >> >>
>> >> >> >> the last major item I'd like to include for the next 3.4
>> >> >> >> release is Spectra (http://yixuan.cos.name/spectra), a set of
>> >> >> >> iterative solvers for large (sparse or dense) eigenvalue
>> >> >> >> problems based on the implicitly restarted Arnoldi method. It is
>> >> >> >> essentially a rewrite from scratch of the ARPACK library on top
>> >> >> >> of Eigen. Congrats to its author, Yixuan Qiu, for this amazing
>> >> >> >> work.
>> >> >> >>
>> >> >> >> I've already contacted Yixuan about his opinion on the
>> >> >> >> integration
>> >> >> >> of
>> >> >> >> Spectra within Eigen, and he approved such a move. The general
>> >> >> >> idea
>> >> >> >> is
>> >> >> >> to
>> >> >> >> make Spectra a new module in Eigen (Eigen/Spectra) and Yixuan
>> >> >> >> would
>> >> >> >> remain
>> >> >> >> the main maintainer of this module.
>> >> >> >>
>> >> >> >>
>> >> >> >> Since Eigen guarantees API backward compatibility, the main
>> >> >> >> work for now will be to stabilize the public API. This is where
>> >> >> >> ANY OF YOU CAN HELP.
>> >> >> >>
>> >> >> >>
>> >> >> >> Let's first have a brief look at the current API. The user has to
>> >> >> >> provide
>> >> >> >> the following compile-time information:
>> >> >> >>
>> >> >> >> - Choose the *right* solver class among the following six ones:
>> >> >> >> SymEigsSolver, GenEigsSolver, SymEigsShiftSolver,
>> >> >> >> GenEigsRealShiftSolver,
>> >> >> >> GenEigsComplexShiftSolver, SymGEigsSolver.
>> >> >> >>
>> >> >> >> - Pick the *right* operation among: DenseCholesky,
>> >> >> >> DenseGenComplexShiftSolve, DenseGenMatProd,
>> >> >> >> DenseGenRealShiftSolve,
>> >> >> >> DenseSymMatProd, DenseSymShiftSolve, SparseCholesky,
>> >> >> >> SparseGenMatProd,
>> >> >> >> SparseGenRealShiftSolve, SparseRegularInverse, SparseSymMatProd,
>> >> >> >> SparseSymShiftSolve
>> >> >> >>
>> >> >> >> - Tell whether you want the *largest* or *smallest* eigenvalues.
>> >> >> >>
>> >> >> >> This already makes 6*12*2 = 144 configurations, neglecting the
>> >> >> >> numerous variants of "largest" and "smallest".
>> >> >> >>
>> >> >> >> Of course, among them only a few really make sense. For
>> >> >> >> instance, requesting the *smallest* eigenvalues is always going
>> >> >> >> to be slow (please correct me if I'm wrong). If you want the
>> >> >> >> smallest ones then you must use a shift-and-invert solver with
>> >> >> >> the associated "ShiftSolve" operation and request the
>> >> >> >> *largest*... yes, you read that right: you need to request the
>> >> >> >> *largest* even though you're going to get the smallest.
>> >> >> >>
>> >> >> >>
>> >> >> >> So I think that the high-level public API should be simplified
>> >> >> >> so that the user has to provide the minimal amount of
>> >> >> >> information without risk of misconfiguration.
>> >> >> >>
>> >> >> >>
>> >> >> >> To initiate discussions, let me share my current thinking on
>> >> >> >> this matter. There might be several shortcomings, so please
>> >> >> >> correct me if I'm overlooking some usages or technical
>> >> >> >> difficulties.
>> >> >> >>
>> >> >> >> In normal usage, the user should only have to provide the
>> >> >> >> following three pieces of compile-time information:
>> >> >> >>
>> >> >> >> 1 - Problem kind: self-adjoint, generic,
>> >> >> >> self-adjoint-generalized,
>> >> >> >> generalized
>> >> >> >>
>> >> >> >> This makes four different kinds of problems with different
>> >> >> >> underlying
>> >> >> >> algorithms and different inputs/outputs. So, as in the
>> >> >> >> Eigen/Eigenvalues
>> >> >> >> module, this can be handled through 4 classes. Currently the last
>> >> >> >> one
>> >> >> >> (generalized) is not supported yet, so this makes only 3 solver
>> >> >> >> classes:
>> >> >> >>
>> >> >> >> SymEigsSolver, EigsSolver, SymGEigsSolver
>> >> >> >>
>> >> >> >>
>> >> >> >> 2 - The matrix abstraction: do I work on Matrix<>, on
>> >> >> >> SparseMatrix<>
>> >> >> >> or
>> >> >> >> on
>> >> >> >> custom ones? This question is the same as for the iterative
>> >> >> >> linear
>> >> >> >> solvers
>> >> >> >> (e.g., ConjugateGradient), and for consistency we could pass it
>> >> >> >> as
>> >> >> >> the
>> >> >> >> first
>> >> >> >> template argument of the solver class. (I'm neglecting the
>> >> >> >> multiple
>> >> >> >> lowest/largest variants)
>> >> >> >>
>> >> >> >>
>> >> >> >> 3 - Do I want the largest or the smallest ones? This
>> >> >> >> information has to be provided at compile-time so that the right
>> >> >> >> operation and mode can be automatically instantiated for the
>> >> >> >> user (regular for the largest, shift-invert for the smallest).
>> >> >> >>
>> >> >> >>
>> >> >> >> And that's all. Other parameters (number of eigenvalues, basis
>> >> >> >> size, initial shift, ...) can be specified at runtime. Say I
>> >> >> >> want the 100 smallest eigenvalues/vectors of a sparse symmetric
>> >> >> >> real matrix; then all I'd have to write would be:
>> >> >> >>
>> >> >> >> SymEigsSolver <SparseMatrix<double>, Smallest> eig(A,100);
>> >> >> >>
>> >> >> >> This will imply the usage of the built-in SparseSymShiftSolve
>> >> >> >> operation based on SimplicialLDLT.
>> >> >> >>
>> >> >> >>
>> >> >> >> So far so good, but what if I want to use my own operator
>> >> >> >> instead of the default one instantiated from the previous
>> >> >> >> information? Offering this flexibility is important for
>> >> >> >> custom/matrix-free solving, or to choose the underlying linear
>> >> >> >> solver in the shift-invert mode. This could be accomplished
>> >> >> >> through a third template parameter with a default value
>> >> >> >> automatically determined from the previous ones, as in the
>> >> >> >> LevenbergMarquardt<> solver.
>> >> >> >>
>> >> >> >> Unfortunately, by doing so we end up with redundancies and
>> >> >> >> possible conflicts similar to those in the current design. This
>> >> >> >> is because the operation implies the matrix kind, the selection
>> >> >> >> rule (largest or smallest), and whether we have to use a regular
>> >> >> >> or shift-invert mode.
>> >> >> >>
>> >> >> >> Perhaps it could be possible to play with C++11 variadic
>> >> >> >> templates
>> >> >> >> to
>> >> >> >> allow for the two modes, e.g.:
>> >> >> >>
>> >> >> >> SymEigsSolver <SparseMatrix<double>, Smallest > eig1(A,100);
>> >> >> >>
>> >> >> >> SymEigsSolver<MyCholmodOperator<double> > eig2(A,100); // implies
>> >> >> >> lowest
>> >> >> >> and shift-invert mode
>> >> >> >>
>> >> >> >> Of course, MyCholmodOperator should provide some compile-time
>> >> >> >> information to tell whether it is applying A^-1 or A.
>> >> >> >>
>> >> >> >> Using variadic templates for that seems over-complicated though.
>> >> >> >>
>> >> >> >> Moreover, it is also unclear to me whether the different
>> >> >> >> definitions of smallest/largest (i.e., signed, magnitude, real
>> >> >> >> part, imaginary part) impact the algorithmic choices. Can this
>> >> >> >> detail be decided at runtime? Does it influence the choice of
>> >> >> >> the mode and operation?
>> >> >> >>
>> >> >> >> So in the end, some redundancy is perhaps not that bad:
>> >> >> >>
>> >> >> >> SymEigsSolver <SparseMatrix<double>, Smallest,
>> >> >> >> MyCholmodOperator<...> >
>> >> >> >> eig3(A,100);
>> >> >> >>
>> >> >> >> because only power users will start playing with this third
>> >> >> >> template parameter.
>> >> >> >> With this proposal, it is still not possible to explicitly choose
>> >> >> >> between
>> >> >> >> the regular and shift-invert modes. If that's important, then we
>> >> >> >> could
>> >> >> >> extend the second parameter, e.g.:
>> >> >> >>
>> >> >> >> SymEigsSolver <SparseMatrix<double>, Smallest|RegularMode,
>> >> >> >> MyCholmodOperator<...> > eig3(A,100);
>> >> >> >>
>> >> >> >> SymEigsSolver <SparseMatrix<double>, Largest|ShiftInvertMode,
>> >> >> >> MyCholmodOperator<...> > eig3(A,100);
>> >> >> >>
>> >> >> >> Then we get all the flexibility of the current design, but with
>> >> >> >> much less risk of shooting yourself in the foot (IMO).
>> >> >> >>
>> >> >> >> In contrast to the current design, I would also make sure that
>> >> >> >> Smallest and Largest always refer to the output eigenvalues
>> >> >> >> returned by the eigenvalues() method (which are not necessarily
>> >> >> >> the eigenvalues of the input matrix A, e.g., if you silently
>> >> >> >> pass an invert operation while asking for Largest|Regular).
>> >> >> >>
>> >> >> >> cheers,
>> >> >> >> gael.
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >
>> >>
>> >>
>> >
>>
>>
>


