Re: [eigen] RFC: BasicSVD class

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


The fast mode struggles when the singular values are really close.  That is the main reason it is not the default mode.  The SelfAdjointEigenSolver mode seems to do just as well as JacobiSVD in this challenging case (sometimes better).

You can test this by playing with the arguments of the test utility I attached in the previous email.
    arg 1 : m (rows of matrix X)
    arg 2 : n (cols of matrix X)
    arg 3 : k (number of dimensions to test)
    arg 4 :  spread : sets the difference in singular values of the matrix to be solved (see example below) 


Here's an octave/matlab representation of the test matrix that gets created at the beginning of the testType<T> function:

    m=500;n=400; %problem dimensions
    k=3;        % number of dimensions that will be larger in magnitude than the rest
    spread=.1;                % the range of the first K singular values

    [U,junk,V]=svds(randn(m,n),k);   %need orthonormal vectors U,V  (the C++ code uses GramSchmidt )

    S=diag( linspace( 2 + spread , 2 , k ) );  %descending singular values

    X1 = U*S*V';  % low rank matrix
    R=randn(m,n)/sqrt(m);     % completely random matrix
    R = R - U*U'*R;    % remove columnspace projection
    R = R - R*V*V';     % remove rowspace projection
    X= X1 + R;        % The test matrix with known vectors and distinct values for the first k dimensions, the rest of the dimensions are filled with randomnesss at a lower level.

octave:13> svds(X,5)
ans =

   2.1000    <-- corresponds to U(:,1)*V(:,1)'
   2.0500    <--   U(:,2)*V(:,2)'
   2.0000    <--   U(:,3)*V(:,3)'
   1.8941    <-- the remaining dimensions are from random R
   1.8594


Note:  the singular values associated with the random dimensions will take on values between 1 and 2 depending on how rectangular the matrix is.    Note m==n would give you an occasional testing glitch since the SVD should find one of the random dimensions instead of a vector in U,V.




On 12/07/2012 09:35 AM, Gael Guennebaud wrote:
Indeed, this simple algorithm seems pretty fast and it could even be even faster once we have fully optimized SelfAdjointEigenSolver. However I'm skeptical about its accuracy since it is squaring the entries that is typically something you want to avoid when using a SVD solver. I guess that in your case you are more interested in the singular values/vectors rather than in solving a least square problem? otherwise QR would offer a better accuracy/perf option.

gael


On Fri, Dec 7, 2012 at 2:31 PM, Mark Borgerding <mark@xxxxxxxxxxxxxx> wrote:
I compared the speed of BasicSVD to two versions of LAPACK running the [scdz]gesdd functions: the netlib version that comes with ubuntu and the MKL (Composer version).
They are pretty comparable.  BasicSVD in "fast mode" beats either version for k<5.  The mode based on SelfAdjointEigenSolver is slower than the MKL LAPACK.

-- Mark


On 12/05/2012 11:05 AM, Mark Borgerding wrote:
Attached is a draft of a faster SVD class.

Summary:
    The BasicSVD is much faster and sufficiently accurate for many (most?) purposes.


excerpt from header:
 * Singular Value Decomposition by eigenanalysis of the Gramian
 *
 * option 1: (default) Solves A=U*S*V' by finding the eigenvectors,values of A'A (or AA' if matrix is wide)
 *           This is done using the SelfAdjointEigenSolver
 * option 2: "fastMode" uses power iteration to quickly strip off the dominant eigenvectors one at a time.
 *              This may be useful when only a few of the singular dimensions are desired (e.g. k<5)
 *
 * Currently limited to "Thin" or "Economy Sized" singular vectors. When decomposing a rectangular matrix,
 * either the U or V matrix will be an incomplete basis (tall). Any completion of this basis is valid and
 * is left as an exercise to the reader.
*
* The "fast" mode uses power iteration (plus feedback)
*    This can be several times faster than the default mode, but
*   with caveats:
*      *) convergence is slow and accuracy suffers in the case of nearly identical strongest singular values
 *     *) speed is linear with K, so if K> 5ish, fastMode may actually be slower.



Output from testBasicSVD.cc (attached)
compiled with: g++ -Wall -O -g -msse4.2 -I /path/to/inc/ testBasicSVD.cc
explanation of fields:
    values: K strongest singular values (in this case K=2) ( all lesser dimensions have singular values around 1)
    vectorwise errors: singular vector errors left|right for each of the K strongest dimensions
    subspace error: residual error of the true rank-K subspace projected onto the found subspace

begin output :
############## Type: float Problem Size: 300x200
================ JacobiSVD ================
elapsed time 0.358115s
values = 3.00011 2.00006
vectorwise errors (dB):-87.5|-92.3 -87.8|-92.2
subspace error -114 dB
================ BasicSVD ================
elapsed time 0.0242s
values = 3 2
vectorwise errors (dB):-133|-125 -112|-106
subspace error -109 dB
================ BasicSVD (fast mode) ================
elapsed time 0.0046s
values = 3 2
vectorwise errors (dB):-89.8|-86.3 -78.5|-81.8
subspace error -97.3 dB
############## Type: complex<double> Problem Size: 300x200
================ JacobiSVD ================
elapsed time 5.91s
values = 3 2
vectorwise errors (dB):-259|-265 -266|-268
subspace error -284 dB
================ BasicSVD ================
elapsed time 0.216s
values = 3 2
vectorwise errors (dB):-241|-237 -231|-231
subspace error -235 dB
================ BasicSVD (fast mode) ================
elapsed time 0.0411s
values = 3 2
vectorwise errors (dB):-305|-306 -178|-175
subspace error -178 dB


-- Mark Borgerding







Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/