Re: [eigen] first benchmark of large and sparse matrix


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] first benchmark of large and sparse matrix
From: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
Date: Tue, 24 Jun 2008 01:06:55 +0200

Hi,

Some news from the front.

First of all, I implemented HashMatrix, which is essentially a column vector of std::map<>. The purpose is to allow random writes, which at first sight seems to be needed to implement matrix-matrix products. First result: filling a HashMatrix is about 4 times slower than filling a SparseMatrix in a coherent order.

Then I implemented a <sparse column matrix> * <sparse column matrix> product. The algorithm to compute res = lhs * rhs is pretty simple:

  // dense temporary column, sized by the number of rows of the result
  std::vector<Scalar> tmp(res.rows());
  for (int j = 0; j < rhs.cols(); ++j)
  {
    std::fill(tmp.begin(), tmp.end(), Scalar(0));
    // for each nonzero coeff of column j of rhs
    for (typename Rhs::InnerIterator rhsIt(rhs, j); rhsIt; ++rhsIt)
    {
      // for each nonzero coeff of column rhsIt.index() of lhs
      for (typename Lhs::InnerIterator lhsIt(lhs, rhsIt.index()); lhsIt; ++lhsIt)
        tmp[lhsIt.index()] += lhsIt.value() * rhsIt.value();
    }
    // copy to column j of the result, skipping the zero entries
    res.col(j) = tmp;
  }

As you can see it's pretty simple, and the trick is to accumulate into a single dense column to avoid many writes to a sparse array. The memory cost of this temporary should not be an issue, since SparseMatrix already stores a dense column vector to index the start of each column. Moreover, this algorithm still fills the result sparse matrix in a coherent order, meaning that so far there is no real need for a HashMatrix with random write access.

The results are pretty good:

2048^3, 10% of nonzero entries:
  Eigen dense  = 1.92237
  Eigen sparse = 0.399866
  Eigen hash   = 1.30217
  GMM++ sparse = 15.973

2048^3, 1% of nonzero entries:
  Eigen dense  = 1.86719
  Eigen sparse = 0.0492229
  Eigen hash   = 0.186288
  GMM++ sparse = 0.24993

FYI, the same algorithm without the dense temporary and using a HashMatrix for the result is much, much slower (about one order of magnitude).

cheers,
gael.
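[To make the HashMatrix idea above concrete, here is a minimal standalone sketch of the "column vector of std::map<>" design. The name and interface are hypothetical illustrations, not the actual Eigen class.]

  #include <map>
  #include <vector>

  // Minimal sketch: one std::map per column, mapping row index -> value.
  // Writes may come in any order, but every access pays an O(log n) tree
  // lookup plus node allocations, whereas an in-order fill of a CCS
  // matrix is a plain append to its value/index arrays.
  struct HashMatrixSketch   // hypothetical name, not Eigen's HashMatrix
  {
      std::vector< std::map<int, double> > cols;

      explicit HashMatrixSketch(int colCount) : cols(colCount) {}

      // random write access: creates the entry if it does not exist yet
      double& coeffRef(int row, int col) { return cols[col][row]; }
  };

[The per-write tree lookup is consistent with the roughly 4x fill slowdown reported above, and explains why accumulating the product into a dense temporary column pays off.]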
On Mon, Jun 23, 2008 at 9:54 PM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
> On Mon, Jun 23, 2008 at 7:38 PM, Schleimer, Ben <bensch128@xxxxxxxxx> wrote:
>> Hi guys,
>> I didn't know gmm++ matrices were so slow either.
>
> Well, so far I've only tested the sum, which is not a critical
> operation. I guess they mainly focused on optimizing the matrix
> product and the triangular solver, which are the main bricks used to
> build linear solvers.
>
>> I wonder if we could take the gmm++ algos and use templated eigen sparse matrices with them...
>
> That was more or less the plan, but I'm not sure anymore because of
> licensing issues...
>
>> btw, how are you designing sparse templated matrices?
>> Do you have any papers to point to?
>
> Well, nothing is fixed yet! You might find some info on the wiki. So
> far, I found that the most common storage scheme is the "Compressed
> Column Storage" (CCS) strategy:
> http://www.netlib.org/linalg/html_templates/node90.html
> so I started to play with it, such that we are able to use optimized
> backends for the high-level algorithms. To manipulate such matrices,
> the idea is to replace the for loops (random access) by Iterators: we
> can specialize such Iterators for each type of expression, and we can
> even nest the Iterators as we nest the expressions.
> CwiseBinaryOp::InnerIterator is a good example of what I mean (in
> src/Sparse/CoreIterators.h).
>
> One of the main challenges with such a storage scheme is dynamically
> setting a matrix. GMM++ does not allow any write operation on it, so
> you have to use another kind of sparse matrix, i.e., a row of
> std::map, and eventually you do a copy to build your final, efficient
> sparse matrix. MTL4 uses a similar principle: they preallocate 5
> entries (or more if you want more) per column and they manage
> overflows using a std::map. Currently I implemented something
> different, which allows one to directly fill such a matrix under the
> constraint that the coefficients must be set by increasing indices. I
> guess this is enough in most cases, but we definitely need something
> more flexible, at least to be able to implement any kind of matrix
> product.
>
> Of course, any idea and suggestion are more than welcome!
>
> cheers,
> gael
>
>> Cheers
>> Ben
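[As a concrete illustration of the CCS layout linked above, here is a minimal standalone sketch, independent of Eigen's actual classes: the matrix is stored as three arrays, and iterating over one column's nonzeros is a linear scan, which is essentially what an InnerIterator boils down to for a plain CCS matrix.]

  #include <cstdio>

  int main()
  {
      // Example 4x4 matrix with 6 nonzeros:
      //   [ 4 0 0 1 ]
      //   [ 0 5 0 0 ]
      //   [ 2 0 3 0 ]
      //   [ 0 0 0 6 ]
      double values[] = { 4, 2, 5, 3, 1, 6 }; // nonzeros, stored column by column
      int    rowIdx[] = { 0, 2, 1, 2, 0, 3 }; // row index of each stored value
      int    colPtr[] = { 0, 2, 3, 4, 6 };    // colPtr[j]..colPtr[j+1] delimit column j

      // visiting the nonzeros of each column is a simple linear scan
      for (int j = 0; j < 4; ++j)
          for (int k = colPtr[j]; k < colPtr[j+1]; ++k)
              std::printf("A(%d,%d) = %g\n", rowIdx[k], j, values[k]);
      return 0;
  }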
>> --- On Sun, 6/22/08, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
>>
>>> From: Benoît Jacob <jacob@xxxxxxxxxxxxxxx>
>>> Subject: Re: [eigen] first benchmark of large and sparse matrix
>>> To: eigen@xxxxxxxxxxxxxxxxxxx
>>> Cc: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
>>> Date: Sunday, June 22, 2008, 11:05 PM
>>>
>>> Man, that is good...!!!
>>>
>>> It's normal that the sparse matrix is a bit slower than the dense
>>> one with 10% nonzero coeffs; also I suppose that the memory usage is
>>> already significantly lower. So your numbers look very good!
>>>
>>> I wasn't expecting GMM++ to perform so poorly; it sounds to me as if
>>> their sparse matrices are not intended for doing matrix arithmetic,
>>> but are instead just intended for applying solving algorithms;
>>> still, being able to form sums is at least useful to construct the
>>> matrices in the first place! (And I definitely personally see use
>>> cases for full support for arithmetic on sparse matrices, as this
>>> models a mathematical object known as a 'convolution algebra' and is
>>> thus very interesting.)
>>>
>>> So it sounds like you're solving the problem of sparse matrix
>>> storage; if there exists a possibility of providing an abstraction
>>> layer for sparse algorithms supporting multiple backends (as
>>> discussed on IRC, implementing sparse algos ourselves is really too
>>> ambitious, so we want to reuse existing code for that), then we
>>> would get a very, very good Eigen-Sparse module (or auxiliary
>>> library, depending on how you see it).
>>>
>>> FYI this is of high interest for Krita and for Step, and probably
>>> has countless interesting applications; a while ago Benjamin played
>>> with some research papers on vector graphics using sparse matrices
>>> to interpolate intelligently between vector shapes, which was very
>>> interesting (I still have the app he coded using gmm++ and eigen1,
>>> and can send it to you)... so no need to convince me about the
>>> usefulness of all this.
>>>
>>> Cheers,
>>> Benoit
>>>
>>> P.S. I'm adding vectorization to Redux
>>>
>>> On Monday 23 June 2008 02:09:15 Gael Guennebaud wrote:
>>> > Hi,
>>> >
>>> > This WE I started looking at sparse matrices, and I started to
>>> > write some experimental piece of code to see how they could be
>>> > integrated into the current code base. The current sparse matrix
>>> > class implements the "Compressed Column Storage" (CCS) scheme and
>>> > does not require any change in Core: an InnerIterator template
>>> > class is specialized for each matrix and expression type. These
>>> > iterators allow to efficiently skip the zero coefficients of a
>>> > column while being compatible with the expression template
>>> > mechanism.
>>> >
>>> > To check whether it was worth it, I benchmarked the sum of 3
>>> > matrices of 4000x4000 with 10% of nonzero coefficients (random):
>>> >
>>> >   for (int k=0; k<10; ++k)
>>> >     m4 = m1 + m2 + m3;
>>> >
>>> > The results:
>>> >
>>> > * dense matrix (with vectorization and all Eigen optimizations): 0.7 sec
>>> >
>>> > * dense matrix without xpr template, i.e.
>>> >     m4 = (m1 + m2).eval() + m3;
>>> >   => 1.6 sec
>>> >
>>> > * experimental sparse matrix with generic xpr template: 1.6 sec
>>> >
>>> > * experimental sparse matrix without xpr template (but with a
>>> >   specialized operator+ for the sum): 2.1 sec
>>> >
>>> > * GMM++: 19 sec!! (+ very high memory consumption)
>>> >
>>> > So yeah, it was worth it! Even with sparse matrices the ET is
>>> > pretty useful. Moreover, the current implementation is far from
>>> > optimized (e.g., it uses std::vector<>), so we can expect even
>>> > better performance.
>>> >
>>> > I believe the poor perf of GMM++ is partly due to the use of a
>>> > "mapped" sparse matrix (one std::map per row) for the result,
>>> > because their implementation of the CCS matrix is read-only.
>>> >
>>> > cheers,
>>> > gael.
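[For intuition about how a sum expression can expose nonzero iterators of its own, as CwiseBinaryOp::InnerIterator does in the quoted mail, here is a minimal sketch that merges two sorted per-column nonzero lists. The function name and the (index, value)-pair representation are illustrative assumptions, not Eigen's actual code.]

  #include <cstddef>
  #include <utility>
  #include <vector>

  // Sketch of the idea behind a sum-expression iterator: the nonzeros of
  // one column of (a + b) are obtained by merging the two sorted nonzero
  // lists, adding the values where the inner indices coincide.
  typedef std::pair<int, double> Entry; // (inner index, value)

  std::vector<Entry> sumColumn(const std::vector<Entry>& a,
                               const std::vector<Entry>& b)
  {
      std::vector<Entry> res;
      std::size_t i = 0, j = 0;
      while (i < a.size() && j < b.size())
      {
          if      (a[i].first < b[j].first) res.push_back(a[i++]);
          else if (b[j].first < a[i].first) res.push_back(b[j++]);
          else // same inner index: add the two values
          {
              res.push_back(Entry(a[i].first, a[i].second + b[j].second));
              ++i; ++j;
          }
      }
      while (i < a.size()) res.push_back(a[i++]);
      while (j < b.size()) res.push_back(b[j++]);
      return res;
  }

[Because such a merge never materializes an intermediate matrix, nesting these iterators is what lets m4 = m1 + m2 + m3 evaluate in a single pass, which matches the expression-template advantage measured in the benchmark above.]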
