Re: [eigen] first benchmark of large and sparse matrix |


*To*: eigen@xxxxxxxxxxxxxxxxxxx
*Subject*: Re: [eigen] first benchmark of large and sparse matrix
*From*: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
*Date*: Tue, 24 Jun 2008 16:29:06 +0200

note to the list: I added a dedicated page on the wiki for the sparse matrix. What I wrote is only random thoughts and proposals. You're more than welcome to edit them, propose other solutions, and let's discuss them here.

On Tue, Jun 24, 2008 at 3:53 PM, Benoît Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
> I don't know what to say, this is just plainly awesome :)

Actually, I had not realized that the product of two not-too-sparse matrices can be pretty dense. In that case the dense temporary works best. On the other hand, if the result is sparse (e.g., fewer than 4% nonzeros for 10Kx10K matrices), looping over the zeros of the temporary becomes really expensive, so you really need something sparse. After some attempts to exploit coherence to speed up std::map, I gave up and tried a custom linked list instead, which turns out to be an order of magnitude faster.

Now the best we can do is to estimate the kind of temporary (dense vector or linked list) per column of the result: indeed, we know the number of nonzeros of the corresponding column of rhs (let's call it nnzRC), and if we assume the nonzeros of lhs are uniformly spread, then we can estimate the ratio of nonzeros in the result column like this:

ratio = nnzRC * ratioLHS, where ratioLHS = lhs.nonzeros() / (lhs.cols() * lhs.rows())

> Updating the list: yesterday we vectorized sum(), giving the best
> vectorization boost we got so far (for floats, it goes 4x faster, which is
> the theoretical maximum). Today I am transposing this code to dot-product and
> I expect similar results.

Actually, there is room for some factorization of the code, since we now have the same vectorized loops in several places:
- normal product (row major * col major)
- dot product
- redux
- sum

So sum() could call redux(), the dot product could call (a.conjugate().cwiseProduct(b)).sum(), and the product could call a.row(row).cwiseProduct(b.col(col)).sum(). What do you think?
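To make the per-column estimate concrete, here is a minimal sketch of the selection heuristic. The helper name, parameter list, and the 4% threshold (taken from the 10Kx10K observation above) are all illustrative assumptions, not actual Eigen API:

```cpp
// Hypothetical helper: decide, per result column, whether a dense
// temporary is likely cheaper than a sparse (linked-list) one, using
// the estimate ratio = nnzRC * ratioLHS described above.
// The 4% default threshold is only an illustrative cutoff.
bool useDenseTemporary(int nnzRhsCol, long lhsNonZeros,
                       int lhsRows, int lhsCols,
                       double threshold = 0.04)
{
    // Estimated nonzero ratio of one result column, assuming the
    // nonzeros of lhs are uniformly spread:
    double ratioLhs = double(lhsNonZeros) / (double(lhsRows) * lhsCols);
    double ratio = nnzRhsCol * ratioLhs;
    return ratio >= threshold;
}
```

A caller would evaluate this once per column of rhs before choosing the temporary, so the test is cheap relative to the accumulation work it steers.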
> Cheers,
> Benoit
>
> On Tuesday 24 June 2008 01:06:55 Gael Guennebaud wrote:
>> Hi,
>>
>> some news from the front. First of all I implemented HashMatrix, which
>> is essentially a column vector of std::map<>. The purpose is to allow
>> random writes, which at first sounded necessary to implement
>> matrix-matrix products. First result: it is about 4 times slower
>> to fill a HashMatrix than to fill a SparseMatrix in a coherent
>> order.
>>
>> Then I implemented a <sparse column matrix> * <sparse column matrix>
>> product. The algorithm to compute res = lhs * rhs is pretty simple:
>>
>> std::vector<Scalar> tmp(res.rows());
>> for (int j=0; j<rhs.cols(); ++j)
>> {
>>   std::fill(tmp.begin(), tmp.end(), Scalar(0));
>>   // for each coeff of the column j of rhs
>>   for (Rhs::InnerIterator rhsIt(rhs, j); rhsIt; ++rhsIt)
>>   {
>>     // for each coeff of the column rhsIt.index() of lhs
>>     for (Lhs::InnerIterator lhsIt(lhs, rhsIt.index()); lhsIt; ++lhsIt)
>>       tmp[lhsIt.index()] += lhsIt.value() * rhsIt.value();
>>   }
>>   // copy, skipping the zero entries:
>>   res.col(j) = tmp;
>> }
>>
>> As you can see it's pretty simple, and the trick is to allocate a
>> single dense column to avoid many writes to a sparse array. The memory
>> cost of this temporary should not be an issue since the SparseMatrix
>> already stores a dense column vector indexing the start of each
>> column. Moreover, this algorithm still fills the result sparse matrix
>> in a coherent order, meaning that so far there is no real need for a
>> HashMatrix with random write access.
>>
>> The results are pretty good:
>>
>> 2048^3, 10% of nonzero entries:
>>
>> Eigen dense  = 1.92237
>> Eigen sparse = 0.399866
>> Eigen hash   = 1.30217
>> GMM++ sparse = 15.973
>>
>> 2048^3, 1% of nonzero entries:
>>
>> Eigen dense  = 1.86719
>> Eigen sparse = 0.0492229
>> Eigen hash   = 0.186288
>> GMM++ sparse = 0.24993
>>
>> FYI, the same algorithm without the dense temporary, using a
>> HashMatrix for the result, is much, much slower (~ one order of
>> magnitude).
>>
>> cheers,
>> gael.
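The quoted algorithm can be exercised standalone on a minimal CSC (compressed sparse column) structure. Everything below (the Csc struct, its field names, the function name) is an illustrative stand-in, not Eigen's actual SparseMatrix API; it only mirrors the dense-temporary trick described in the mail:

```cpp
#include <algorithm>
#include <vector>

// Minimal CSC matrix, illustrative only (not Eigen's SparseMatrix):
// outerIndex[j]..outerIndex[j+1] delimits column j inside the
// parallel (innerIndex, values) arrays.
struct Csc {
    int rows, cols;
    std::vector<int> outerIndex;   // size cols+1
    std::vector<int> innerIndex;   // row index of each stored value
    std::vector<double> values;
};

// res = lhs * rhs, using one dense temporary column as in the mail:
// accumulate column j of the result densely, then copy only the
// nonzeros out, so the result is still filled in coherent order.
Csc sparseProduct(const Csc& lhs, const Csc& rhs)
{
    Csc res{lhs.rows, rhs.cols, {0}, {}, {}};
    std::vector<double> tmp(lhs.rows);
    for (int j = 0; j < rhs.cols; ++j) {
        std::fill(tmp.begin(), tmp.end(), 0.0);
        // for each stored coeff of column j of rhs ...
        for (int r = rhs.outerIndex[j]; r < rhs.outerIndex[j + 1]; ++r) {
            int k = rhs.innerIndex[r];   // selects column k of lhs
            double v = rhs.values[r];
            // ... accumulate column k of lhs into the dense temporary
            for (int l = lhs.outerIndex[k]; l < lhs.outerIndex[k + 1]; ++l)
                tmp[lhs.innerIndex[l]] += lhs.values[l] * v;
        }
        // copy, skipping the zero entries
        for (int i = 0; i < lhs.rows; ++i) {
            if (tmp[i] != 0.0) {
                res.innerIndex.push_back(i);
                res.values.push_back(tmp[i]);
            }
        }
        res.outerIndex.push_back(static_cast<int>(res.innerIndex.size()));
    }
    return res;
}
```

Note how the dense temporary replaces per-entry sparse insertions with plain array writes, which is exactly why this variant beats the HashMatrix-based product by about an order of magnitude in the benchmark above.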
