Re: [eigen] Efficiently creating a matrix of pairwise vector differences.


On Wed, Dec 20, 2017 at 3:44 PM Smith, Louis <Louis_Smith@xxxxxxxxxxxxxxxxxx> wrote:


I'm trying to use Eigen to compute the distances between M-length vectors i and j, which are each rows in an NxM matrix (note that M is often much smaller than N; in my test case N is about 250,000 and M is 6). What I'm currently working with is an expression like:

MatrixXd data = ...; // This works when data is written to cout, so elided.

MatrixXd distances = (data.rowwise() - data.transpose().colwise().transpose()).norm();

Which gives me the following error:

error: no member named 'transpose' in 'Eigen::VectorwiseOp<Eigen::Transpose<Eigen::Matrix<double, -1, -1, 0, -1, -1> >, 0>'
  MatrixXd distances = (data.rowwise() - data.transpose().colwise().transpose()).norm();

When I get rid of the third transpose, I'm then subtracting a column vector from a row vector, which also doesn't work. Calling transpose on the row vector gives a similar error.

What I expect distances to be is an NxN symmetric matrix containing the distances (norms of difference vectors) for each pair of row vectors in data.

Sorry for the newbie question, but I'd really appreciate some insight on this, since it seems like there should be a more Eigen-like way to write this than the double for loop over the data, which also works but is very slow.

If you're writing a 250000x250000 matrix, I don't know how fast you think it should be. If you did a GEMM between 250000x6 and 6x250000 matrices, it is likely limited by write bandwidth, because the inner loop is only 1-6 cycles on modern hardware. You can compare GEMM to triple loops for these dimensions to verify.

I suspect a double loop is actually going to win, especially if you convince the compiler to generate non-temporal stores (assuming x86) to minimize RFO (read-for-ownership) traffic.

Jeff Hammond
