Re: [eigen] A not so simple product
• To: eigen@xxxxxxxxxxxxxxxxxxx
• Subject: Re: [eigen] A not so simple product
• From: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>
• Date: Tue, 9 Dec 2008 15:06:15 +0100

Hi,

I guess there is no M and you wanted to write "VectorXf w(N);" instead of "VectorXf w(M);" ?

Anyway, you can write it as a single _expression_ like that:

m = ( (rho.cwise()*w).cwise().inverse().asDiagonal() ) * m;

the matrix-matrix product will be optimized since Eigen knows that the left-hand side (lhs) matrix is a diagonal matrix. Unfortunately, Eigen does not take into account yet the fact that the diagonal of lhs is actually stored as a vector, and therefore this _expression_ won't be vectorized.

The good think is that this is an optimization we can quite easily add. So I would recommand to do so and wait a bit that such expressions get vectorized by Eigen.

Note that in the meantime, if performance matters you can also do:

rho (or w, or a temporary) = rho.cwise()*w).cwise().inverse();
for (int k=0: k<3; ++k)
m.col(i).cwise() *= rho (or w, or a temporary);

in which case all computations will be vectorized at the cost of additionnal read/writes operations.

Below are some proposals related to that problem:

Another option would be to add a mechanism to extend a vector to a matrix with repetitions:

m = m .cwise()* ( (rho.cwise()*w).cwise().inverse().duplicate(3) );

I'm not sure about "duplicate" for the name but I know matlab offers such a feature. For the vectorization this is a bit tricky because such a "Duplicate" _expression_ can be vectorized in both direction: you can either return column packets of row packets (by repeating a scalar in each component of the packet) to accomodate the storage of m. However, this fine tunning of the packet generation is something we cannot support without big changes in Eigen (or ugly workarounds).

Finally, a third solution to implement this kind of thing could be to extend the rowwise/colwise API:

m = m.colwise() * (rho.cwise()*w).cwise().inverse();

perhaps this last proposal is a bit confusing.

In any case the choice for the traversal of m is a bit tricky. Assuming m is column major, we can process it column per column that is good wrt to the cache of m but which requires a temporary to store "rho.cwise()*w).cwise().inverse()", or we process m per block of packet-size rows...

cheers,
Gael.

On Tue, Dec 9, 2008 at 2:10 PM, Benjamin Schindler wrote:
Hi

Okay, I'm trying to figure the most efficient way to vectorize this product:

MatrixXf m(N,3);
VectorXf rho(N);
VectorXf w(M);

What I would like to do: The i'th row of m is suppposed to get multiplied by 1 / (rho(i)*w(i))

I've not yet found a way which actually works as a short eigen-_expression_... is there any or do I have to go back to the old rusty for loop? :)

Thanks
Benjamin

---

 Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/