[eigen] OpenMP implementation of Matrix*Vector operation


Hi everyone,

        I'm trying to use Eigen to solve some linear algebra equations
iteratively, so the Matrix*Vector operation is quite common. As far as I
know, in Eigen the OpenMP parallelization is only implemented for
matrix*matrix multiplication (tell me if I'm wrong).

        However, in my case the matrix is often moderately large
(typically several thousand rows, or even hundreds of thousands for the
sparse case), so it is quite necessary to take advantage of a multicore
CPU.
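For the sparse case, the row-partitioning idea can be sketched even without Eigen. Below is a minimal standalone sketch assuming a CSR (compressed sparse row) layout; the `Csr` struct and `spmv` name are hypothetical, not from any library. Since each row of the result is written by exactly one iteration, a plain parallel-for over rows is race-free:

```cpp
#include <vector>

// Hypothetical CSR container: rowptr[i]..rowptr[i+1] index the stored
// entries of row i in the col/val arrays.
struct Csr {
    std::vector<int> rowptr;   // size nrows + 1
    std::vector<int> col;      // column index of each stored entry
    std::vector<double> val;   // value of each stored entry
};

// Sparse matrix * dense vector; rows are independent, so OpenMP can
// distribute them freely (the pragma is a no-op without -fopenmp).
std::vector<double> spmv(const Csr& a, const std::vector<double>& v) {
    int nrows = (int)a.rowptr.size() - 1;
    std::vector<double> s(nrows, 0.0);
    #pragma omp parallel for
    for (int i = 0; i < nrows; ++i) {
        double acc = 0.0;
        for (int k = a.rowptr[i]; k < a.rowptr[i + 1]; ++k)
            acc += a.val[k] * v[a.col[k]];
        s[i] = acc;              // only this iteration writes s[i]
    }
    return s;
}
```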

        I've heard this is on the schedule, so how is it going now? Are
there any benchmark results with respect to matrix scale?


       ps: Actually I've implemented a simple version by partitioning
the matrix into several "blocks", but it turns out to work well only for
scales of roughly 100~2000, and it's much slower at larger scales (i.e.,
no better than the serial code).

Here is my code snippet:
(built by g++ with flags -O2 -fopenmp -march=native)

int N = 1000; // scale
Matrix<double,Dynamic,Dynamic,RowMajor> m =
    Matrix<double,Dynamic,Dynamic,RowMajor>::Random(N,N);
VectorXd v = VectorXd::Random(N);
VectorXd s(N);

#pragma omp parallel
{
        int nthreads = omp_get_num_threads();
        int rank = omp_get_thread_num();
        int chunk = (N + nthreads - 1) / nthreads;   // rows per thread, rounded up
        int i0 = rank * chunk;                       // first row of this thread's block
        int i1 = (rank + 1) * chunk < N ? (rank + 1) * chunk : N;
        int in = i1 - i0;                            // rows actually handled here

        if (in > 0)
                s.segment(i0, in) = m.block(i0, 0, in, N) * v;
}

Is there a better way to do this?
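For comparison, the simplest variant I can think of is to drop the manual chunking entirely and let OpenMP distribute whole rows. A minimal standalone sketch in plain C++ (no Eigen; the `matvec` name is hypothetical, row-major storage assumed, and the pragma degrades to serial code without -fopenmp):

```cpp
#include <vector>

// Dense matrix * vector with one row per loop iteration: each iteration
// writes a distinct s[i], so the parallel-for has no write conflicts and
// OpenMP's scheduler handles the partitioning that my code does by hand.
std::vector<double> matvec(const std::vector<double>& m,
                           const std::vector<double>& v, int N) {
    std::vector<double> s(N, 0.0);
    #pragma omp parallel for
    for (int i = 0; i < N; ++i) {
        double acc = 0.0;
        for (int j = 0; j < N; ++j)
            acc += m[(std::size_t)i * N + j] * v[j];  // row i, row-major
        s[i] = acc;
    }
    return s;
}
```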
Thanks very much!
