Hi everyone,

I'm trying to use Eigen to solve linear algebra equations iteratively, so the matrix*vector operation is very common in my code. As far as I know, Eigen's OpenMP parallelization is only implemented for matrix*matrix multiplication (tell me if I'm wrong). However, my matrices are moderately large (typically several thousand rows, or even hundreds of thousands for sparse matrices), so I really need to take advantage of a multicore CPU. I've heard that parallel matrix*vector is on the schedule, so how is it going now? Are there any benchmark results with respect to matrix size?

Thanks~

PS: I've actually implemented a simple version by partitioning the matrix into several row "blocks", but it only works well for sizes of about 100~2000. For larger sizes it's much slower (i.e., no better than the serial code). Here is my code snippet (built by g++ with flags -O2 -fopenmp -march=native):

    // needs <Eigen/Dense> and <omp.h>, plus: using namespace Eigen;
    int N = 1000; // Scale
    Matrix<double,Dynamic,Dynamic,RowMajor> m = Matrix<double,Dynamic,Dynamic>::Random(N,N);
    VectorXd v = VectorXd::Random(N);
    VectorXd s(N);

    // one row block per thread; the block index is the thread rank
    #pragma omp parallel
    {
        int nthreads = omp_get_num_threads();
        int rank = omp_get_thread_num();
        int chunk = (N + nthreads - 1) / nthreads; // rows per thread, rounded up
        int i0 = rank * chunk;
        int i1 = (rank + 1) * chunk < N ? (rank + 1) * chunk : N;
        int in = i1 - i0;
        if (in > 0) // trailing threads can get an empty block when N is small
            s.segment(i0, in) = m.block(i0, 0, in, N) * v;
    }

Is there a better way to do this? Thanks very much!
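To make the row partitioning described above easier to check on its own, here is a minimal sketch that does not depend on Eigen. It uses a plain row-major std::vector in place of an Eigen matrix, and the names chunk_bounds and matvec_rows are made up for this illustration (they are not part of Eigen or OpenMP). It only shows the rounded-up chunk arithmetic and a row-wise parallel loop, and it makes no claim about performance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: half-open row range [begin, end) owned by thread `rank`
// when `n` rows are split into chunks of ceil(n / nthreads) rows.
// Clamping with std::min keeps trailing ranges empty instead of negative.
std::pair<std::size_t, std::size_t> chunk_bounds(std::size_t n,
                                                 std::size_t nthreads,
                                                 std::size_t rank) {
    assert(nthreads > 0 && rank < nthreads);
    const std::size_t chunk = (n + nthreads - 1) / nthreads;
    const std::size_t begin = std::min(rank * chunk, n);
    const std::size_t end = std::min(begin + chunk, n);
    return {begin, end};
}

// Hypothetical helper: y = A * x for an n-by-n matrix stored row-major in `a`.
std::vector<double> matvec_rows(const std::vector<double>& a,
                                const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(a.size() == n * n);
    std::vector<double> y(n, 0.0);
    // Each row is an independent dot product, so rows can be shared between
    // threads. Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < static_cast<long>(n); ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        double acc = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            acc += a[row * n + j] * x[j];
        }
        y[row] = acc;
    }
    return y;
}
```

For example, chunk_bounds(10, 4, 3) gives the range [9, 10), and chunk_bounds(5, 4, 3) gives the empty range [5, 5), which is the case where the unclamped arithmetic in my snippet would produce a negative block size.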

