Re: [eigen] a branch for SMP (openmp) experimentations

On Fri, Feb 26, 2010 at 10:44 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
There is also something very strange: if I change the code so that all threads pack exactly the same B_k into the same shared B' and keep the barrier, then I still don't get a correct result... (if each thread has its own B', then it's fine)

Arf, I'm too used to GPU computing, where all threads of a warp follow the same execution path. Here I realized that even though all threads have to do exactly the same amount of work, they can be totally desynchronized: different threads can reach the barrier while working on different horizontal panels B_k of B! To be more precise, the outermost loop looks like this:

// executed by every thread of the parallel region
for(k=0; k<nb_k; ++k)
{
   pack_b(k);   // pack the horizontal panel B_k

   #pragma omp barrier

   // here some threads have k=0 while others have k=1....
}

I guess that means that packing b is faster than creating a thread, and so the first barrier occurs before all threads have been launched! So we really have to take care with how we synchronize the threads.
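To make the synchronization issue concrete, here is a minimal sketch of one way to share a single B' safely. It is not the code from the branch: the function name panel_gemm, the row-major layout, and the parameters m, n, depth and kc are all assumptions made for illustration. The idea is to use two barriers per iteration, one after packing so that nobody reads a half-packed panel, and one after the product so that nobody starts overwriting B' with panel k+1 while another thread is still reading panel k.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch (not Eigen's implementation): C = A * B, row-major,
// A is m x depth, B is depth x n. B is processed in horizontal panels of
// at most kc rows, and all threads share one packed buffer Bp (the B').
std::vector<double> panel_gemm(const std::vector<double>& A,
                               const std::vector<double>& B,
                               int m, int n, int depth, int kc)
{
    assert(kc > 0);
    std::vector<double> C(static_cast<std::size_t>(m) * n, 0.0);
    std::vector<double> Bp(static_cast<std::size_t>(kc) * n); // shared B'
    const int nb_k = (depth + kc - 1) / kc;

    #pragma omp parallel
    {
        // k, k0 and kb are declared inside the region, so each thread has its own.
        for (int k = 0; k < nb_k; ++k)
        {
            const int k0 = k * kc;
            const int kb = std::min(kc, depth - k0);

            // The threads cooperatively pack panel k into the shared buffer.
            #pragma omp for nowait
            for (int p = 0; p < kb; ++p)
                for (int j = 0; j < n; ++j)
                    Bp[p * n + j] = B[(k0 + p) * n + j];

            // Barrier 1: panel k is completely packed before anyone reads it.
            #pragma omp barrier

            // Each thread updates its own rows of C using the shared panel.
            #pragma omp for nowait
            for (int i = 0; i < m; ++i)
                for (int p = 0; p < kb; ++p)
                {
                    const double a = A[i * depth + k0 + p];
                    for (int j = 0; j < n; ++j)
                        C[i * n + j] += a * Bp[p * n + j];
                }

            // Barrier 2: everyone is done reading panel k before the next
            // iteration starts overwriting Bp with panel k+1.
            #pragma omp barrier
        }
    }
    return C;
}
```

Built without OpenMP support the pragmas are ignored and the function runs serially with the same result, which makes it easy to compare against the threaded build. With per-thread B' buffers neither barrier would be needed, at the cost of packing the same panel once per thread.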

gael

