Re: [eigen] Help on solving a race condition


OK, after running some benchmarks, a critical section is indeed a no-go:
it causes a 10x slowdown.

On the other hand, it seems that the simple solution of making them
thread-private:

#pragma omp threadprivate(m_l1CacheSize,m_l2CacheSize)

has almost no impact on performance. Each thread then queries the cache
sizes once, and compared to the cost of creating a thread that query is
negligible. So I will probably go with that simple solution.
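
For reference, here is a minimal sketch of how the thread-private variant could look. It is only an illustration: query_l1_cache_size(), query_top_level_cache_size() and get_cache_sizes() are hypothetical stand-ins, and the stubs simply return the fallback defaults quoted in this thread instead of querying the hardware.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the real hardware queries. They return the
// fallback defaults quoted in this thread, not measured cache sizes.
inline std::ptrdiff_t query_l1_cache_size() { return 8 * 1024; }
inline std::ptrdiff_t query_top_level_cache_size() { return 1 * 1024 * 1024; }

// Returns the cached sizes through l1 and l2, running the queries on the
// first call made by each thread.
inline void get_cache_sizes(std::ptrdiff_t& l1, std::ptrdiff_t& l2)
{
  // Zero-initialized statics; 0 means "not queried yet on this thread".
  static std::ptrdiff_t m_l1CacheSize = 0;
  static std::ptrdiff_t m_l2CacheSize = 0;
  // With OpenMP enabled, each thread gets its own copy, so no thread reads
  // a value that another thread is writing. Without OpenMP the pragma is
  // ignored and this is a plain lazy initialization.
  #pragma omp threadprivate(m_l1CacheSize, m_l2CacheSize)

  if (m_l1CacheSize == 0)
  {
    m_l1CacheSize = query_l1_cache_size();
    m_l2CacheSize = query_top_level_cache_size();
  }
  assert(m_l1CacheSize > 0 && m_l2CacheSize > 0);
  l1 = m_l1CacheSize;
  l2 = m_l2CacheSize;
}
```

The trade-off is one redundant query per thread in exchange for having no synchronization at all on the path taken by every matrix product.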

thanks for all the suggestions!

gael.

On Fri, Jun 8, 2012 at 4:57 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> On Fri, Jun 8, 2012 at 4:35 PM, Hauke Heibel
> <hauke.heibel@xxxxxxxxxxxxxx> wrote:
>
>> #pragma omp critical
>> {
>>  static std::ptrdiff_t m_l1CacheSize =
>>    manage_caching_sizes_helper(queryL1CacheSize(), 8 * 1024);
>>  static std::ptrdiff_t m_l2CacheSize =
>>    manage_caching_sizes_helper(queryTopLevelCacheSize(), 1 * 1024 * 1024);
>> }
>
> Yes, that would work, but a mutex introduces too much overhead. This
> function is called for every matrix product!
>
> It seems OpenMP's support for atomics is not very good. I managed to
> make helgrind happy with the following:
>
>  static tbb::atomic<std::ptrdiff_t> m_l1CacheSize;
>  static tbb::atomic<std::ptrdiff_t> m_l2CacheSize;
>  if(!m_l1CacheSize)
>  {
>    std::ptrdiff_t l1 =
>      manage_caching_sizes_helper(queryL1CacheSize(), 8 * 1024);
>    m_l1CacheSize.fetch_and_store(l1);
>  }
>  if(!m_l2CacheSize)
>  {
>    std::ptrdiff_t l2 =
>      manage_caching_sizes_helper(queryTopLevelCacheSize(), 1 * 1024 * 1024);
>    m_l2CacheSize.fetch_and_store(l2);
>  }
>
> but 1) this requires Intel's TBB, 2) I still have to measure the
> overhead of this solution.
>
>
> best,
> Gael.
>
> gael.
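
As a side note on the atomic variant quoted above: with a C++11 compiler the same idea can be sketched with std::atomic instead of tbb::atomic, which removes the TBB dependency. This swaps in a different library from the one used in the quoted code, its overhead is not measured here, and the names query_l1_cache_size_stub() and cached_l1_size() are hypothetical.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the real L1 query. It returns the fallback
// default quoted in this thread, not a measured cache size.
inline std::ptrdiff_t query_l1_cache_size_stub() { return 8 * 1024; }

// Lazily caches the L1 size in an atomic shared by all threads.
inline std::ptrdiff_t cached_l1_size()
{
  // Constant-initialized to 0, which means "not queried yet".
  static std::atomic<std::ptrdiff_t> size(0);

  std::ptrdiff_t value = size.load(std::memory_order_relaxed);
  if (value == 0)
  {
    // Several threads may reach this point at the same time. Each one
    // runs the query and stores the same result, so the duplicated work
    // is harmless and no lock is needed.
    value = query_l1_cache_size_stub();
    size.store(value, std::memory_order_relaxed);
  }
  return value;
}
```

Relaxed ordering is used because the cached value stands on its own and does not publish any other data.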


