Re: [eigen] Help on solving a race condition
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Help on solving a race condition
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Mon, 11 Jun 2012 19:17:49 +0200
After some more thought, I'm no longer sure about the thread-local
storage solution. It seems quite fragile because it requires
compiler- and platform-specific code.
So I propose to simply rewrite the code like this:
    static std::ptrdiff_t m_l1CacheSize = 0;
    static std::ptrdiff_t m_l2CacheSize = 0;
    static bool initialized = false;
    if (!initialized)
    {
      m_l1CacheSize = manage_caching_sizes_helper(queryL1CacheSize(), 8 * 1024);
      initialized = true;
    }
With this solution it might happen that m_l1CacheSize is overwritten
at the same time it is read by another thread, but since it gets
overwritten by the same value this cannot be an issue, right?
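For comparison, here is a minimal sketch of the same "every thread may overwrite with the same value" idea written with a C++11 `std::atomic`, which makes the concurrent read and write well defined under the C++11 memory model. This is not the code proposed above: `queryL1CacheSizeStub`, `l1CacheSize`, and `readFromThreads` are hypothetical names, and the stub returns a fixed 32 KiB instead of querying the hardware.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real cache-size query (assumption: 32 KiB).
static std::atomic<int> g_queryCalls{0};
static std::ptrdiff_t queryL1CacheSizeStub() {
  ++g_queryCalls;  // count how many threads ended up running the query
  return 32 * 1024;
}

// Lazily initialized cache size. Several threads may race and each run the
// query, but they all store the same value, and the atomic makes the
// concurrent store/load well defined.
static std::ptrdiff_t l1CacheSize() {
  static std::atomic<std::ptrdiff_t> cached{0};  // 0 means "not computed yet"
  std::ptrdiff_t v = cached.load(std::memory_order_relaxed);
  if (v == 0) {
    v = queryL1CacheSizeStub();
    assert(v > 0);  // 0 is reserved as the "not computed" marker
    cached.store(v, std::memory_order_relaxed);
  }
  return v;
}

// Reads the cache size from nThreads threads at once; each thread writes
// only to its own slot of the result vector.
static std::vector<std::ptrdiff_t> readFromThreads(int nThreads) {
  std::vector<std::ptrdiff_t> results(static_cast<std::size_t>(nThreads));
  std::vector<std::thread> workers;
  for (int i = 0; i < nThreads; ++i)
    workers.emplace_back([&results, i] { results[i] = l1CacheSize(); });
  for (auto& w : workers) w.join();
  return results;
}
```

Calling `readFromThreads(8)` should give the same size in every slot. How many times the stub runs depends on scheduling, which is the trade-off of this approach: redundant queries are tolerated in exchange for needing no lock and no compiler-specific storage class.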
On Fri, Jun 8, 2012 at 8:54 PM, Gael Guennebaud wrote:
> On Fri, Jun 8, 2012 at 6:13 PM, Rhys Ulerich <rhys.ulerich@xxxxxxxxx> wrote:
>>> Then I guess the only safe and clean solution is to request users to
>>> call a Eigen::init_parallel() or something.
>> That's safe but I'd not call it clean. It's error prone. And verbose.
>>> The problem with this approach is that the cache sizes are recomputed
>>> for every thread.
>> That doesn't seem terrible to me if I don't have to explicitly call
>> Eigen::init_parallel() as a tradeoff. Is this a slow process?
> as I said in a previous email, after some benchmarking, this approach
> is not that slow: compared to the thread creation cost it's a no-op.
>>>> This solution is openmp specific; what if someone wants to use pthreads or
>>>> some other threading system ?
>> Why not #ifdef on compiler versions to dig out the vendor-specific
>> thread-local storage extension when OpenMP isn't active:
>> #if defined(_OPENMP)
>> # define TLS
>> #elif defined(__GNUC__) && __GNUC__ > 3
>> # define TLS __thread
>> #elif ...
>> # define TLS ...
>> #else
>> # define TLS
>> #endif
>> static TLS std::ptrdiff_t m_l1CacheSize = 0;
>> static TLS std::ptrdiff_t m_l2CacheSize = 0;
>> #pragma omp threadprivate(m_l1CacheSize, m_l2CacheSize)
>> #ifdef TLS
>> #undef TLS
>> #endif
>> Notice that no vendor-specific thread-local keyword is used when OpenMP
>> is active, because OpenMP is the better way to go.
> ah great, I did not know there existed such compiler extensions.
> That's a lead to pursue.
>> It's a bit ugly from the implementer's side, but the number of
>> vendor-specific thread-local keywords should be small. From the end
>> user's perspective, however, it is seamless.
>> - Rhys
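The vendor-specific keywords discussed in the quoted message have a standard counterpart in C++11, `thread_local`. The sketch below illustrates the per-thread variant and the recomputation cost raised in the thread: every new thread runs the query once, then reuses its own copy. It assumes a C++11 compiler; `queryL1CacheSizePerThreadStub`, `l1CacheSizePerThread`, and `countQueriesAcrossThreads` are hypothetical names, and the stub returns a fixed value instead of querying the hardware.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real cache-size query (assumption: 32 KiB).
static std::atomic<int> g_perThreadQueries{0};
static std::ptrdiff_t queryL1CacheSizePerThreadStub() {
  ++g_perThreadQueries;  // count how many times the query actually runs
  return 32 * 1024;
}

// Each thread owns a private copy of `cached`, so there is no shared write
// and no race, at the price of one query per thread.
static std::ptrdiff_t l1CacheSizePerThread() {
  thread_local std::ptrdiff_t cached = 0;  // 0 means "not computed yet"
  if (cached == 0) cached = queryL1CacheSizePerThreadStub();
  return cached;
}

// Spawns nThreads threads, each asking for the size twice, and returns how
// many queries were run in total.
static int countQueriesAcrossThreads(int nThreads) {
  g_perThreadQueries.store(0);
  std::vector<std::thread> workers;
  for (int i = 0; i < nThreads; ++i)
    workers.emplace_back([] {
      l1CacheSizePerThread();  // first call in this thread runs the query
      l1CacheSizePerThread();  // second call reuses the thread's own copy
    });
  for (auto& w : workers) w.join();
  return g_perThreadQueries.load();
}
```

With four threads, `countQueriesAcrossThreads(4)` reports four queries: one per thread, regardless of how often each thread asks.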