|Re: [eigen] Eigen SSE denorm modes?|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Eigen SSE denorm modes?
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Sat, 10 Mar 2012 16:42:16 +0100
- Cc: Mark Borgerding <mark@xxxxxxxxxxxxxx>
These settings are not persistent but bound to a process.
Regarding the addition of a cross platform function to enable/disable
denorms, I'd say yes. In the same vein, a cross-platform function to
catch floating-point exceptions would be nice too.
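A minimal sketch of what such an enable/disable helper could look like, assuming an x86 target with SSE and falling back to a no-op elsewhere. The class name ScopedFlushDenormals and the DENORM_GUARD_HAS_SSE macro are hypothetical and not part of Eigen. The guard saves the MXCSR control word, sets FTZ and DAZ, and restores the saved value on scope exit, which is the save/restore pattern asked about in the quoted message below. MXCSR belongs to each thread's register state, so the guard only affects the calling thread. On very old SSE-only processors the DAZ bit may not be supported, and this sketch does not check for that.

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
  #include <xmmintrin.h>   // _mm_getcsr, _mm_setcsr
  #define DENORM_GUARD_HAS_SSE 1
#else
  #define DENORM_GUARD_HAS_SSE 0   // assumption: no MXCSR on this target, guard is a no-op
#endif

// Hypothetical helper (not an Eigen API): enables FTZ and DAZ for the
// lifetime of the object, then restores the previous control word.
class ScopedFlushDenormals {
public:
  ScopedFlushDenormals() {
#if DENORM_GUARD_HAS_SSE
    saved_ = _mm_getcsr();                 // remember the caller's settings
    _mm_setcsr(saved_ | kFtz | kDaz);      // turn both modes on
#endif
  }

  ~ScopedFlushDenormals() {
#if DENORM_GUARD_HAS_SSE
    _mm_setcsr(saved_);                    // put the old control word back
#endif
  }

  ScopedFlushDenormals(const ScopedFlushDenormals&) = delete;
  ScopedFlushDenormals& operator=(const ScopedFlushDenormals&) = delete;

  // True when the guard actually changes anything on this platform.
  static bool supported() { return DENORM_GUARD_HAS_SSE != 0; }

private:
  static constexpr unsigned kFtz = 1u << 15;  // flush-to-zero bit of MXCSR
  static constexpr unsigned kDaz = 1u << 6;   // denormals-are-zero bit of MXCSR
  unsigned saved_ = 0;
};
```

Typical use would be to declare a ScopedFlushDenormals object at the top of a block that runs the numerical kernel; the previous modes come back automatically when the block ends, even if an exception is thrown.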
On Fri, Mar 9, 2012 at 9:08 PM, Dick Lyon <dicklyon@xxxxxxxxxx> wrote:
> Thanks, Mark, for the info on the SSE control intrinsics. I presume these
> are Intel-only; does Eigen have a way to wrap them that doesn't make the code
> architecture dependent?
> Do these settings have persistent effect? On the processor? or the
> process? Apple seems to say that you should restore the old control word
> after doing your thing, but it's not clear to me how this is scoped:
> On Fri, Mar 9, 2012 at 11:37 AM, Mark Borgerding <mark@xxxxxxxxxxxxxx> wrote:
>> On 03/08/2012 04:22 PM, Dick Lyon wrote:
>>> Is there a way to control how Eigen deals with denormalized floats?
>>> There are low-level SSE registers to control flushing to zero and such,
>>> but does anyone know how to control them, or what the defaults are?
>> DAZ and FTZ give a big speedup on Intel machines, since handling
>> denormals is really slow (and often only propagates rounding errors).
>> You can enable these modes yourself:
>> _mm_setcsr( _mm_getcsr() | (1<<15)); // FTZ
>> _mm_setcsr( _mm_getcsr() | (1<<6)); // DAZ
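To see what the FTZ bit does in practice, here is a small sketch, assuming an x86 build where float arithmetic is done in SSE registers (the default on x86-64). The function name halve_smallest_normal is hypothetical. It halves FLT_MIN, whose exact result is a denormal, once with FTZ cleared and once with FTZ set, and restores the caller's control word afterwards.

```cpp
#include <cfloat>          // FLT_MIN
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>     // _mm_getcsr, _mm_setcsr

// Hypothetical demo helper: returns FLT_MIN / 2 computed with the FTZ bit
// forced on or off. The volatile variables keep the compiler from folding
// the multiplication at compile time.
float halve_smallest_normal(bool flush_to_zero) {
  const unsigned saved = _mm_getcsr();
  _mm_setcsr(flush_to_zero ? (saved | (1u << 15))     // set FTZ
                           : (saved & ~(1u << 15)));  // clear FTZ
  volatile float x = FLT_MIN;   // smallest normal float
  volatile float y = x * 0.5f;  // true result is denormal
  _mm_setcsr(saved);            // restore the caller's modes
  return y;
}
#endif
```

With FTZ set the call should return 0.0f, and with FTZ cleared it should return the denormal value FLT_MIN / 2. DAZ is the input-side counterpart: it makes denormal operands read as zero rather than flushing results.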