Re: [eigen] Non-optimal sse assembly code with gcc
DAZ (denormals-are-zero) and FTZ (flush-to-zero) give a big speedup on
Intel machines, since handling denormals is really slow (and often only
propagates rounding errors).
Enabling these modes is something Eigen users can do by themselves, but
it might be worth a mention in some documentation somewhere.
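#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr
_mm_setcsr( _mm_getcsr() | (1<<15)); // FTZ: flush denormal results to zero (MXCSR bit 15)
_mm_setcsr( _mm_getcsr() | (1<<6)); // DAZ: treat denormal inputs as zero (MXCSR bit 6)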
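For anyone who prefers named macros over raw bit shifts, here is a minimal sketch of the same thing. It assumes an x86 target where float math goes through SSE (e.g. x86-64, or -mfpmath=sse on 32-bit); the helper names enable_ftz_daz and product_is_zero_after_underflow are made up for illustration and are not part of Eigen. MXCSR is per thread, so this only affects the calling thread.

```cpp
#include <xmmintrin.h>  // _mm_getcsr, _mm_setcsr, _MM_SET_FLUSH_ZERO_MODE
#include <pmmintrin.h>  // _MM_SET_DENORMALS_ZERO_MODE
#include <cfloat>       // FLT_MIN

// Hypothetical helper: turn on FTZ and DAZ for the calling thread and
// return the previous MXCSR value so the caller can restore it later.
inline unsigned int enable_ftz_daz() {
    unsigned int old_csr = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);          // MXCSR bit 15
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);  // MXCSR bit 6
    return old_csr;
}

// Hypothetical probe: FLT_MIN * 0.5f underflows into the denormal range.
// With FTZ off the result is a nonzero denormal; with FTZ on it is
// flushed to zero. volatile keeps the compiler from folding the product.
inline bool product_is_zero_after_underflow() {
    volatile float tiny = FLT_MIN;
    volatile float half = 0.5f;
    volatile float result = tiny * half;
    return result == 0.0f;
}
```

Calling _mm_setcsr with the value returned by enable_ftz_daz() puts the old mode back, which is handy if only one hot loop should run with denormals flushed.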
On 01/23/2012 07:19 AM, Christoph Hertzberg wrote:
On 23.01.2012 12:54, Benjamin Schindler wrote:
I would love to know whether your rationale on denormalized numbers is
correct though. Does anybody else know exactly?
I found this article (it is 3 years old, though):