Re: [eigen] checking for NaN
Benoit,
You're correct, of course, but the articles I sent explain that the default x == x test is slower than hand-crafted
SSE optimizations. Is your suggestion regarding the Eigen array operations faster than doing 4 individual
checks? Is array1 == array2 nothing more than for (int i = 0; i < 3; ++i) array1[i] == array2[i], or are you comparing the
__float128 values directly, or is there a special SSE optimization going on under the hood? The article at
http://locklessinc.com/articles/classifying_floats/ says that comparing __float128 is an order of magnitude slower.
Thanks,
Radu.
On 01/05/2012 06:38 PM, Benoit Jacob wrote:
> 2012/1/5 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> Notice that isfinite can be implemented simply in terms of plain arithmetic:
>>
>> bool isnotnan(float x) { return x == x; }
>>
>> bool isfinite(float x) { return isnotnan(x-x); }
>>
>> So you can implement that using Eigen array operations. I don't
>> remember if we already have a utility function for that.
>>
>> template<typename Derived>
>> bool isnotnan(const ArrayBase<Derived>& x)
>> {
>> return x == x;
>> }
>>
>> template<typename Derived>
>> bool isfinite(const ArrayBase<Derived>& x)
>> {
>> return isnotnan(x - x);
>> }
>
> And to get the expression you mentioned in your email (a single bool,
> true if all entries are finite), you just have to call .all() on the
> result.
>
> Benoit
>
>>
>> Cheers,
>> Benoit
>>
>> 2012/1/5 Radu B. Rusu <rusu@xxxxxxxxxxxxxxxx>:
>>> I was wondering if there's a need for optimized versions for isNaN, isFinite, etc. I caught Benoit talking about it
>>> earlier last year (http://forum.kde.org/viewtopic.php?f=74&t=96166), and did a bit of searching online to find
>>> discussions here: http://locklessinc.com/articles/classifying_floats/ and here:
>>> https://bugzilla.mozilla.org/show_bug.cgi?id=416287
>>>
>>> Due to the nature of our data, we call isNaN extensively on large datasets, running things like:
>>>
>>> return (pcl_isfinite (p.x) && pcl_isfinite (p.y) && pcl_isfinite (p.z));
>>>
>>>
>>> The article from locklessinc seemed interesting with respect to an optimized SSE version for Vector4f.
>>>
>>> Thanks,
>>> Radu.
>>>
>>>
>>>
>
>