Re: [eigen] sse asin implementation


I forgot to ask: do you have performance comparisons for your vectorized
version of asin?

On Wed, Apr 1, 2009 at 4:25 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> hi,
>
> let me remind you that currently the packet versions of sin, cos, exp, log
> and sqrt are enabled by default (regardless of the fast-math option).
> The vectorized versions of sin, cos and sqrt can be disabled by
> defining a preprocessor token. If there are good arguments, I'm still OK
> with changing this behavior to "disabled by default" and "enabled if
> either EIGEN_FAST_MATH or __FAST_MATH__ is defined".
>
> About the vectorization of asin, acos, etc.: indeed, I don't see many
> use cases for a "vec.cwise().asin()", but I think they can be useful
> for vectorizing more complex algorithms by hand. Perhaps a good
> compromise would be to put them in an "ExtraVectorization" module in
> unsupported/ and move some of them to the official Array module
> according to demand.
>
> gael.
>
> On Tue, Mar 31, 2009 at 6:57 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>> On Tue, Mar 31, 2009 at 10:19 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>> 2009/3/31 Rohit Garg <rpg.314@xxxxxxxxx>:
>>>> For general exponentiation, I think a cheap and easy route would be
>>>> just exp(b*log(a)).
>>>>
>>>> I looked at the Cephes library implementation of pow, and they do a lot
>>>> of hacks just to get 3 bits of precision. In the use case that you
>>>> cited, I think it's better to just have the above implementation.
>>>> It is certainly good enough for the -ffast-math case.
>>>
>>> OK, seems sensible. But I thought the main justification for a simd
>>> pow() would be performance, that it could be faster than the above
>>> formula. If that's not the case then sure, I agree with you.
>>
>> ATM, I don't know of any way to do that without going the log -> exp
>> route. Cephes does something without taking a full-blown log and
>> exp, but the general approach is the same. But yes, we'll have to flush
>> the denormals if we want any real performance benefit. Perhaps this is a
>> candidate to be vectorized only if fast-math is chosen. And
>> looking at the log code already in Eigen, ATM the user has no way to
>> select the default route if he wants the denormals to be treated with
>> respect. Silently killing them is suitable only if -ffast-math is
>> given.
>>
>>>
>>>>
>>>> BTW, what do you think of enabling the fast-math paths in Eigen when
>>>> just -ffast-math is supplied to gcc? It can be detected by the
>>>> __FAST_MATH__ macro.
>>>
>>> That seems like a good idea! Let's see what Gael thinks.
>>>
>>> Cheers,
>>> Benoit
>>>
>>>
>>>
>>
>>
>>
>> --
>> Rohit Garg
>>
>> http://rpg-314.blogspot.com/
>>
>> Senior Undergraduate
>> Department of Physics
>> Indian Institute of Technology
>> Bombay
>>
>>
>>
>
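
As a rough illustration of the two ideas quoted above (computing pow as
exp(b*log(a)), and taking that route only when fast math is requested), here
is a minimal scalar sketch. It is not Eigen code: SKETCH_FAST_MATH,
pow_via_exp_log and sketch_pow are hypothetical names made up for this
example, and the check on EIGEN_FAST_MATH / __FAST_MATH__ simply mirrors the
"enabled if either is defined" wording in the thread. A packet version would
presumably swap the scalar exp/log calls for vectorized ones.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical switch for this sketch only (not an Eigen macro).
// GCC predefines __FAST_MATH__ when -ffast-math is passed.
#if defined(EIGEN_FAST_MATH) || defined(__FAST_MATH__)
#define SKETCH_FAST_MATH 1
#else
#define SKETCH_FAST_MATH 0
#endif

// pow(a, b) computed as exp(b * log(a)).
// Assumption: a > 0; zero and negative bases are not handled here.
inline float pow_via_exp_log(float a, float b) {
    assert(a > 0.0f);
    return std::exp(b * std::log(a));
}

// Hypothetical dispatcher: take the cheap route only when fast math
// was requested, otherwise fall back to the standard library.
inline float sketch_pow(float a, float b) {
#if SKETCH_FAST_MATH
    return pow_via_exp_log(a, b);
#else
    return std::pow(a, b);
#endif
}
```

Calling sketch_pow(2.0f, 3.0f) should give a value close to 8 on either
path; the exp/log route may differ from std::pow in the last few bits, which
is the precision trade-off discussed in the thread.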


