Re: [eigen] sse asin implementation


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] sse asin implementation
From: Rohit Garg <rpg.314@xxxxxxxxx>
Date: Wed, 1 Apr 2009 20:34:44 +0530

It's about 2x faster, which is expected. SSE means 4 asins per call,
but we compute both branches for every lane of the vector, so the
effective speedup is half of that.
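
To illustrate why evaluating both branches halves the ideal 4x gain, here is a
minimal sketch. It is not the actual SSE code under discussion: Packet4f,
asin_both_branches and the 0.5 split point are hypothetical, a std::array
stands in for an SSE register, and std::asin stands in for the polynomial
each branch would really use.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane "packet" standing in for an SSE register of floats.
using Packet4f = std::array<float, 4>;

constexpr float kHalfPi = 1.57079632679f;

// Branchless-style asin: every lane evaluates BOTH ranges, then a per-lane
// comparison selects the result. That is twice the arithmetic per lane.
inline Packet4f asin_both_branches(const Packet4f& x) {
    Packet4f out{};
    for (std::size_t i = 0; i < 4; ++i) {
        const float a = std::fabs(x[i]);
        // Branch A (small |x|): direct evaluation (polynomial stand-in).
        const float small = std::asin(a);
        // Branch B (large |x|): asin(a) = pi/2 - 2*asin(sqrt((1 - a)/2)).
        const float large =
            kHalfPi - 2.0f * std::asin(std::sqrt((1.0f - a) * 0.5f));
        // Select per lane; the 0.5 threshold is an assumption of this sketch.
        const float r = (a > 0.5f) ? large : small;
        out[i] = std::copysign(r, x[i]);  // restore the sign: asin is odd
    }
    return out;
}
```

Calling asin_both_branches({0.0f, 0.25f, -0.75f, 1.0f}) should agree with
std::asin on each lane to within float rounding. In real SSE code the select
would be a mask-and-blend on the whole register rather than a per-lane loop.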

On Wed, Apr 1, 2009 at 7:56 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> I forgot: do you have performance comparisons for your vectorized
> version of asin ?
>
> On Wed, Apr 1, 2009 at 4:25 PM, Gael Guennebaud
> <gael.guennebaud@xxxxxxxxx> wrote:
>> hi,
>>
>> Let me remind you that the packet versions of sin, cos, exp, log
>> and sqrt are currently enabled by default (regardless of the fast-math
>> option). The vectorized versions of sin, cos and sqrt can be disabled by
>> defining a preprocessor token. If there are good arguments, I'm still OK
>> with changing this behavior to "disabled by default" and "enabled if
>> either EIGEN_FAST_MATH or __FAST_MATH__ is defined".
>>
>> About the vectorization of asin, acos, etc.: indeed, I don't see many
>> use cases for a "vec.cwise().asin()", but I think they can be useful
>> for vectorizing more complex algorithms by hand. Perhaps a good
>> compromise would be to put them in an "ExtraVectorization" module in
>> unsupported/ and move some of them to the official Array module
>> according to demand.
>>
>> gael.
>>
>> On Tue, Mar 31, 2009 at 6:57 PM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>>> On Tue, Mar 31, 2009 at 10:19 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>> 2009/3/31 Rohit Garg <rpg.314@xxxxxxxxx>:
>>>>> For general exponentiation, I think a cheap and easy route would be
>>>>> just exp(b*log(a))
>>>>>
>>>>> I looked at the cephes library implementation of pow and they do a lot
>>>>> of hacks just to get 3 bits of precision. In the use case that you
>>>>> have cited, I think it's better to just have the above implementation.
>>>>> It is certainly good enough for the -ffast-math case.
>>>>
>>>> OK, seems sensible. But I thought the main justification for a simd
>>>> pow() would be performance, that it could be faster than the above
>>>> formula. If that's not the case then sure, I agree with you.
>>>
>>> ATM, I don't know of any way to do that without going the log -> exp
>>> route. cephes does something without taking a full-blown log and
>>> exp, but the general approach is the same. But yes, we'll have to flush
>>> the denormals if we want any real performance benefit. Perhaps this is a
>>> candidate to be vectorized only if fast-math is chosen. And
>>> looking at the log code already in eigen, ATM the user has no way to
>>> select the default route if he wants the denormals to be treated with
>>> respect. Silently killing them is suitable only if -ffast-math is
>>> given.
>>>
>>>>
>>>>>
>>>>> BTW, what do you think of enabling the fast math paths in eigen when
>>>>> just -ffast-math is supplied to gcc. It can be detected by the
>>>>> __FAST_MATH__ macro.
>>>>
>>>> That seems like a good idea! Let's see what Gael thinks.
>>>>
>>>> Cheers,
>>>> Benoit
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Rohit Garg
>>>
>>> http://rpg-314.blogspot.com/
>>>
>>> Senior Undergraduate
>>> Department of Physics
>>> Indian Institute of Technology
>>> Bombay
>>>
>>>
>>>
>>
>
>
>
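
For reference, the exp(b*log(a)) route and the __FAST_MATH__ detection
discussed in the quoted thread can be sketched as below. This is only an
illustration under assumptions: pow_via_exp_log, pow_dispatch and kUseFastPow
are hypothetical names, not Eigen API, and the scalar std::exp / std::log
calls stand in for packet versions.

```cpp
#include <cmath>

// gcc defines __FAST_MATH__ when -ffast-math is given. EIGEN_FAST_MATH is the
// token named in the thread; treating "defined" as "enabled" is an assumption.
#if defined(__FAST_MATH__) || defined(EIGEN_FAST_MATH)
constexpr bool kUseFastPow = true;
#else
constexpr bool kUseFastPow = false;
#endif

// a^b as exp(b * log(a)). Only meaningful for a > 0. The rounding error of
// the product b*log(a) is amplified by exp, so accuracy degrades as
// |b * log(a)| grows.
inline float pow_via_exp_log(float a, float b) {
    return std::exp(b * std::log(a));
}

// Take the cheap route only when fast-math was requested; otherwise defer
// to the standard library.
inline float pow_dispatch(float a, float b) {
    return kUseFastPow ? pow_via_exp_log(a, b) : std::pow(a, b);
}
```

With this sketch, pow_dispatch(2.0f, 10.0f) should be close to 1024 on either
path, with the exp/log path typically losing a few low-order bits.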



-- 
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay
