Re: [eigen] log10 support? |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] log10 support?
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Tue, 8 Jun 2010 07:27:48 -0400
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=eiuIwHdaDgTGvxxpH+75OtNFqC70fti/Ai6x8vop+K8=; b=pz1kBP/6XFtJQcmUarromPox14m44RBbdXxvMbV4Z1M1JKa1NcD1S2tCEQnlHcd0Y2 qwnzgxwrQ4LHHJlnnk0xzgKhb1AmH73WxYPVCLLCuIgq6nb17GdMjAh6YoY+PdgXn7mo 7YXlZwFOdU/B/eH6AaPFRl6/GBb91ASF3C/yQ=
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=XMHXHb2Ssu2BKfGSXgJci0+SgA2lSvSgRBe3OtQqMYBryP1+VjJzC/yIWy56dWdUx8 Go1NvL+921lAi6Y4z1af8xgeiFfjSF/K+BHJumQyuoKXjpwif+TetZlkqfSF4PaIlXDT JNUSZoO94B/rAoewwRlA/HkLzzEa9Qymhvcr4=
oh yes and of course, + Gael's remark.
Benoit
2010/6/8 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>:
>> OK, I made the changes you suggested/requested. Non SSE is still calling
>> std::log10. Is this what you wanted?
>
> Yes, this is good.
>
> But I have a couple of remarks:
>
> +/** \internal \returns the log10 of \a a (coeff-wise) */
> +template<typename Scalar, typename Packet> inline static Packet
> ei_plog10(Packet a)
> +{ return ei_pmul(ei_pset1(Scalar(1)/log(Scalar(10))), ei_plog(a)); }
> +
>
> This function should take only the template parameter Packet. Indeed
> it's going to be called as
>
> ei_plog10(somepacket)
>
> and automatic template param determination will only work if it only
> takes a Packet templ param. That means that you need to be able to
> determine the Scalar type from the Packet type. For this, do:
>
> typedef typename ei_unpacket_traits<Packet>::type Scalar;
>
> also, the caching of the constant value needs to go here. That may
> mean that you have to do it yourself, without the macro. Fortunately,
> it's easy: this macro was doing:
>
> #define _EIGEN_DECLARE_CONST_Packet4f(NAME,X) \
> const Packet4f ei_p4f_##NAME = ei_pset1<float>(X)
>
> so just do yourself:
>
> const Packet myconstant = ei_pset1(blabla);
>
> (again, if blabla has type Scalar, ei_pset1<Scalar> will be selected.)
>
> Then, this:
>
> +static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x)
> +{
> + _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, 4.342944819032518721E-1f);
> + return ei_pmul(ei_p4f_invlog_0p1, ei_plog(x));
> +}
> +
>
> is not needed. My point was that this can all be done in a totally
> generic way, not specific to SSE, not specific to float. You don't
> have to change any file under SSE/. Your log10 function will be
> vectorized on all platforms and for all types where we vectorize log.
> That matters, because soon enough we'll probably vectorize it on other
> vectorization platforms such as NEON!
>
> The rest looks fine to me.
>
>>
>> I removed the HasLog10 and added the cost of 1 multiply to the cost in
>> Fuctors.h to reflect the multipy. Is this accurate?
>
> Oh yes it is accurate... more accurate than was needed! These costs
> only need to be very rough. One could then discuss whether in the
> non-vectorized case this is accurate since log10 could be implemented
> to be as fast as log (with a custom approximant). But your formula is
> at least accurate in the vectorized case :)
>
>>
>> Thanks for holding my hand on this.
>
> Thanks for the good work,
> Benoit
>
>>
>> -Trevor
>>
>>
>> On 7 June 2010 18:06, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>
>>> 2010/6/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> > ok so time for another answer :)
>>> >
>>> > The only thing that I really don't want is for us to add
>>> > SSE/AltiVec/whatever code for log10. We must reuse the existing code
>>> > for log there.
>>> >
>>> > Then there are two ways:
>>> > a) the way of your patch: add a new custom functor for log10
>>> > b) the way i suggested: just let log10 return a product expression.
>>> >
>>> > Your way a) has the advantage of sparing one multiplication (at least
>>> > in the non vectorized case), but b) means less extra code.
>>> >
>>> > Now Hauke raises a good point, that I overlooked: even with a) you can
>>> > still reuse our vectorization code for log10. So actually, I'm OK with
>>> > your approach, provided that we do that. You can do it generically for
>>> > all types at once (Hauke showed the way for float but there's no
>>> > reason to special-case it): edit Core/GenericPacketMath.h and
>>> > implement ei_plog10 there like the other functions, returning
>>> > ei_pmul(
>>> > ei_pset1(log(Scalar(0.1))),
>>>
>>> woooooops!!
>>>
>>> replace log(Scalar(0.1)) by Scalar(1) / log(Scalar(10))
>>>
>>> and sorry Hauke, I saw too late that you already wrote that. Though I
>>> guess there's value in insisting that it's faster to multiply by a
>>> constant, than to divide by a constant.
>>>
>>> Benoit
>>>
>>> > ei_plog(x)
>>> > )
>>> > The next problem is to cache the result of ei_pset1(log(Scalar(0.1))),
>>> > probably a static constant, maybe have a look at what's done in
>>> > ei_plog itself in SSE/. If that sounds obscure, leave that for later.
>>> >
>>> > In any case, we don't need a new HasLog10. Since we'll be using
>>> > ei_plog, all we need is the existing HasLog.
>>> >
>>> > Benoit
>>> >
>>> > 2010/6/7 Trevor Irons <trevorirons@xxxxxxxxx>:
>>> >> OK, thanks. I'm on a really old machine and can't check sse here. I'll
>>> >> test
>>> >> your code and generate another patch. Unless I hear that is is
>>> >> unwanted.
>>> >>
>>> >> (This just goes in eigenNes/Eigen/src/Core/arch/SSE/MathFunctions.h)
>>> >> right?
>>> >>
>>> >> -Trevor
>>> >>
>>> >> On 7 June 2010 15:46, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx> wrote:
>>> >>>
>>> >>> Let me try to answer for Benoit. :)
>>> >>>
>>> >>> On Mon, Jun 7, 2010 at 11:20 PM, Trevor Irons <trevorirons@xxxxxxxxx>
>>> >>> wrote:
>>> >>> > Sure, I can do this. I prefer store a minimum of constants. But if
>>> >>> > it
>>> >>> > will
>>> >>> > be faster, then sure.
>>> >>> >
>>> >>> > Why won't calls to std::log10 be vectorized? No sse instructions?
>>> >>>
>>> >>> This is because std::log is not right away available as an SSE
>>> >>> intrinsic and we have a special version based on the code from here:
>>> >>> http://gruntthepeon.free.fr/ssemath/
>>> >>>
>>> >>> Nonetheless I think your patch is the right way to go. We could
>>> >>> achieve easily what Benoit suggests by offering a version of ei_plog10
>>> >>> such as
>>> >>>
>>> >>> Packet4f ei_plog10(Packet4f x)
>>> >>> {
>>> >>> const Packet4f ei_p4f_log10 = ei_plog(ei_pset1(10.0f));
>>> >>> return ei_pdiv(ei_plog(x), ei_p4f_log10);
>>> >>> }
>>> >>>
>>> >>> which uses "Packet4f ei_plog(Packet4f x)". The constant would probably
>>> >>> be taken care of by a decent compiler, i.e. I don't think we need
>>> >>> manual caching.
>>> >>>
>>> >>> > Also, is there any desire to have log10 defined your way in Eigen,
>>> >>> > for
>>> >>> > convenience? Or are you just suggesting I do this locally?
>>> >>>
>>> >>> I am pretty sure he meant that it would be nice to really have this in
>>> >>> Eigen and not just in a local copy... though I am wildly guessing.
>>> >>>
>>> >>> - Hauke
>>> >>>
>>> >>> p.s. I am pretty tired an hope that I picked the right conversion
>>> >>> factor
>>> >>> ...
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>>
>>
>>
>