Re: [eigen] log10 support?

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>:
> OK, I made the changes you suggested/requested. Non SSE is still calling
> std::log10. Is this what you wanted?

Yes, this is good.

But I have a couple of remarks:

+/** \internal \returns the log10 of \a a (coeff-wise) */
+template<typename Scalar, typename Packet> inline static Packet
ei_plog10(Packet a)
+{  return  ei_pmul(ei_pset1(Scalar(1)/log(Scalar(10))), ei_plog(a)); }
+

This function should take only the template parameter Packet. Indeed
it's going to be called as

    ei_plog10(somepacket)

and automatic template param determination will only work if it only
takes a Packet templ param. That means that you need to be able to
determine the Scalar type from the Packet type. For this, do:

    typedef typename ei_unpacket_traits<Packet>::type Scalar;

also, the caching of the constant value needs to go here. That may
mean that you have to do it yourself, without the macro. Fortunately,
it's easy: this macro was doing:

#define _EIGEN_DECLARE_CONST_Packet4f(NAME,X) \
  const Packet4f ei_p4f_##NAME = ei_pset1<float>(X)

so just do yourself:

  const Packet myconstant = ei_pset1(blabla);

(again, if blabla has type Scalar, ei_pset1<Scalar> will be selected.)

Then, this:

+static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x)
+{
+  _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, 4.342944819032518721E-1f);
+  return  ei_pmul(ei_p4f_invlog_0p1,  ei_plog(x));
+}
+

is not needed. My point was that this can all be done in a totally
generic way, not specific to SSE, not specific to float. You don't
have to change any file under SSE/. Your log10 function will be
vectorized on all platforms and for all types where we vectorize log.
That matters, because soon enough we'll probably vectorize it on other
vectorization platforms such as NEON!

The rest looks fine to me.

>
> I removed the HasLog10 and added the cost of 1 multiply to the cost in
> Fuctors.h to reflect the multipy. Is this accurate?

Oh yes it is accurate... more accurate than was needed! These costs
only need to be very rough. One could then discuss whether in the
non-vectorized case this is accurate since log10 could be implemented
to be as fast as log (with a custom approximant). But your formula is
at least accurate in the vectorized case :)

>
> Thanks for holding my hand on this.

Thanks for the good work,
Benoit

>
> -Trevor
>
>
> On 7 June 2010 18:06, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>
>> 2010/6/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> > ok so time for another answer :)
>> >
>> > The only thing that I really don't want is for us to add
>> > SSE/AltiVec/whatever code for log10. We must reuse the existing code
>> > for log there.
>> >
>> > Then there are two ways:
>> > a) the way of your patch: add a new custom functor for log10
>> > b) the way i suggested: just let log10 return a product expression.
>> >
>> > Your way a) has the advantage of sparing one multiplication (at least
>> > in the non vectorized case), but b) means less extra code.
>> >
>> > Now Hauke raises a good point, that I overlooked: even with a) you can
>> > still reuse our vectorization code for log10. So actually, I'm OK with
>> > your approach, provided that we do that. You can do it generically for
>> > all types at once (Hauke showed the way for float but there's no
>> > reason to special-case it): edit Core/GenericPacketMath.h and
>> > implement ei_plog10 there like the other functions, returning
>> >  ei_pmul(
>> >    ei_pset1(log(Scalar(0.1))),
>>
>> woooooops!!
>>
>> replace log(Scalar(0.1)) by Scalar(1) / log(Scalar(10))
>>
>> and sorry Hauke, I saw too late that you already wrote that. Though I
>> guess there's value in insisting that it's faster to multiply by a
>> constant, than to divide by a constant.
>>
>> Benoit
>>
>> >    ei_plog(x)
>> >  )
>> > The next problem is to cache the result of ei_pset1(log(Scalar(0.1))),
>> > probably a static constant, maybe have a look at what's done in
>> > ei_plog itself in SSE/. If that sounds obscure, leave that for later.
>> >
>> > In any case, we don't need a new HasLog10. Since we'll be using
>> > ei_plog, all we need is the existing HasLog.
>> >
>> > Benoit
>> >
>> > 2010/6/7 Trevor Irons <trevorirons@xxxxxxxxx>:
>> >> OK, thanks. I'm on a really old machine and can't check sse here. I'll
>> >> test
>> >> your code and generate another patch. Unless I hear that is is
>> >> unwanted.
>> >>
>> >> (This just goes in eigenNes/Eigen/src/Core/arch/SSE/MathFunctions.h)
>> >> right?
>> >>
>> >> -Trevor
>> >>
>> >> On 7 June 2010 15:46, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx> wrote:
>> >>>
>> >>> Let me try to answer for Benoit. :)
>> >>>
>> >>> On Mon, Jun 7, 2010 at 11:20 PM, Trevor Irons <trevorirons@xxxxxxxxx>
>> >>> wrote:
>> >>> > Sure, I can do this. I prefer store a minimum of constants. But if
>> >>> > it
>> >>> > will
>> >>> > be faster, then sure.
>> >>> >
>> >>> > Why won't calls to std::log10 be vectorized? No sse instructions?
>> >>>
>> >>> This is because std::log is not right away available as an SSE
>> >>> intrinsic and we have a special version based on the code from here:
>> >>> http://gruntthepeon.free.fr/ssemath/
>> >>>
>> >>> Nonetheless I think your patch is the right way to go. We could
>> >>> achieve easily what Benoit suggests by offering a version of ei_plog10
>> >>> such as
>> >>>
>> >>> Packet4f ei_plog10(Packet4f x)
>> >>> {
>> >>>  const Packet4f ei_p4f_log10 = ei_plog(ei_pset1(10.0f));
>> >>>  return ei_pdiv(ei_plog(x), ei_p4f_log10);
>> >>> }
>> >>>
>> >>> which uses "Packet4f ei_plog(Packet4f x)". The constant would probably
>> >>> be taken care of by a decent compiler, i.e. I don't think we need
>> >>> manual caching.
>> >>>
>> >>> > Also, is there any desire to have log10 defined your way in Eigen,
>> >>> > for
>> >>> > convenience? Or are you just suggesting I do this locally?
>> >>>
>> >>> I am pretty sure he meant that it would be nice to really have this in
>> >>> Eigen and not just in a local copy... though I am wildly guessing.
>> >>>
>> >>> - Hauke
>> >>>
>> >>> p.s. I am pretty tired an hope that I picked the right conversion
>> >>> factor
>> >>> ...
>> >>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/