Re: [eigen] log10 support?

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Thanks, but I seem to be incapable here.

When I get rid of the float case I can't compile the most basic generic implimentation.

For example:
template<typename Packet>
inline static Packet ei_plog10(Packet a)       

  //typedef typename ei_unpacket_traits<Packet>::type Scalar;
  //return  ei_pmul(ei_pset1(Scalar(10)/ei_log(Scalar(10))), ei_plog(a));
  return ei_plog(a); // wrong, testing
}

Doesn't compile because ei_plog(a) is calling std::log it seems.
gcc error:
/eigen/Eigen/src/Core/MathFunctions.h:573:22: error: no matching function for call to ‘log(const __vector(4) float&)’

I also tried
return ei_log(a), but still no dice, similiar output from compiler.

However, calling pmul is fine. (Just trying to understand)
// Just for testing
template<typename Packet>
inline static Packet ei_plog10(Packet a)       

  return ei_pmul(a, a);   // This compiles just fine and gives 'correct' answers
}

What am I doing wrong here? How is ei_plog different from ei_pmul that one is callable and one isn't?

Thanks for any insight. Sorry to be wasting all your time with such a simple thing.

-Trevor

On 8 June 2010 06:03, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2010/6/8 Benoit Jacob <jacob..benoit.1@xxxxxxxxx>:
> 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>:
>> OK, I made the changes you suggested/requested. Non SSE is still calling
>> std::log10. Is this what you wanted?
>
> Yes, this is good.
>
> But I have a couple of remarks:
>
> +/** \internal \returns the log10 of \a a (coeff-wise) */
> +template<typename Scalar, typename Packet> inline static Packet
> ei_plog10(Packet a)
> +{  return  ei_pmul(ei_pset1(Scalar(1)/log(Scalar(10))), ei_plog(a)); }
> +
>
> This function should take only the template parameter Packet. Indeed
> it's going to be called as
>
>    ei_plog10(somepacket)
>
> and automatic template param determination will only work if it only
> takes a Packet templ param. That means that you need to be able to
> determine the Scalar type from the Packet type. For this, do:
>
>    typedef typename ei_unpacket_traits<Packet>::type Scalar;
>
> also, the caching of the constant value needs to go here. That may
> mean that you have to do it yourself, without the macro. Fortunately,
> it's easy: this macro was doing:
>
> #define _EIGEN_DECLARE_CONST_Packet4f(NAME,X) \
>  const Packet4f ei_p4f_##NAME = ei_pset1<float>(X)

Question, especially to Gael: do you confirm that this is enough to
get caching? Naively I would have thought that 'static' is needed
there. But with compiler optimizations, one never knows!

Benoit

>
> so just do yourself:
>
>  const Packet myconstant = ei_pset1(blabla);
>
> (again, if blabla has type Scalar, ei_pset1<Scalar> will be selected.)
>
> Then, this:
>
> +static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x)
> +{
> +  _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, 4.342944819032518721E-1f);
> +  return  ei_pmul(ei_p4f_invlog_0p1,  ei_plog(x));
> +}
> +
>
> is not needed. My point was that this can all be done in a totally
> generic way, not specific to SSE, not specific to float. You don't
> have to change any file under SSE/. Your log10 function will be
> vectorized on all platforms and for all types where we vectorize log.
> That matters, because soon enough we'll probably vectorize it on other
> vectorization platforms such as NEON!
>
> The rest looks fine to me.
>
>>
>> I removed the HasLog10 and added the cost of 1 multiply to the cost in
>> Fuctors.h to reflect the multipy. Is this accurate?
>
> Oh yes it is accurate... more accurate than was needed! These costs
> only need to be very rough. One could then discuss whether in the
> non-vectorized case this is accurate since log10 could be implemented
> to be as fast as log (with a custom approximant). But your formula is
> at least accurate in the vectorized case :)
>
>>
>> Thanks for holding my hand on this.
>
> Thanks for the good work,
> Benoit
>
>>
>> -Trevor
>>
>>
>> On 7 June 2010 18:06, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>
>>> 2010/6/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> > ok so time for another answer :)
>>> >
>>> > The only thing that I really don't want is for us to add
>>> > SSE/AltiVec/whatever code for log10. We must reuse the existing code
>>> > for log there.
>>> >
>>> > Then there are two ways:
>>> > a) the way of your patch: add a new custom functor for log10
>>> > b) the way i suggested: just let log10 return a product _expression_.
>>> >
>>> > Your way a) has the advantage of sparing one multiplication (at least
>>> > in the non vectorized case), but b) means less extra code..
>>> >
>>> > Now Hauke raises a good point, that I overlooked: even with a) you can
>>> > still reuse our vectorization code for log10. So actually, I'm OK with
>>> > your approach, provided that we do that. You can do it generically for
>>> > all types at once (Hauke showed the way for float but there's no
>>> > reason to special-case it): edit Core/GenericPacketMath.h and
>>> > implement ei_plog10 there like the other functions, returning
>>> >  ei_pmul(
>>> >    ei_pset1(log(Scalar(0.1))),
>>>
>>> woooooops!!
>>>
>>> replace log(Scalar(0.1)) by Scalar(1) / log(Scalar(10))
>>>
>>> and sorry Hauke, I saw too late that you already wrote that. Though I
>>> guess there's value in insisting that it's faster to multiply by a
>>> constant, than to divide by a constant.
>>>
>>> Benoit
>>>
>>> >    ei_plog(x)
>>> >  )
>>> > The next problem is to cache the result of ei_pset1(log(Scalar(0.1))),
>>> > probably a static constant, maybe have a look at what's done in
>>> > ei_plog itself in SSE/. If that sounds obscure, leave that for later.
>>> >
>>> > In any case, we don't need a new HasLog10. Since we'll be using
>>> > ei_plog, all we need is the existing HasLog.
>>> >
>>> > Benoit
>>> >
>>> > 2010/6/7 Trevor Irons <trevorirons@xxxxxxxxx>:
>>> >> OK, thanks. I'm on a really old machine and can't check sse here. I'll
>>> >> test
>>> >> your code and generate another patch. Unless I hear that is is
>>> >> unwanted.
>>> >>
>>> >> (This just goes in eigenNes/Eigen/src/Core/arch/SSE/MathFunctions.h)
>>> >> right?
>>> >>
>>> >> -Trevor
>>> >>
>>> >> On 7 June 2010 15:46, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx> wrote:
>>> >>>
>>> >>> Let me try to answer for Benoit. :)
>>> >>>
>>> >>> On Mon, Jun 7, 2010 at 11:20 PM, Trevor Irons <trevorirons@xxxxxxxxx>
>>> >>> wrote:
>>> >>> > Sure, I can do this. I prefer store a minimum of constants. But if
>>> >>> > it
>>> >>> > will
>>> >>> > be faster, then sure.
>>> >>> >
>>> >>> > Why won't calls to std::log10 be vectorized? No sse instructions?
>>> >>>
>>> >>> This is because std::log is not right away available as an SSE
>>> >>> intrinsic and we have a special version based on the code from here:
>>> >>> http://gruntthepeon.free.fr/ssemath/
>>> >>>
>>> >>> Nonetheless I think your patch is the right way to go. We could
>>> >>> achieve easily what Benoit suggests by offering a version of ei_plog10
>>> >>> such as
>>> >>>
>>> >>> Packet4f ei_plog10(Packet4f x)
>>> >>> {
>>> >>>  const Packet4f ei_p4f_log10 = ei_plog(ei_pset1(10.0f));
>>> >>>  return ei_pdiv(ei_plog(x), ei_p4f_log10);
>>> >>> }
>>> >>>
>>> >>> which uses "Packet4f ei_plog(Packet4f x)". The constant would probably
>>> >>> be taken care of by a decent compiler, i.e. I don't think we need
>>> >>> manual caching.
>>> >>>
>>> >>> > Also, is there any desire to have log10 defined your way in Eigen,
>>> >>> > for
>>> >>> > convenience? Or are you just suggesting I do this locally?
>>> >>>
>>> >>> I am pretty sure he meant that it would be nice to really have this in
>>> >>> Eigen and not just in a local copy... though I am wildly guessing.
>>> >>>
>>> >>> - Hauke
>>> >>>
>>> >>> p.s. I am pretty tired an hope that I picked the right conversion
>>> >>> factor
>>> >>> ...
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>>
>>
>>
>





Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/