Re: [eigen] log10 support?

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Hi,
Sorry for the delay replying: I totally missed your reply in the flow
of all what just happened here! (2.0.13 etc). Sorry.


2010/6/9 Trevor Irons <trevorirons@xxxxxxxxx>:
> So, i'm compiling on g++ 4.5. According to this
>
> http://gcc.gnu.org/gcc-4.3/porting_to.html
> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#605
>
> Declarations like this
>
> template<>
> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f
> ei_plog<Packet4f>(const Packet4f& x)
>
> Will result in a compiler error:
> eigen/Eigen/src/Core/arch/SSE/MathFunctions.h:35:1: error: explicit template
> specialization cannot have a storage class
>
> Which is what I'm seeing. Just changing to call by reference did not in
> itself fix the problem.


I just committed such changes. I met this error too, the solution was
to remove the 'static' from the template specializations.

Hopefully you are un-stuck now? Sorry again for the delay.

Benoit


>
> What I don't understand is that within arch/SSE/MathFunctions.h in the
> specialized case I had before:
>
> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x)
> {
>    _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, 4.342944819032518721E-1f);
>    return  ei_pmul(ei_p4f_invlog_0p1,  ei_plog(x));
> }
>
> Calling ei_plog isn't a problem. In order to call this, I do recall having
> to move ei_plog10 to being after the Packet4f implimentation of plog. Which
> got me to thinking, dangerous.
>
> The only way I can get ei_plog10 to compile (and work, but I havgen't cached
> the constant) is forward declaring the packet4f specialization?
>
> So this compiles and gives a correct answer, and is calling the specialized
> ei_log function in SSE. But it just seems wrong.
>
> // in GenericPacketMath.h
> /** \internal \returns the log10 of \a a (coeff-wise) */
> // TI, forward declare... shooting from the hip
> typedef __m128  Packet4f;
> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog(const Packet4f& xi);
> template<typename Packet> inline static Packet ei_plog10(Packet a)
> {
>   typedef typename ei_unpacket_traits<Packet>::type Scalar;
>   return  ei_pmul(ei_pset1(Scalar(1)/ei_log(Scalar(10))), ei_plog(a));
> }
>
> Any suggestions on how to avoid this forward declaration like this? Or is
> this OK and necessary? I'm sure its obvious that this is pushing my
> knowledge of C++ pretty hard.
>
> Thanks so much for the lesson :)
>
> -Trevor.
>
> On 8 June 2010 20:25, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>
>> 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>:
>> > Thanks, but I seem to be incapable here.
>> >
>> > When I get rid of the float case I can't compile the most basic generic
>> > implimentation.
>> >
>> > For example:
>> > template<typename Packet>
>> > inline static Packet ei_plog10(Packet a)
>> > {
>> >   //typedef typename ei_unpacket_traits<Packet>::type Scalar;
>> >   //return  ei_pmul(ei_pset1(Scalar(10)/ei_log(Scalar(10))),
>> > ei_plog(a));
>>
>> Note that the first 10 above should be 1 :) Of course it's not what is
>> causing your current problems.
>>
>> >   return ei_plog(a); // wrong, testing
>> > }
>> >
>> > Doesn't compile because ei_plog(a) is calling std::log it seems.
>> > gcc error:
>> > /eigen/Eigen/src/Core/MathFunctions.h:573:22: error: no matching
>> > function
>> > for call to ‘log(const __vector(4) float&)’
>>
>> OK so here's some explanation. Your code above is (rightly) calling
>> ei_plog(packet). See in GenericPacketMath.h the generic definition of
>> that function:
>>
>> /** \internal \returns the log of \a a (coeff-wise) */
>> template<typename Packet> inline static Packet ei_plog(Packet a) {
>> return ei_log(a); }
>>
>> So the generic version is to just call ei_log, our wrapper for the
>> standard std::log function that only makes sense for scalars, not for
>> packets.
>>
>> Thus, given, that it actually calls this function, it's not surprising
>> that it's failing to compile. So are we crazy to have suggested you to
>> write this code? No!! Because, ta-daaaah, we are overriding the
>> generic ei_plog by a vectorized one in arch/SSE/MathFunctions.h. (We
>> are only overriding it for the Scalar type 'float', not other types
>> such as 'double', and we only do that on SSE, not other platforms. To
>> signify that we have a working ei_plog for packets, we have the HasLog
>> enum in ei_packet_traits<T>).
>>
>> Your compile error means that the compiler fails to find our ei_plog
>> specialization in arch/SSE/. Here's how this function is defined
>> there:
>>
>> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog(Packet4f x)
>> {
>>
>> What strikes me there is that it's not defined as a template
>> specialization of the ei_plog template. The second thing that strikes
>> me is that, while the functions in GenericPacketMath.h take const
>> Packet& arguments, this one takes a Packet argument by value. I think
>> that both of these things need to be fixed (and can easily explain the
>> trouble you're having). So, in arch/SSE/MathFunctions.h, can you try
>> replacing the above line by
>>
>> template<>
>> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f
>> ei_plog<Packet4f>(const Packet4f& x)
>> {
>>
>> If that works, please apply the same therapy to the other functions in
>> that file (I'd like us to do that anyway unless someone objects?)
>>
>> Otherwise please post your current patch
>>
>> Cheers,
>> Benoit
>>
>> >
>> > I also tried
>> > return ei_log(a), but still no dice, similiar output from compiler.
>> >
>> > However, calling pmul is fine. (Just trying to understand)
>> > // Just for testing
>> > template<typename Packet>
>> > inline static Packet ei_plog10(Packet a)
>> > {
>> >   return ei_pmul(a, a);   // This compiles just fine and gives 'correct'
>> > answers
>> > }
>> >
>> > What am I doing wrong here? How is ei_plog different from ei_pmul that
>> > one
>> > is callable and one isn't?
>> >
>> > Thanks for any insight. Sorry to be wasting all your time with such a
>> > simple
>> > thing.
>> >
>> > -Trevor
>> >
>> > On 8 June 2010 06:03, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> >>
>> >> 2010/6/8 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> >> > 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>:
>> >> >> OK, I made the changes you suggested/requested. Non SSE is still
>> >> >> calling
>> >> >> std::log10. Is this what you wanted?
>> >> >
>> >> > Yes, this is good.
>> >> >
>> >> > But I have a couple of remarks:
>> >> >
>> >> > +/** \internal \returns the log10 of \a a (coeff-wise) */
>> >> > +template<typename Scalar, typename Packet> inline static Packet
>> >> > ei_plog10(Packet a)
>> >> > +{  return  ei_pmul(ei_pset1(Scalar(1)/log(Scalar(10))), ei_plog(a));
>> >> > }
>> >> > +
>> >> >
>> >> > This function should take only the template parameter Packet. Indeed
>> >> > it's going to be called as
>> >> >
>> >> >    ei_plog10(somepacket)
>> >> >
>> >> > and automatic template param determination will only work if it only
>> >> > takes a Packet templ param. That means that you need to be able to
>> >> > determine the Scalar type from the Packet type. For this, do:
>> >> >
>> >> >    typedef typename ei_unpacket_traits<Packet>::type Scalar;
>> >> >
>> >> > also, the caching of the constant value needs to go here. That may
>> >> > mean that you have to do it yourself, without the macro. Fortunately,
>> >> > it's easy: this macro was doing:
>> >> >
>> >> > #define _EIGEN_DECLARE_CONST_Packet4f(NAME,X) \
>> >> >  const Packet4f ei_p4f_##NAME = ei_pset1<float>(X)
>> >>
>> >> Question, especially to Gael: do you confirm that this is enough to
>> >> get caching? Naively I would have thought that 'static' is needed
>> >> there. But with compiler optimizations, one never knows!
>> >>
>> >> Benoit
>> >>
>> >> >
>> >> > so just do yourself:
>> >> >
>> >> >  const Packet myconstant = ei_pset1(blabla);
>> >> >
>> >> > (again, if blabla has type Scalar, ei_pset1<Scalar> will be
>> >> > selected.)
>> >> >
>> >> > Then, this:
>> >> >
>> >> > +static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x)
>> >> > +{
>> >> > +  _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1,
>> >> > 4.342944819032518721E-1f);
>> >> > +  return  ei_pmul(ei_p4f_invlog_0p1,  ei_plog(x));
>> >> > +}
>> >> > +
>> >> >
>> >> > is not needed. My point was that this can all be done in a totally
>> >> > generic way, not specific to SSE, not specific to float. You don't
>> >> > have to change any file under SSE/. Your log10 function will be
>> >> > vectorized on all platforms and for all types where we vectorize log.
>> >> > That matters, because soon enough we'll probably vectorize it on
>> >> > other
>> >> > vectorization platforms such as NEON!
>> >> >
>> >> > The rest looks fine to me.
>> >> >
>> >> >>
>> >> >> I removed the HasLog10 and added the cost of 1 multiply to the cost
>> >> >> in
>> >> >> Fuctors.h to reflect the multipy. Is this accurate?
>> >> >
>> >> > Oh yes it is accurate... more accurate than was needed! These costs
>> >> > only need to be very rough. One could then discuss whether in the
>> >> > non-vectorized case this is accurate since log10 could be implemented
>> >> > to be as fast as log (with a custom approximant). But your formula is
>> >> > at least accurate in the vectorized case :)
>> >> >
>> >> >>
>> >> >> Thanks for holding my hand on this.
>> >> >
>> >> > Thanks for the good work,
>> >> > Benoit
>> >> >
>> >> >>
>> >> >> -Trevor
>> >> >>
>> >> >>
>> >> >> On 7 June 2010 18:06, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> >> >>>
>> >> >>> 2010/6/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> >> >>> > ok so time for another answer :)
>> >> >>> >
>> >> >>> > The only thing that I really don't want is for us to add
>> >> >>> > SSE/AltiVec/whatever code for log10. We must reuse the existing
>> >> >>> > code
>> >> >>> > for log there.
>> >> >>> >
>> >> >>> > Then there are two ways:
>> >> >>> > a) the way of your patch: add a new custom functor for log10
>> >> >>> > b) the way i suggested: just let log10 return a product
>> >> >>> > expression.
>> >> >>> >
>> >> >>> > Your way a) has the advantage of sparing one multiplication (at
>> >> >>> > least
>> >> >>> > in the non vectorized case), but b) means less extra code.
>> >> >>> >
>> >> >>> > Now Hauke raises a good point, that I overlooked: even with a)
>> >> >>> > you
>> >> >>> > can
>> >> >>> > still reuse our vectorization code for log10. So actually, I'm OK
>> >> >>> > with
>> >> >>> > your approach, provided that we do that. You can do it
>> >> >>> > generically
>> >> >>> > for
>> >> >>> > all types at once (Hauke showed the way for float but there's no
>> >> >>> > reason to special-case it): edit Core/GenericPacketMath.h and
>> >> >>> > implement ei_plog10 there like the other functions, returning
>> >> >>> >  ei_pmul(
>> >> >>> >    ei_pset1(log(Scalar(0.1))),
>> >> >>>
>> >> >>> woooooops!!
>> >> >>>
>> >> >>> replace log(Scalar(0.1)) by Scalar(1) / log(Scalar(10))
>> >> >>>
>> >> >>> and sorry Hauke, I saw too late that you already wrote that. Though
>> >> >>> I
>> >> >>> guess there's value in insisting that it's faster to multiply by a
>> >> >>> constant, than to divide by a constant.
>> >> >>>
>> >> >>> Benoit
>> >> >>>
>> >> >>> >    ei_plog(x)
>> >> >>> >  )
>> >> >>> > The next problem is to cache the result of
>> >> >>> > ei_pset1(log(Scalar(0.1))),
>> >> >>> > probably a static constant, maybe have a look at what's done in
>> >> >>> > ei_plog itself in SSE/. If that sounds obscure, leave that for
>> >> >>> > later.
>> >> >>> >
>> >> >>> > In any case, we don't need a new HasLog10. Since we'll be using
>> >> >>> > ei_plog, all we need is the existing HasLog.
>> >> >>> >
>> >> >>> > Benoit
>> >> >>> >
>> >> >>> > 2010/6/7 Trevor Irons <trevorirons@xxxxxxxxx>:
>> >> >>> >> OK, thanks. I'm on a really old machine and can't check sse
>> >> >>> >> here.
>> >> >>> >> I'll
>> >> >>> >> test
>> >> >>> >> your code and generate another patch. Unless I hear that is is
>> >> >>> >> unwanted.
>> >> >>> >>
>> >> >>> >> (This just goes in
>> >> >>> >> eigenNes/Eigen/src/Core/arch/SSE/MathFunctions.h)
>> >> >>> >> right?
>> >> >>> >>
>> >> >>> >> -Trevor
>> >> >>> >>
>> >> >>> >> On 7 June 2010 15:46, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>
>> >> >>> >> wrote:
>> >> >>> >>>
>> >> >>> >>> Let me try to answer for Benoit. :)
>> >> >>> >>>
>> >> >>> >>> On Mon, Jun 7, 2010 at 11:20 PM, Trevor Irons
>> >> >>> >>> <trevorirons@xxxxxxxxx>
>> >> >>> >>> wrote:
>> >> >>> >>> > Sure, I can do this. I prefer store a minimum of constants.
>> >> >>> >>> > But
>> >> >>> >>> > if
>> >> >>> >>> > it
>> >> >>> >>> > will
>> >> >>> >>> > be faster, then sure.
>> >> >>> >>> >
>> >> >>> >>> > Why won't calls to std::log10 be vectorized? No sse
>> >> >>> >>> > instructions?
>> >> >>> >>>
>> >> >>> >>> This is because std::log is not right away available as an SSE
>> >> >>> >>> intrinsic and we have a special version based on the code from
>> >> >>> >>> here:
>> >> >>> >>> http://gruntthepeon.free.fr/ssemath/
>> >> >>> >>>
>> >> >>> >>> Nonetheless I think your patch is the right way to go. We could
>> >> >>> >>> achieve easily what Benoit suggests by offering a version of
>> >> >>> >>> ei_plog10
>> >> >>> >>> such as
>> >> >>> >>>
>> >> >>> >>> Packet4f ei_plog10(Packet4f x)
>> >> >>> >>> {
>> >> >>> >>>  const Packet4f ei_p4f_log10 = ei_plog(ei_pset1(10.0f));
>> >> >>> >>>  return ei_pdiv(ei_plog(x), ei_p4f_log10);
>> >> >>> >>> }
>> >> >>> >>>
>> >> >>> >>> which uses "Packet4f ei_plog(Packet4f x)". The constant would
>> >> >>> >>> probably
>> >> >>> >>> be taken care of by a decent compiler, i.e. I don't think we
>> >> >>> >>> need
>> >> >>> >>> manual caching.
>> >> >>> >>>
>> >> >>> >>> > Also, is there any desire to have log10 defined your way in
>> >> >>> >>> > Eigen,
>> >> >>> >>> > for
>> >> >>> >>> > convenience? Or are you just suggesting I do this locally?
>> >> >>> >>>
>> >> >>> >>> I am pretty sure he meant that it would be nice to really have
>> >> >>> >>> this in
>> >> >>> >>> Eigen and not just in a local copy... though I am wildly
>> >> >>> >>> guessing.
>> >> >>> >>>
>> >> >>> >>> - Hauke
>> >> >>> >>>
>> >> >>> >>> p.s. I am pretty tired an hope that I picked the right
>> >> >>> >>> conversion
>> >> >>> >>> factor
>> >> >>> >>> ...
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >>
>> >
>> >
>>
>>
>
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/