Re: [eigen] log10 support? |

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

*To*: eigen@xxxxxxxxxxxxxxxxxxx*Subject*: Re: [eigen] log10 support?*From*: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>*Date*: Tue, 15 Jun 2010 08:31:41 -0400*Dkim-signature*: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=xqDraokJsVLJ6pBa4IcDD0ZyN2U40TA5AAq40C7jLkg=; b=h1+jni0ymalDC/tS81DXq/C+nXwHStiNrJJzyjroYluoN9zOcPdvmJwG69cyGS/3MV 1qgLoxYj24PQIabUgrI5A8Ly9UpLvDqkQSjb4AhtudiItbAV218h1t7hXt+IikNQJr6j PGlqQOMDPwV0TgGQMqhHVqUxX5qVrCQekT6d0=*Domainkey-signature*: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=NsljQC3yUSKyVzR+IJ+D+Hp2lMdBJw1+yNecpKAOQH3w9TyTnKv7d6aZMsUlWFAlzv ircgT4GNMuC4RGQYhOZecZYLhR1LmyP7aOw8pHFBdOx6nMcZ+nDk6/AJ6GRm+IOumN7Y RGkQq0ycqfyOJPfsryiKe1W+i3tnc5xZ3M6ZM=

Hi, Sorry for the delay replying: I totally missed your reply in the flow of all what just happened here! (2.0.13 etc). Sorry. 2010/6/9 Trevor Irons <trevorirons@xxxxxxxxx>: > So, i'm compiling on g++ 4.5. According to this > > http://gcc.gnu.org/gcc-4.3/porting_to.html > http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#605 > > Declarations like this > > template<> > static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f > ei_plog<Packet4f>(const Packet4f& x) > > Will result in a compiler error: > eigen/Eigen/src/Core/arch/SSE/MathFunctions.h:35:1: error: explicit template > specialization cannot have a storage class > > Which is what I'm seeing. Just changing to call by reference did not in > itself fix the problem. I just committed such changes. I met this error too, the solution was to remove the 'static' from the template specializations. Hopefully you are un-stuck now? Sorry again for the delay. Benoit > > What I don't understand is that within arch/SSE/MathFunctions.h in the > specialized case I had before: > > static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x) > { > _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, 4.342944819032518721E-1f); > return ei_pmul(ei_p4f_invlog_0p1, ei_plog(x)); > } > > Calling ei_plog isn't a problem. In order to call this, I do recall having > to move ei_plog10 to being after the Packet4f implimentation of plog. Which > got me to thinking, dangerous. > > The only way I can get ei_plog10 to compile (and work, but I havgen't cached > the constant) is forward declaring the packet4f specialization? > > So this compiles and gives a correct answer, and is calling the specialized > ei_log function in SSE. But it just seems wrong. > > // in GenericPacketMath.h > /** \internal \returns the log10 of \a a (coeff-wise) */ > // TI, forward declare... shooting from the hip > typedef __m128 Packet4f; > static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog(const Packet4f& xi); > template<typename Packet> inline static Packet ei_plog10(Packet a) > { > typedef typename ei_unpacket_traits<Packet>::type Scalar; > return ei_pmul(ei_pset1(Scalar(1)/ei_log(Scalar(10))), ei_plog(a)); > } > > Any suggestions on how to avoid this forward declaration like this? Or is > this OK and necessary? I'm sure its obvious that this is pushing my > knowledge of C++ pretty hard. > > Thanks so much for the lesson :) > > -Trevor. > > On 8 June 2010 20:25, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote: >> >> 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>: >> > Thanks, but I seem to be incapable here. >> > >> > When I get rid of the float case I can't compile the most basic generic >> > implimentation. >> > >> > For example: >> > template<typename Packet> >> > inline static Packet ei_plog10(Packet a) >> > { >> > //typedef typename ei_unpacket_traits<Packet>::type Scalar; >> > //return ei_pmul(ei_pset1(Scalar(10)/ei_log(Scalar(10))), >> > ei_plog(a)); >> >> Note that the first 10 above should be 1 :) Of course it's not what is >> causing your current problems. >> >> > return ei_plog(a); // wrong, testing >> > } >> > >> > Doesn't compile because ei_plog(a) is calling std::log it seems. >> > gcc error: >> > /eigen/Eigen/src/Core/MathFunctions.h:573:22: error: no matching >> > function >> > for call to ‘log(const __vector(4) float&)’ >> >> OK so here's some explanation. Your code above is (rightly) calling >> ei_plog(packet). See in GenericPacketMath.h the generic definition of >> that function: >> >> /** \internal \returns the log of \a a (coeff-wise) */ >> template<typename Packet> inline static Packet ei_plog(Packet a) { >> return ei_log(a); } >> >> So the generic version is to just call ei_log, our wrapper for the >> standard std::log function that only makes sense for scalars, not for >> packets. >> >> Thus, given, that it actually calls this function, it's not surprising >> that it's failing to compile. So are we crazy to have suggested you to >> write this code? No!! Because, ta-daaaah, we are overriding the >> generic ei_plog by a vectorized one in arch/SSE/MathFunctions.h. (We >> are only overriding it for the Scalar type 'float', not other types >> such as 'double', and we only do that on SSE, not other platforms. To >> signify that we have a working ei_plog for packets, we have the HasLog >> enum in ei_packet_traits<T>). >> >> Your compile error means that the compiler fails to find our ei_plog >> specialization in arch/SSE/. Here's how this function is defined >> there: >> >> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog(Packet4f x) >> { >> >> What strikes me there is that it's not defined as a template >> specialization of the ei_plog template. The second thing that strikes >> me is that, while the functions in GenericPacketMath.h take const >> Packet& arguments, this one takes a Packet argument by value. I think >> that both of these things need to be fixed (and can easily explain the >> trouble you're having). So, in arch/SSE/MathFunctions.h, can you try >> replacing the above line by >> >> template<> >> static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f >> ei_plog<Packet4f>(const Packet4f& x) >> { >> >> If that works, please apply the same therapy to the other functions in >> that file (I'd like us to do that anyway unless someone objects?) >> >> Otherwise please post your current patch >> >> Cheers, >> Benoit >> >> > >> > I also tried >> > return ei_log(a), but still no dice, similiar output from compiler. >> > >> > However, calling pmul is fine. (Just trying to understand) >> > // Just for testing >> > template<typename Packet> >> > inline static Packet ei_plog10(Packet a) >> > { >> > return ei_pmul(a, a); // This compiles just fine and gives 'correct' >> > answers >> > } >> > >> > What am I doing wrong here? How is ei_plog different from ei_pmul that >> > one >> > is callable and one isn't? >> > >> > Thanks for any insight. Sorry to be wasting all your time with such a >> > simple >> > thing. >> > >> > -Trevor >> > >> > On 8 June 2010 06:03, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote: >> >> >> >> 2010/6/8 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>: >> >> > 2010/6/8 Trevor Irons <trevorirons@xxxxxxxxx>: >> >> >> OK, I made the changes you suggested/requested. Non SSE is still >> >> >> calling >> >> >> std::log10. Is this what you wanted? >> >> > >> >> > Yes, this is good. >> >> > >> >> > But I have a couple of remarks: >> >> > >> >> > +/** \internal \returns the log10 of \a a (coeff-wise) */ >> >> > +template<typename Scalar, typename Packet> inline static Packet >> >> > ei_plog10(Packet a) >> >> > +{ return ei_pmul(ei_pset1(Scalar(1)/log(Scalar(10))), ei_plog(a)); >> >> > } >> >> > + >> >> > >> >> > This function should take only the template parameter Packet. Indeed >> >> > it's going to be called as >> >> > >> >> > ei_plog10(somepacket) >> >> > >> >> > and automatic template param determination will only work if it only >> >> > takes a Packet templ param. That means that you need to be able to >> >> > determine the Scalar type from the Packet type. For this, do: >> >> > >> >> > typedef typename ei_unpacket_traits<Packet>::type Scalar; >> >> > >> >> > also, the caching of the constant value needs to go here. That may >> >> > mean that you have to do it yourself, without the macro. Fortunately, >> >> > it's easy: this macro was doing: >> >> > >> >> > #define _EIGEN_DECLARE_CONST_Packet4f(NAME,X) \ >> >> > const Packet4f ei_p4f_##NAME = ei_pset1<float>(X) >> >> >> >> Question, especially to Gael: do you confirm that this is enough to >> >> get caching? Naively I would have thought that 'static' is needed >> >> there. But with compiler optimizations, one never knows! >> >> >> >> Benoit >> >> >> >> > >> >> > so just do yourself: >> >> > >> >> > const Packet myconstant = ei_pset1(blabla); >> >> > >> >> > (again, if blabla has type Scalar, ei_pset1<Scalar> will be >> >> > selected.) >> >> > >> >> > Then, this: >> >> > >> >> > +static EIGEN_DONT_INLINE EIGEN_UNUSED Packet4f ei_plog10(Packet4f x) >> >> > +{ >> >> > + _EIGEN_DECLARE_CONST_Packet4f( invlog_0p1, >> >> > 4.342944819032518721E-1f); >> >> > + return ei_pmul(ei_p4f_invlog_0p1, ei_plog(x)); >> >> > +} >> >> > + >> >> > >> >> > is not needed. My point was that this can all be done in a totally >> >> > generic way, not specific to SSE, not specific to float. You don't >> >> > have to change any file under SSE/. Your log10 function will be >> >> > vectorized on all platforms and for all types where we vectorize log. >> >> > That matters, because soon enough we'll probably vectorize it on >> >> > other >> >> > vectorization platforms such as NEON! >> >> > >> >> > The rest looks fine to me. >> >> > >> >> >> >> >> >> I removed the HasLog10 and added the cost of 1 multiply to the cost >> >> >> in >> >> >> Fuctors.h to reflect the multipy. Is this accurate? >> >> > >> >> > Oh yes it is accurate... more accurate than was needed! These costs >> >> > only need to be very rough. One could then discuss whether in the >> >> > non-vectorized case this is accurate since log10 could be implemented >> >> > to be as fast as log (with a custom approximant). But your formula is >> >> > at least accurate in the vectorized case :) >> >> > >> >> >> >> >> >> Thanks for holding my hand on this. >> >> > >> >> > Thanks for the good work, >> >> > Benoit >> >> > >> >> >> >> >> >> -Trevor >> >> >> >> >> >> >> >> >> On 7 June 2010 18:06, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote: >> >> >>> >> >> >>> 2010/6/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>: >> >> >>> > ok so time for another answer :) >> >> >>> > >> >> >>> > The only thing that I really don't want is for us to add >> >> >>> > SSE/AltiVec/whatever code for log10. We must reuse the existing >> >> >>> > code >> >> >>> > for log there. >> >> >>> > >> >> >>> > Then there are two ways: >> >> >>> > a) the way of your patch: add a new custom functor for log10 >> >> >>> > b) the way i suggested: just let log10 return a product >> >> >>> > expression. >> >> >>> > >> >> >>> > Your way a) has the advantage of sparing one multiplication (at >> >> >>> > least >> >> >>> > in the non vectorized case), but b) means less extra code. >> >> >>> > >> >> >>> > Now Hauke raises a good point, that I overlooked: even with a) >> >> >>> > you >> >> >>> > can >> >> >>> > still reuse our vectorization code for log10. So actually, I'm OK >> >> >>> > with >> >> >>> > your approach, provided that we do that. You can do it >> >> >>> > generically >> >> >>> > for >> >> >>> > all types at once (Hauke showed the way for float but there's no >> >> >>> > reason to special-case it): edit Core/GenericPacketMath.h and >> >> >>> > implement ei_plog10 there like the other functions, returning >> >> >>> > ei_pmul( >> >> >>> > ei_pset1(log(Scalar(0.1))), >> >> >>> >> >> >>> woooooops!! >> >> >>> >> >> >>> replace log(Scalar(0.1)) by Scalar(1) / log(Scalar(10)) >> >> >>> >> >> >>> and sorry Hauke, I saw too late that you already wrote that. Though >> >> >>> I >> >> >>> guess there's value in insisting that it's faster to multiply by a >> >> >>> constant, than to divide by a constant. >> >> >>> >> >> >>> Benoit >> >> >>> >> >> >>> > ei_plog(x) >> >> >>> > ) >> >> >>> > The next problem is to cache the result of >> >> >>> > ei_pset1(log(Scalar(0.1))), >> >> >>> > probably a static constant, maybe have a look at what's done in >> >> >>> > ei_plog itself in SSE/. If that sounds obscure, leave that for >> >> >>> > later. >> >> >>> > >> >> >>> > In any case, we don't need a new HasLog10. Since we'll be using >> >> >>> > ei_plog, all we need is the existing HasLog. >> >> >>> > >> >> >>> > Benoit >> >> >>> > >> >> >>> > 2010/6/7 Trevor Irons <trevorirons@xxxxxxxxx>: >> >> >>> >> OK, thanks. I'm on a really old machine and can't check sse >> >> >>> >> here. >> >> >>> >> I'll >> >> >>> >> test >> >> >>> >> your code and generate another patch. Unless I hear that is is >> >> >>> >> unwanted. >> >> >>> >> >> >> >>> >> (This just goes in >> >> >>> >> eigenNes/Eigen/src/Core/arch/SSE/MathFunctions.h) >> >> >>> >> right? >> >> >>> >> >> >> >>> >> -Trevor >> >> >>> >> >> >> >>> >> On 7 June 2010 15:46, Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx> >> >> >>> >> wrote: >> >> >>> >>> >> >> >>> >>> Let me try to answer for Benoit. :) >> >> >>> >>> >> >> >>> >>> On Mon, Jun 7, 2010 at 11:20 PM, Trevor Irons >> >> >>> >>> <trevorirons@xxxxxxxxx> >> >> >>> >>> wrote: >> >> >>> >>> > Sure, I can do this. I prefer store a minimum of constants. >> >> >>> >>> > But >> >> >>> >>> > if >> >> >>> >>> > it >> >> >>> >>> > will >> >> >>> >>> > be faster, then sure. >> >> >>> >>> > >> >> >>> >>> > Why won't calls to std::log10 be vectorized? No sse >> >> >>> >>> > instructions? >> >> >>> >>> >> >> >>> >>> This is because std::log is not right away available as an SSE >> >> >>> >>> intrinsic and we have a special version based on the code from >> >> >>> >>> here: >> >> >>> >>> http://gruntthepeon.free.fr/ssemath/ >> >> >>> >>> >> >> >>> >>> Nonetheless I think your patch is the right way to go. We could >> >> >>> >>> achieve easily what Benoit suggests by offering a version of >> >> >>> >>> ei_plog10 >> >> >>> >>> such as >> >> >>> >>> >> >> >>> >>> Packet4f ei_plog10(Packet4f x) >> >> >>> >>> { >> >> >>> >>> const Packet4f ei_p4f_log10 = ei_plog(ei_pset1(10.0f)); >> >> >>> >>> return ei_pdiv(ei_plog(x), ei_p4f_log10); >> >> >>> >>> } >> >> >>> >>> >> >> >>> >>> which uses "Packet4f ei_plog(Packet4f x)". The constant would >> >> >>> >>> probably >> >> >>> >>> be taken care of by a decent compiler, i.e. I don't think we >> >> >>> >>> need >> >> >>> >>> manual caching. >> >> >>> >>> >> >> >>> >>> > Also, is there any desire to have log10 defined your way in >> >> >>> >>> > Eigen, >> >> >>> >>> > for >> >> >>> >>> > convenience? Or are you just suggesting I do this locally? >> >> >>> >>> >> >> >>> >>> I am pretty sure he meant that it would be nice to really have >> >> >>> >>> this in >> >> >>> >>> Eigen and not just in a local copy... though I am wildly >> >> >>> >>> guessing. >> >> >>> >>> >> >> >>> >>> - Hauke >> >> >>> >>> >> >> >>> >>> p.s. I am pretty tired an hope that I picked the right >> >> >>> >>> conversion >> >> >>> >>> factor >> >> >>> >>> ... >> >> >>> >>> >> >> >>> >>> >> >> >>> >> >> >> >>> >> >> >> >>> > >> >> >>> >> >> >>> >> >> >> >> >> >> >> >> > >> >> >> >> >> > >> > >> >> > >

**References**:**[eigen] log10 support?***From:*Trevor Irons

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Trevor Irons

**Re: [eigen] log10 support?***From:*Hauke Heibel

**Re: [eigen] log10 support?***From:*Trevor Irons

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Trevor Irons

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Trevor Irons

**Re: [eigen] log10 support?***From:*Benoit Jacob

**Re: [eigen] log10 support?***From:*Trevor Irons

**Messages sorted by:**[ date | thread ]- Prev by Date:
**Re: [eigen] non-linear optimization test summary** - Next by Date:
**Re: [eigen] A modest start with the documentation** - Previous by thread:
**Re: [eigen] log10 support?** - Next by thread:
**[eigen] integer types - something's odd...**

Mail converted by MHonArc 2.6.19+ | http://listengine.tuxfamily.org/ |