Re: [eigen] A complex FFT for Eigen |

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

*To*: eigen@xxxxxxxxxxxxxxxxxxx*Subject*: Re: [eigen] A complex FFT for Eigen*From*: "Benoit Jacob" <jacob.benoit.1@xxxxxxxxx>*Date*: Tue, 2 Dec 2008 17:25:19 +0100*Dkim-signature*: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:references; bh=a65C8R/epqsFRyOpI8jpD1QF36s/QyTsV+WC+mPdcbU=; b=lVkPMpTvCEtGgFzXsHf7Y/oofgOntJ++3+q3xLx6Zolm6cChcV9kf4fqSF8qcVO5H7 vbjuPmsWrHczGCkH0fck2zfjnhGnjigtihiAvEj4fMGYtorQMdgZoJkoqFIm3JibVMPu JgBnPTn2kHZKStn/O52IrD74q+K0MI6yhN+NM=*Domainkey-signature*: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=SvtWyKrsxCb9LpwftOluP5arM0kdMzZkajTZVcOAmbjI7lvzSmPMZK/L5k2OdE9JTD 63EoRejvUnu54varooPylHF2EDWvGsI9vDn21WbK/kSq+Rc8Gh7do2Zo/G1+DIAMPG4W EObFPwxgRIo16LWZ6XvjcZ+rVrVRFsDGRi7Xc=

It seems to me that the optimal solution is to patch <complex> to vectorize std::complex<double>. What do you think? I could live of the idea of saying "yeah i've submitted a patch or two to GCC" in a party. Another option (basically, if they don't accept our patches) is to have our own <complex> having the same name. If the user hasn't already #included<complex> (as we can tell from the preprocessor symbol guarding it) then we include our own instead, and we define all possible preprocessor symbols to make sure that subsequent inclusion of <complex> by the user is blocked. Cheers, Benoit 2008/12/2 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>: > > On Tue, Dec 2, 2008 at 4:51 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> > wrote: >> >> Yes I agree, I too prefer the 1st solution: own specialization of >> complex<double>. Actually I hadn't thought of that possibility. >> >> Unfortunately, at least here in gcc 4.3, <complex> does specialize >> complex for float and double. Not just as an external, but the actual >> specialization is in the header. Which prevents us from doing our own. >> >> Moreover, comments refer to what seem to be sections in the C++ spec, >> so this is probably standard. > > I see, that's a pity. > >> So at this point I want to say: OK so the c++ standard expects the std >> lib implementors to fully take care of optimizing complex for float >> and double, fine! So the next thing to do is to check the assembly >> with some test program: enable SSE2 and check that the compiler really >> produces vectorized code. An addition of complex<double> should be >> vectorized for example. Looking at the <complex> header, there doesn't >> seem to be any explicit vectorization, but perhaps they know that the >> compiler does a good job here, or perhaps i don't understand what >> these __real__, __imag__ pseudo keywords mean. > > I've just checked, and g++ does not vectorize at all complex<double> :( > > > gael. > > >> >> Cheers, >> Benoit >> >> >> > yes of course we first need to get the vectorization of complex done the >> > right way, and I still don't know what's the best option. We can either >> > provide our own specialization of std::complex<double> implementation or >> > use >> > the ei_p* functions: >> > >> > Own specialization of std::complex<double> >> > * pro: >> > - enable its optimization everywhere >> > * cons: >> > - we and the user have to make sure a std::complex<double> is always >> > correctly aligned (easy to do unless the user starts doing fancy casts) >> > - on the long term, the next gen CPUs will have 256bit registers, so >> > a >> > packet of std::complex<double> won't be made of a single coeff >> > anymore... >> > >> > Using the ei_p* with packet of 1 coeff >> > * pro: >> > - no need to specialize std::complex<double> >> > * cons: >> > - need to enable the vectorization path (to use the ei_p*) even for >> > packets of one element, so we need another mechanism to specify whether >> > a >> > scalar type is vectorizable or not >> > - usually, the vectorized path does not really make sens for packet >> > size >> > of 1 coeff, (on the other hand, the products are implemented on this >> > model, >> > so why not) >> > >> > all in all, currently I don't see any argument which would strongly kill >> > one >> > option in favor of the other, and so I would vote for the first one (own >> > specialization of std::complex<double>) simply because it enables more >> > optimizations and it is probably easier to do. >> > >> > gael. >> > >> > >> >> >> >> Cheers, >> >> Benoit >> >> >> >> 2008/12/2 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>: >> >> > >> >> > all of this looks great, >> >> > >> >> > On Mon, Dec 1, 2008 at 11:31 PM, Benoit Jacob >> >> > <jacob.benoit.1@xxxxxxxxx> >> >> > wrote: >> >> >> >> >> >> 2008/12/1 Tim Molteno <tim@xxxxxxxxxxxxxxxxxxx>: >> >> >> >> >> >> for (unsigned i=0; i<N2; i++) >> >> >> { >> >> >> std::complex<T> temp(data[i+N2]*w); >> >> >> data[i+N2] = data[i]-temp; >> >> >> data[i] += temp; >> >> >> w += w*wp; >> >> >> } >> >> >> >> >> >> And I don't see any way to use Eigen expressions here, or to >> >> >> vectorize >> >> >> that loop as such (again, no contiguous access). >> >> > >> >> > what about the following (let's assume the size of a packet is 2): >> >> > >> >> > const int PacketSize = 2; >> >> > Scalar _w[PacketSize]; >> >> > _w[0] = 0; >> >> > for (int k=1; k<PacketSize; ++k) >> >> > _w[k] = _w[k-1] + _w[k-1] * wp; >> >> > Packet w = ei_pload(_w); >> >> > Packet wp2 = ei_pset1(wp); >> >> > for (int i=0; i<N2; i+=PacketSize) >> >> > { >> >> > Packet tmp = ei_pmul(ei_pload(data+i+N2), w); >> >> > Packet di = ei_pload(data+i); >> >> > ei_store(data+i+N2, ei_psub(di, tmp)); >> >> > ei_store(data+i, ei_padd(di, tmp)); >> >> > w = ei_madd(w, wp2, w); >> >> > } >> >> > >> >> > I think this should work whatever PacketSize is (including the case >> >> > when >> >> > the >> >> > vectorization is disabled, i.e., PacketSize = 1) >> >> > >> >> > gael. >> >> > >> >> >> >> --- >> >> >> > >> > >> >> --- >> > > ---

**Follow-Ups**:**Re: [eigen] A complex FFT for Eigen***From:*Gael Guennebaud

**References**:**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**Re: [eigen] A complex FFT for Eigen***From:*Gael Guennebaud

**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**Re: [eigen] A complex FFT for Eigen***From:*Gael Guennebaud

**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**Re: [eigen] A complex FFT for Eigen***From:*Gael Guennebaud

**Messages sorted by:**[ date | thread ]- Prev by Date:
**Re: [eigen] A complex FFT for Eigen** - Next by Date:
**Re: [eigen] A complex FFT for Eigen** - Previous by thread:
**Re: [eigen] A complex FFT for Eigen** - Next by thread:
**Re: [eigen] A complex FFT for Eigen**

Mail converted by MHonArc 2.6.19+ | http://listengine.tuxfamily.org/ |