Re: [eigen] A complex FFT for Eigen |

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

*To*: eigen@xxxxxxxxxxxxxxxxxxx*Subject*: Re: [eigen] A complex FFT for Eigen*From*: "Gael Guennebaud" <gael.guennebaud@xxxxxxxxx>*Date*: Tue, 2 Dec 2008 15:28:23 +0100*Dkim-signature*: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=CMjNdNBqrJUA9/nwO8NQWvpov4+fPwLThnplqEbFKOE=; b=TvtQsoVtxHS8SsTAOYBieajOBHdfNp3cMSy+R2kCOWRWhPsKB3q82xTPOv+V2lw9Iv UicT+fnnJ5tnZ22PGUAb9uHNM1FpptnxNM0XnXXomwYNefcZaGUB7ugecfBf+6t8KAW0 GJNfmP+DQsbcrrdeZ1V8kKmcHW2gSaBqynQsI=*Domainkey-signature*: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=hlfHCIDiyakEM1yD9Ty0yaJKqOXl0YxEZt/zjOHvnV9az3YGHJOqZDuRV5Ll1Q4z2E 4ged0V6qVE0cEIXSki82jbcaphTSUubaRsa8IpeOJInm1g/h5ppDHSKGE75ywHreKSCq a76OWslHOaN5CsEdYUoZ4bOx5Hu7+oWAYL8fs=

On Tue, Dec 2, 2008 at 2:10 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:

thanks for the fixes !

yes of course we first need to get the vectorization of complex done the right way, and I still don't know what's the best option. We can either provide our own specialization of std::complex<double> implementation or use the ei_p* functions:

Own specialization of std::complex<double>

* pro:

- enable its optimization everywhere

* cons:

- we and the user have to make sure a std::complex<double> is always correctly aligned (easy to do unless the user starts doing fancy casts)

- on the long term, the next gen CPUs will have 256bit registers, so a packet of std::complex<double> won't be made of a single coeff anymore...

Using the ei_p* with packet of 1 coeff

* pro:

- no need to specialize std::complex<double>

* cons:

- need to enable the vectorization path (to use the ei_p*) even for packets of one element, so we need another mechanism to specify whether a scalar type is vectorizable or not

- usually, the vectorized path does not really make sens for packet size of 1 coeff, (on the other hand, the products are implemented on this model, so why not)

all in all, currently I don't see any argument which would strongly kill one option in favor of the other, and so I would vote for the first one (own specialization of std::complex<double>) simply because it enables more optimizations and it is probably easier to do.

gael.

Arf good job! One typo, it's ei_pstore instead of ei_store, and

ei_pmadd instead of ei_madd (just in case Tim tries this code).

thanks for the fixes !

One thing though, is it premature optimization to write it in such a

lowlevel way even before we have global vectorization support for

complex numbers... I mean, your code is perfect for complex<float> but

leaves complex<double> outside in the cold. Maybe it'd be easier to

get a solution that works for both once we have global vectorization

support for complex. Maybe the first thing to do would be to add

specialisations of PacketMath.h for complex types, so we don't even

need to have our own complex class.

yes of course we first need to get the vectorization of complex done the right way, and I still don't know what's the best option. We can either provide our own specialization of std::complex<double> implementation or use the ei_p* functions:

Own specialization of std::complex<double>

* pro:

- enable its optimization everywhere

* cons:

- we and the user have to make sure a std::complex<double> is always correctly aligned (easy to do unless the user starts doing fancy casts)

- on the long term, the next gen CPUs will have 256bit registers, so a packet of std::complex<double> won't be made of a single coeff anymore...

Using the ei_p* with packet of 1 coeff

* pro:

- no need to specialize std::complex<double>

* cons:

- need to enable the vectorization path (to use the ei_p*) even for packets of one element, so we need another mechanism to specify whether a scalar type is vectorizable or not

- usually, the vectorized path does not really make sens for packet size of 1 coeff, (on the other hand, the products are implemented on this model, so why not)

all in all, currently I don't see any argument which would strongly kill one option in favor of the other, and so I would vote for the first one (own specialization of std::complex<double>) simply because it enables more optimizations and it is probably easier to do.

gael.

Cheers,

Benoit

2008/12/2 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:

--->

> all of this looks great,

>

> On Mon, Dec 1, 2008 at 11:31 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>

> wrote:

>>

>> 2008/12/1 Tim Molteno <tim@xxxxxxxxxxxxxxxxxxx>:

>>

>> for (unsigned i=0; i<N2; i++)

>> {

>> std::complex<T> temp(data[i+N2]*w);

>> data[i+N2] = data[i]-temp;

>> data[i] += temp;

>> w += w*wp;

>> }

>>

>> And I don't see any way to use Eigen expressions here, or to vectorize

>> that loop as such (again, no contiguous access).

>

> what about the following (let's assume the size of a packet is 2):

>

> const int PacketSize = 2;

> Scalar _w[PacketSize];

> _w[0] = 0;

> for (int k=1; k<PacketSize; ++k)

> _w[k] = _w[k-1] + _w[k-1] * wp;

> Packet w = ei_pload(_w);

> Packet wp2 = ei_pset1(wp);

> for (int i=0; i<N2; i+=PacketSize)

> {

> Packet tmp = ei_pmul(ei_pload(data+i+N2), w);

> Packet di = ei_pload(data+i);

> ei_store(data+i+N2, ei_psub(di, tmp));

> ei_store(data+i, ei_padd(di, tmp));

> w = ei_madd(w, wp2, w);

> }

>

> I think this should work whatever PacketSize is (including the case when the

> vectorization is disabled, i.e., PacketSize = 1)

>

> gael.

>

**Follow-Ups**:**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**References**:**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**Re: [eigen] A complex FFT for Eigen***From:*Gael Guennebaud

**Re: [eigen] A complex FFT for Eigen***From:*Benoit Jacob

**Messages sorted by:**[ date | thread ]- Prev by Date:
**Re: [eigen] A complex FFT for Eigen** - Next by Date:
**Re: [eigen] A complex FFT for Eigen** - Previous by thread:
**Re: [eigen] A complex FFT for Eigen** - Next by thread:
**Re: [eigen] A complex FFT for Eigen**

Mail converted by MHonArc 2.6.19+ | http://listengine.tuxfamily.org/ |