On Mon, Dec 1, 2008 at 11:31 PM, Benoit Jacob
<jacob.benoit.1@xxxxxxxxx> wrote:
> 2008/12/1 Tim Molteno <tim@xxxxxxxxxxxxxxxxxxx>:
>> for (unsigned i=0; i<N2; i++)
>> {
>>   std::complex<T> temp(data[i+N2]*w);
>>   data[i+N2] = data[i]-temp;
>>   data[i] += temp;
>>   w += w*wp;
>> }
>
> And I don't see any way to use Eigen expressions here, or to vectorize
> that loop as such (again, no contiguous access).
What about the following (let's assume the size of a packet is 2):
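const int PacketSize = 2;
Scalar _w[PacketSize];
_w[0] = 1; // first twiddle factor (assuming the scalar loop starts from w = 1)
for (int k=1; k<PacketSize; ++k)
  _w[k] = _w[k-1] + _w[k-1] * wp; // _w[k] = (1+wp)^k
Packet w = ei_pload(_w);
// each lane has to advance by PacketSize steps per iteration,
// so the increment is (1+wp)^PacketSize - 1 rather than wp
Packet wp2 = ei_pset1(_w[PacketSize-1] + _w[PacketSize-1] * wp - Scalar(1));
for (int i=0; i<N2; i+=PacketSize) // N2 must be a multiple of PacketSize
{
  Packet tmp = ei_pmul(ei_pload(data+i+N2), w);
  Packet di = ei_pload(data+i);
  ei_pstore(data+i+N2, ei_psub(di, tmp));
  ei_pstore(data+i, ei_padd(di, tmp));
  w = ei_pmadd(w, wp2, w); // w = w*wp2 + w
}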
I think this should work whatever PacketSize is, including the case where vectorization is disabled, i.e., PacketSize = 1.
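
To check the idea without Eigen's internal ei_* functions, here is a minimal self-contained sketch in which the packet is emulated by a plain array of PacketSize lanes. The names butterfly_scalar, butterfly_blocked and max_abs_diff are made up for this illustration. It assumes that w starts at 1 and that N2 is a multiple of PacketSize. Since lane k holds (1+wp)^k and every lane has to jump PacketSize positions per iteration, the sketch uses the increment (1+wp)^PacketSize - 1.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> Scalar;

// Reference: the scalar butterfly loop from the thread (assumes w starts at 1).
void butterfly_scalar(std::vector<Scalar>& data, std::size_t n2, Scalar wp)
{
    Scalar w(1.0, 0.0);
    for (std::size_t i = 0; i < n2; ++i) {
        Scalar temp = data[i + n2] * w;
        data[i + n2] = data[i] - temp;
        data[i] += temp;
        w += w * wp;
    }
}

// Blocked variant: a "packet" is emulated by PacketSize array lanes.
template <std::size_t PacketSize>
void butterfly_blocked(std::vector<Scalar>& data, std::size_t n2, Scalar wp)
{
    assert(n2 % PacketSize == 0);  // assumption: no scalar tail loop here

    Scalar w[PacketSize];
    w[0] = Scalar(1.0, 0.0);
    for (std::size_t k = 1; k < PacketSize; ++k)
        w[k] = w[k - 1] + w[k - 1] * wp;  // w[k] = (1+wp)^k

    // Increment shared by all lanes: (1+wp)^PacketSize - 1.
    const Scalar step = w[PacketSize - 1] + w[PacketSize - 1] * wp - Scalar(1.0, 0.0);

    for (std::size_t i = 0; i < n2; i += PacketSize) {
        for (std::size_t k = 0; k < PacketSize; ++k) {  // one lane per k
            Scalar temp = data[i + k + n2] * w[k];
            data[i + k + n2] = data[i + k] - temp;
            data[i + k] += temp;
            w[k] += w[k] * step;  // advance this lane by PacketSize positions
        }
    }
}

// Largest absolute difference between two equally sized arrays.
double max_abs_diff(const std::vector<Scalar>& a, const std::vector<Scalar>& b)
{
    double m = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = std::abs(a[i] - b[i]);
        if (d > m) m = d;
    }
    return m;
}
```

Running butterfly_scalar and butterfly_blocked<2> (or <1>, <4>) on copies of the same data should give the same result up to rounding, which max_abs_diff can confirm. With PacketSize = 1 the increment reduces to wp, i.e. the original loop.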
gael.