Re: [eigen] A complex FFT for Eigen
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] A complex FFT for Eigen
- From: "Benoit Jacob" <jacob.benoit.1@xxxxxxxxx>
- Date: Mon, 1 Dec 2008 23:31:55 +0100
2008/12/1 Tim Molteno <tim@xxxxxxxxxxxxxxxxxxx>:
>> - replace for loops by Eigen expressions as much as possible
>>
>
> Agree that this is a useful goal, but I see this as a longer-term development.
> Once the FFT and testharnesses are solidly there, then this is best done
> hand-in-hand with benchmarking tools. There are also many many people out
> there who know far more than I about vectorizing FFT code.
Never mind what I said. I looked back at the for loop:
  for (unsigned i=0; i<N2; i++)
  {
    std::complex<T> temp(data[i+N2]*w);
    data[i+N2] = data[i]-temp;
    data[i] += temp;
    w += w*wp;
  }
And I don't see any way to use Eigen expressions here, or to vectorize
that loop as such (again, no contiguous access).
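For reference, here is a minimal, self-contained sketch of the loop
quoted above, wrapped in a function so that it can be compiled and
tested in isolation. The function name butterfly_pass, the theta
parameter, and the initial values of w and wp are assumptions made for
illustration (they are not taken from Tim's code): w is assumed to
start at 1, and wp is assumed to be exp(i*theta) - 1, so that
"w += w*wp" advances w to w*exp(i*theta).

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical wrapper around the quoted loop: one butterfly pass that
// combines data[0..N2) with data[N2..2*N2).
// Assumptions (not from the original code): w starts at 1 and
// wp = exp(i*theta) - 1, so "w += w*wp" multiplies w by exp(i*theta).
template <typename T>
void butterfly_pass(std::vector<std::complex<T>>& data, unsigned N2, T theta)
{
    std::complex<T> w(1, 0);
    const std::complex<T> wp(std::cos(theta) - 1, std::sin(theta));
    for (unsigned i = 0; i < N2; ++i)
    {
        const std::complex<T> temp(data[i + N2] * w);
        data[i + N2] = data[i] - temp;
        data[i] += temp;
        w += w * wp;  // the next iteration uses the w computed here
    }
}
```

With N2 = 2 and theta = -pi/2, the twiddle factors are 1 and -i, which
is the final pass of a 4-point forward transform.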
The good thing is that that's one less thing to require before
inclusion in Eigen... the bad thing is that then I have no clue how
FFTW achieves better performance.
As for the other loop, in factorize(), we already discussed that it
can't be vectorized.
One thing to consider, perhaps: if one expands the N=4 or N=8 case by
hand, one may notice a reordering of operations that allows
vectorization, or at least reduces the number of loads and stores.
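As a rough illustration of what expanding the N=4 case by hand can look
like, here is a sketch of a 4-point forward DFT written out explicitly.
The name dft4 and the sign convention exp(-2*pi*i*j*k/4) are
assumptions chosen for this example; this is not how FFTW or the code
under discussion does it. In this form all twiddle factors are 1, -1,
i or -i, so no general complex multiplication is left: each input is
loaded once and only additions, subtractions and one real/imaginary
swap remain.

```cpp
#include <array>
#include <complex>

// Hand-expanded 4-point forward DFT (hypothetical helper), using the
// convention X[k] = sum_j x[j] * exp(-2*pi*i*j*k/4).
template <typename T>
std::array<std::complex<T>, 4> dft4(const std::array<std::complex<T>, 4>& x)
{
    using C = std::complex<T>;
    const C s02 = x[0] + x[2];
    const C d02 = x[0] - x[2];
    const C s13 = x[1] + x[3];
    const C d13 = x[1] - x[3];
    // Multiplying by -i swaps the real and imaginary parts and flips
    // one sign, so no actual multiplication is needed.
    const C md13(d13.imag(), -d13.real());  // equals -i * d13
    return { s02 + s13, d02 + md13, s02 - s13, d02 - md13 };
}
```

Whether a compiler or hand-written SIMD code can exploit this better
than the generic loop would still have to be checked by benchmarking.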
> I don't think that all the 3 goals
Only 2 remain now... sorry for the noise.
> will be met for quite some time --
> particularly the dynamic sizes and the fftw backend. However, I'll start over
> the next few days putting together a module, using the LU code as a model
Great, looking forward to it.
Cheers,
Benoit