Re: [eigen] Array of complex numbers
- To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [eigen] Array of complex numbers
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Thu, 10 Jan 2019 22:25:58 +0100
Just to be sure I fully understand the operations you're considering, let's assume you have:
VectorXcf A1(300), A2(300), ...;
complex<float> c1, c2, ...;
float r1, r2, ...;
then "Scalar product" means:
and "Linear combination" means:
r1*A1 + r2*A2 + ....
If this is right, then linear combinations with real coefficients should already be 100% optimal, and the only problematic operation could be dot products, because of the complex*complex product overhead. I think those can be optimized while keeping an AoS layout by accumulating within multiple packets that we combine only at the end of the reduction, instead of doing an expensive combination for every product. This is similar to what we do in GEMM.
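To make the idea concrete, here is a minimal portable sketch of the deferred combination, written with plain loops rather than intrinsics. It is not the attached proof of concept and not Eigen code: the function name dot_deferred, the lane count kLanes = 8 (chosen to mimic one AVX packet of floats) and the scalar tail loop are all assumptions made for illustration. Like the proof of concept, it computes the non-conjugated product sum.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical lane count: 8 floats, i.e. 4 interleaved complex<float>,
// mimicking one AVX packet. An assumption of this sketch only.
constexpr std::size_t kLanes = 8;

// Non-conjugated dot product sum_k a[k]*b[k] on AoS (re,im interleaved) data.
// Two packet-sized accumulators are kept; the real/imaginary combination is
// done once, after the loop, instead of once per product.
std::complex<float> dot_deferred(const std::vector<std::complex<float>>& a,
                                 const std::vector<std::complex<float>>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();

    // std::complex<float> arrays may be accessed as interleaved floats.
    const float* pa = reinterpret_cast<const float*>(a.data());
    const float* pb = reinterpret_cast<const float*>(b.data());

    float acc_same[kLanes] = {};  // even lanes: ar*br, odd lanes: ai*bi
    float acc_swap[kLanes] = {};  // even lanes: ar*bi, odd lanes: ai*br

    const std::size_t full = (2 * n / kLanes) * kLanes;  // floats in whole packets
    for (std::size_t i = 0; i < full; i += kLanes) {
        for (std::size_t l = 0; l < kLanes; ++l) {
            acc_same[l] += pa[i + l] * pb[i + l];
            acc_swap[l] += pa[i + l] * pb[i + (l ^ 1)];  // b with re/im swapped
        }
    }

    // Single combination at the end of the reduction.
    float re = 0.0f, im = 0.0f;
    for (std::size_t l = 0; l < kLanes; l += 2) {
        re += acc_same[l] - acc_same[l + 1];
        im += acc_swap[l] + acc_swap[l + 1];
    }

    // Scalar tail for sizes that are not a multiple of kLanes / 2.
    std::complex<float> tail(0.0f, 0.0f);
    for (std::size_t k = full / 2; k < n; ++k) tail += a[k] * b[k];

    return std::complex<float>(re, im) + tail;
}
```

In a real SIMD version the inner loop over l would be two multiply-accumulate instructions plus one in-packet swap of b, so the per-product shuffle and add/sub that a full complex multiply needs would disappear from the hot loop.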
I gave it a shot, and I get speed similar to that of a SoA layout. See the attached file. This is a proof of concept: AVX only, no conjugation, n%16==0, etc.
I think this can be integrated within Eigen by generalizing a bit the current reduction code to allow for custom accumulation packets.