Re: [eigen] Vectorization of complex

On Fri, Jan 21, 2011 at 1:28 PM, Christoph Hertzberg
<chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 21.01.2011 13:02, Gael Guennebaud wrote:
>> note that our matrix-matrix product kernel for complexes does not use
>> this pmul function which is rather slow. The trick is to split the
>> products between the real and imaginary part and combine them at the
>> end of a series of mul-add.
> BTW: Have you tried the "3-multiplication-trick" for complex matrix
> multiplication yet:
> (A+iB)*(C+iD) = AC - BD + i[(A+B)*(C+D) - AC - BD]
> For big enough matrices this could give almost 25% performance gain --
> at cost of little precision loss (could actually be quite large, e.g. if
> the imaginary part is much smaller than the real part).

As you say, this trick is not numerically accurate enough to be used in practice.
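
For reference, here is a minimal sketch of the identity quoted above next to the usual four-product form, so the precision issue can be reproduced. It is plain C++ with naive loops, not Eigen code; the names Mat, CMat, matmul, mul4 and mul3 are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Dense n x n real matrix, row-major. Illustrative type, not an Eigen class.
using Mat = std::vector<double>;

// Complex matrix stored as separate real and imaginary parts (hypothetical layout).
struct CMat { Mat re, im; };

// Naive real product x * y of two n x n matrices.
Mat matmul(const Mat& x, const Mat& y, std::size_t n) {
    Mat r(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                r[i * n + j] += x[i * n + k] * y[k * n + j];
    return r;
}

Mat add(const Mat& x, const Mat& y) {
    Mat r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) r[i] = x[i] + y[i];
    return r;
}

Mat sub(const Mat& x, const Mat& y) {
    Mat r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) r[i] = x[i] - y[i];
    return r;
}

// Four real products: (A+iB)(C+iD) = (AC - BD) + i(AD + BC).
CMat mul4(const CMat& p, const CMat& q, std::size_t n) {
    return { sub(matmul(p.re, q.re, n), matmul(p.im, q.im, n)),
             add(matmul(p.re, q.im, n), matmul(p.im, q.re, n)) };
}

// Three real products: imaginary part is (A+B)(C+D) - AC - BD.
CMat mul3(const CMat& p, const CMat& q, std::size_t n) {
    Mat ac = matmul(p.re, q.re, n);
    Mat bd = matmul(p.im, q.im, n);
    Mat s  = matmul(add(p.re, p.im), add(q.re, q.im), n);
    return { sub(ac, bd), sub(sub(s, ac), bd) };
}
```

With small integer entries both functions agree exactly. With a 1x1 input such as 1 + 1e-17 i multiplied by itself, the sum A+B rounds to 1, so mul3 loses the imaginary part entirely while mul4 returns 2e-17.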

>> Well this pmul function is actually used N^2 times for the
>> multiplication with alpha. Recall that our kernel computes C += alpha
>> * A * B, and even if you only do C = A*B this product with alpha is
>> still there, taking alpha = 1.
> Why? I admit this might be necessary in non-template libraries to
> reduce/avoid code duplication, but I would have assumed that this can be
> avoided by template specializations somehow (I have never checked your
> multiplication-kernel though ...)
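
To make the shape under discussion concrete, here is a rough sketch of a single routine computing C += alpha * A * B, with the plain product C = A*B going through the same routine with alpha = 1. This is only an illustration with naive loops and made-up names (gemm_accumulate, product); it is not Eigen's actual kernel. It does show why the multiplication by alpha happens once per output coefficient, i.e. N^2 times.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical single code path: c += alpha * a * b for n x n row-major matrices.
template <typename Scalar>
void gemm_accumulate(const std::vector<Scalar>& a,
                     const std::vector<Scalar>& b,
                     std::vector<Scalar>& c,
                     std::size_t n,
                     Scalar alpha) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            Scalar acc(0);
            // Inner series of multiply-adds: N^3 scalar products in total.
            for (std::size_t k = 0; k < n; ++k)
                acc += a[i * n + k] * b[k * n + j];
            // One product with alpha per output coefficient: N^2 in total.
            c[i * n + j] += alpha * acc;
        }
    }
}

// C = A * B reuses the same routine, starting from zero with alpha = 1.
template <typename Scalar>
std::vector<Scalar> product(const std::vector<Scalar>& a,
                            const std::vector<Scalar>& b,
                            std::size_t n) {
    std::vector<Scalar> c(n * n, Scalar(0));
    gemm_accumulate(a, b, c, n, Scalar(1));
    return c;
}
```

A separate instantiation without alpha would save only the N^2 products on the last line of the loop body, at the cost of compiling the whole routine a second time.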

The code is very heavy, so we don't want to instantiate almost the
same code twice for a very small performance gain. Also note that
this code is for dynamic-size matrices. For small fixed-size
ones, the path is different without such an implicit multiplicative

