Re: [eigen] FFT for Eigen


Sure, compatibility is one of the many reasons why the default will
remain the current, standard "interleaved" scheme, which is really the
natural one. There are other fundamental reasons too. For example, a
default storage scheme simply can't be made vectorization-dependent, as
that would break our rule that enabling/disabling vectorization
doesn't break the API.
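
To make the compatibility point concrete, here is a minimal sketch of what
"interleaved" means for std::complex. It relies on the array-oriented access
that the C++ standard (C++11 and later) guarantees for std::complex; the helper
name as_interleaved is made up for this illustration and is not an Eigen
function.

```cpp
#include <complex>
#include <vector>

// Hypothetical helper (not part of Eigen): view a buffer of
// std::complex<float> as raw floats laid out re0, im0, re1, im1, ...
// C++11 and later guarantee this layout for arrays of std::complex<T>.
inline const float* as_interleaved(const std::vector<std::complex<float>>& v) {
    return reinterpret_cast<const float*>(v.data());
}
```

Any library that expects interleaved complex data can consume such a pointer
directly, with no copy or conversion.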

Anyway. Yes, the main reason why I believed that the other storage
scheme could be faster was that multiplication was going to be
faster.
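
As a rough sketch of why multiplication differs between the two layouts, the
scalar code below multiplies complex arrays stored interleaved and stored
packet by packet (four reals followed by four imaginaries, as proposed further
down the thread). The function names and the constant kPacket are assumptions
made for this example, not Eigen code, and the inner loop only stands in for
what a SIMD packet operation would do.

```cpp
#include <cstddef>

// Interleaved layout: re0, im0, re1, im1, ...
// n is the number of complex values; each buffer holds 2*n floats.
void mul_interleaved(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        const float ar = a[2 * i], ai = a[2 * i + 1];
        const float br = b[2 * i], bi = b[2 * i + 1];
        // Real and imaginary parts sit next to each other, so a SIMD
        // version has to shuffle lanes before it can combine them.
        out[2 * i]     = ar * br - ai * bi;
        out[2 * i + 1] = ar * bi + ai * br;
    }
}

// Assumed packet size: 4 floats, as in an SSE register.
constexpr std::size_t kPacket = 4;

// Packet-split layout (hypothetical): kPacket real parts, then kPacket
// imaginary parts, repeated. n must be a multiple of kPacket.
void mul_packet_split(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t base = 0; base < 2 * n; base += 2 * kPacket) {
        const float* ar = a + base;
        const float* ai = ar + kPacket;
        const float* br = b + base;
        const float* bi = br + kPacket;
        float* out_r = out + base;
        float* out_i = out_r + kPacket;
        // Each statement below maps onto whole-packet multiplies and
        // adds, with no lane shuffling.
        for (std::size_t k = 0; k < kPacket; ++k) {
            out_r[k] = ar[k] * br[k] - ai[k] * bi[k];
            out_i[k] = ar[k] * bi[k] + ai[k] * br[k];
        }
    }
}
```

Both functions compute the same products; only the memory layout of the inputs
and outputs differs, which is exactly what would have to change throughout
Eigen to support the second scheme.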

Anyone is welcome to study the other storage scheme in a personal
branch, but I'd be (pleasantly) surprised if this could be done without
very big (not worth it) changes in Eigen.

Benoit

2009/5/19 Rohit Garg <rpg.314@xxxxxxxxx>:
> This is probably not a good idea. I believe that they should be stored
> in the interleaved format. I'll be happy to pitch in with SSE2/3
> intrinsics code for complex multiplication and division if necessary. I
> think we should go the standard way, since many other libraries and
> std::complex use it.
>
> So far in this discussion, the only reason I have seen for not using
> the interleaved format is that multiplication is tricky with it. Is
> there any other reason?
>
> IMHO, we shouldn't lose compatibility with ~90% of other complex
> libraries/formats just to simplify multiplication.
>
> On Tue, May 19, 2009 at 5:49 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> I can believe that this is probably a very efficient storage scheme.
>> We could offer this as an option if it's really not too hard to
>> implement (I haven't started thinking about this).
>>
>> The default should remain the current one for many reasons, but as an
>> option, why not?
>>
>> Cheers,
>> Benoit
>>
>>
>> 2009/5/19 Márton Danóczy <marton78@xxxxxxxxx>:
>>>> I concur: I don't think that it would be very useful to have complex
>>>> matrices with the real and imaginary parts stored separately. Most
>>>> operations -- and the more costly ones -- would run slower in such a
>>>> scheme. The basic issue here is memory locality.
>>>
>>> What about storing them packet by packet? That is, in case of floats,
>>> four real parts followed by four imaginary parts. That would not be
>>> too hard to implement and vectorization of component-wise operations
>>> would be trivial. And I think even FFTW can handle that using the guru
>>> interface, by setting up a split fft plan with a stride of
>>> 2*packetsize.
>>>
>>> Marton
>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
>
>
>


