Sure, compatibility is one of the many reasons why the default will
remain the current, standard "interleaved" scheme, which is really the
natural one. There are other fundamental reasons: for example, a default
storage scheme simply can't be made vectorization-dependent, as that
would break our rule that enabling/disabling vectorization doesn't
break the API.

Anyway. Yes, the main reason why I believed that the other storage
scheme could be faster was that multiplication was going to be faster.

Anyone is welcome to study the other storage scheme in a personal
branch, but I'd be (pleasantly) surprised if this could be done without
very big (not worth it) changes in Eigen....

Benoit

2009/5/19 Rohit Garg <rpg.314@xxxxxxxxx>:
> This is probably not a good idea. I believe that they should be stored
> in the interleaved format. I'll be happy to pitch in with SSE2/3
> intrinsics code for complex multiplication, division if necessary. I
> think we should go the standard way as many other libraries and
> std::complex use it.
>
> So far, on this discussion, the only reason for not using the
> interleaved format that I have seen is that it is tricky to multiply
> using that. Is there any other reason?
>
> IMHO, we shouldn't lose compatibility with ~90% of other complex
> libraries/formats just to simplify multiplication.
>
> On Tue, May 19, 2009 at 5:49 AM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> I can believe that this is probably a very efficient storage scheme.
>> We could offer this as an option if really it's not too hard to
>> implement (I didn't start thinking about this).
>>
>> The default should remain the current for many reasons, but as an
>> option why not.
>>
>> Cheers,
>> Benoit
>>
>> 2009/5/19 Márton Danóczy <marton78@xxxxxxxxx>:
>>>> I concur: I don't think that it would be very useful to have complex
>>>> matrices with the real and imaginary parts stored separately. Most
>>>> operations -- and the more costly ones -- would run slower in such a
>>>> scheme. The basic issue here is memory locality.
>>>
>>> What about storing them packet by packet? That is, in case of floats,
>>> four real parts followed by four imaginary parts. That would not be
>>> too hard to implement and vectorization of component-wise operations
>>> would be trivial. And I think even FFTW can handle that using the guru
>>> interface, by setting up a split FFT plan with a stride of
>>> 2*packetsize.
>>>
>>> Marton
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay
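
For reference, here is a minimal sketch of the "packet by packet" layout
proposed in the quoted message, next to the interleaved layout used by
std::complex. It is plain scalar C++, not Eigen code and not SSE
intrinsics. The names kPacket, to_blocked, from_blocked and mul_blocked
are hypothetical and made up for this illustration, and the packet size
of 4 floats is an assumption matching one SSE register.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Assumed packet size: 4 floats, i.e. one SSE register (hypothetical constant).
constexpr std::size_t kPacket = 4;

// Convert interleaved storage (re0, im0, re1, im1, ...) into the blocked
// layout: 4 real parts followed by 4 imaginary parts, repeated per block.
// Assumes the number of elements is a multiple of kPacket.
std::vector<float> to_blocked(const std::vector<std::complex<float>>& in) {
    assert(in.size() % kPacket == 0);
    std::vector<float> out(2 * in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::size_t base = 2 * kPacket * (i / kPacket);
        const std::size_t lane = i % kPacket;
        out[base + lane] = in[i].real();
        out[base + kPacket + lane] = in[i].imag();
    }
    return out;
}

// Inverse of to_blocked: rebuild interleaved std::complex values.
std::vector<std::complex<float>> from_blocked(const std::vector<float>& in) {
    assert(in.size() % (2 * kPacket) == 0);
    std::vector<std::complex<float>> out(in.size() / 2);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::size_t base = 2 * kPacket * (i / kPacket);
        const std::size_t lane = i % kPacket;
        out[i] = std::complex<float>(in[base + lane], in[base + kPacket + lane]);
    }
    return out;
}

// Component-wise complex product on the blocked layout. Each inner loop
// reads 4 contiguous reals and 4 contiguous imaginaries per operand, so
// no element shuffling is needed between real and imaginary parts.
std::vector<float> mul_blocked(const std::vector<float>& a,
                               const std::vector<float>& b) {
    assert(a.size() == b.size() && a.size() % (2 * kPacket) == 0);
    std::vector<float> c(a.size());
    for (std::size_t base = 0; base < a.size(); base += 2 * kPacket) {
        for (std::size_t lane = 0; lane < kPacket; ++lane) {
            const float ar = a[base + lane], ai = a[base + kPacket + lane];
            const float br = b[base + lane], bi = b[base + kPacket + lane];
            c[base + lane] = ar * br - ai * bi;            // real part
            c[base + kPacket + lane] = ar * bi + ai * br;  // imaginary part
        }
    }
    return c;
}
```

The conversion functions also show the compatibility cost raised in the
thread: data coming from std::complex or from other libraries would have
to be repacked before and after use.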

