Re: [eigen] Re: recent improvements in the products
Yes, it is something like that, but a bit more sophisticated, especially because the blocks have to be copied into a packed layout so that the product kernel can run at optimal performance.
To be honest I've followed Goto's paper:
http://www.cs.utexas.edu/users/flame/pubs/GotoTOMS2.pdf
with some improvements to optimize code reuse.
The details of the kernel are here:
http://www.cs.utexas.edu/users/flame/pubs/GotoTOMS_final.pdf
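
To make the "pack, then call one kernel" idea concrete, here is a minimal sketch in plain C++. It is not the actual Eigen code: the Mat type, the function names and the block sizes mc and kc are all hypothetical, and the kernel is a simple triple loop rather than a vectorized one. It only illustrates how a high-level blocking loop copies blocks into contiguous buffers before handing them to a single product kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major matrix type, used only for this sketch.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i + j * rows]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i + j * rows]; }
};

// Copy the nr x nc block of M starting at (r0, c0) into a contiguous
// column-major buffer, so the kernel can read it with unit stride.
void pack_block(const Mat& M, std::size_t r0, std::size_t c0,
                std::size_t nr, std::size_t nc, std::vector<double>& buf) {
    buf.resize(nr * nc);
    for (std::size_t j = 0; j < nc; ++j)
        for (std::size_t i = 0; i < nr; ++i)
            buf[i + j * nr] = M(r0 + i, c0 + j);
}

// The single product kernel: rows i0.. of C get a * b added to them,
// where a is a packed mb x kb block and b is a packed kb x n panel.
void kernel(const std::vector<double>& a, const std::vector<double>& b,
            std::size_t mb, std::size_t kb, std::size_t n,
            Mat& C, std::size_t i0) {
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t p = 0; p < kb; ++p) {
            const double bpj = b[p + j * kb];
            for (std::size_t i = 0; i < mb; ++i)
                C(i0 + i, j) += a[i + p * mb] * bpj;
        }
}

// High-level blocking loop computing C += A * B.
// mc and kc are assumed block sizes; real values depend on the cache sizes.
void blocked_product(const Mat& A, const Mat& B, Mat& C,
                     std::size_t mc = 64, std::size_t kc = 64) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    assert(mc > 0 && kc > 0);
    std::vector<double> packedA, packedB;
    for (std::size_t p0 = 0; p0 < A.cols; p0 += kc) {
        const std::size_t kb = std::min(kc, A.cols - p0);
        // Pack a kb x n horizontal panel of B once, reuse it for every block of A.
        pack_block(B, p0, 0, kb, B.cols, packedB);
        for (std::size_t i0 = 0; i0 < A.rows; i0 += mc) {
            const std::size_t mb = std::min(mc, A.rows - i0);
            // Pack the mb x kb block of A that multiplies this panel.
            pack_block(A, i0, p0, mb, kb, packedA);
            kernel(packedA, packedB, mb, kb, B.cols, C, i0);
        }
    }
}
```

Note that C has to be zeroed by the caller if a plain product is wanted, since the sketch accumulates into it. In the papers the packed layout is further arranged to match what the inner kernel consumes; the sketch keeps a plain column-major copy for readability.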
cheers,
Gael.
On Tue, Jul 28, 2009 at 8:49 PM, Jitse Niesen <jitse@xxxxxxxxxxxxxxxxx> wrote:
> On Tue, 28 Jul 2009, Gael Guennebaud wrote:
>> * all these new routines are just high level blocking algorithms built on
>> top of a single highly optimized product kernel.
>
> With "blocking algorithm", do you mean that you divide the matrices up into blocks to reduce cache misses? For instance, to compute the product of two N-by-N matrices, partition them as (N/n)-by-(N/n) block matrices of n-by-n blocks, and multiply the blocks. I had a quick look at the code and that does not seem to be what you're doing ...
>
> Cheers,
> Jitse
--
Gaël Guennebaud
Iparla - INRIA Bordeaux
(+33)5 40 00 37 95