[eigen] Use of streaming loads in eigen

In the PacketMath.h file, the only way Eigen writes the SSE registers
back to memory is through the _mm_store_ intrinsics. Has anybody
looked at using the _mm_stream_ intrinsics instead? They should help
at least in BLAS 1 and BLAS 2 code, which is fundamentally
bandwidth bound. For a rough benchmark, look here:

http://humus.name/index.php?page=Comments&ID=244

Even for BLAS 3 it should only help. However, for small sizes it will
definitely not be helpful: small data does not pollute the cache, so
one wants to keep it in the cache as long as possible, and a streaming
store bypasses the cache.
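
To make the comparison concrete, here is a minimal sketch of a BLAS 1
style kernel (y = a * x) written once with regular stores and once
with streaming stores. It is only an illustration, not Eigen's actual
code: the names scale_store and scale_stream are hypothetical, and it
assumes that x and y are 16-byte aligned and that n is a multiple of 4.

```cpp
#include <xmmintrin.h>  // SSE: _mm_load_ps, _mm_store_ps, _mm_stream_ps, _mm_sfence
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is 16-byte aligned (required by the
// aligned load/store and by the streaming store used below).
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// y = a * x with regular stores: the written lines go through the cache.
inline void scale_store(const float* x, float* y, float a, std::size_t n) {
    assert(n % 4 == 0 && is_aligned16(x) && is_aligned16(y));
    const __m128 va = _mm_set1_ps(a);
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_store_ps(y + i, _mm_mul_ps(va, _mm_load_ps(x + i)));
    }
}

// y = a * x with streaming (non-temporal) stores: the writes are hinted
// to bypass the cache, which may help when y is too large to be reused
// from cache anyway.
inline void scale_stream(const float* x, float* y, float a, std::size_t n) {
    assert(n % 4 == 0 && is_aligned16(x) && is_aligned16(y));
    const __m128 va = _mm_set1_ps(a);
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_stream_ps(y + i, _mm_mul_ps(va, _mm_load_ps(x + i)));
    }
    // Streaming stores are weakly ordered; fence before other code
    // relies on the results being globally visible.
    _mm_sfence();
}
```

Both functions produce the same result; only the cache behaviour of
the writes differs. Whether the streaming version is actually faster
would have to be measured for each size, since for small vectors the
regular store is expected to win.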

-- 
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay


