[eigen] optimization question


*To*: eigen@xxxxxxxxxxxxxxxxxxx
*Subject*: [eigen] optimization question
*From*: Michel <michel.pacilli@xxxxxxx>
*Date*: Tue, 04 Oct 2011 02:32:22 +0200

Hi,

I'm not sure this is the right place to ask a user question; tell me if it isn't.

I'm trying to get the best out of a simple Eigen example, and I'm not sure I'm getting the most out of it:

    #define N 32768
    Matrix<float,N,1> u;
    Matrix<float,N,1> v;
    Matrix<float,N,1> w;
    for(int k=0; k <NLOOP; ++k)
        u = v.array() * w.array();

It is compiled with gcc and the SSE2 flag.

Compared to a simple for loop over aligned arrays, I get around a 17% speed-up with Eigen ;) But is it possible to give some hints at compile time to go further, with unrolling, SSE3/SSE4, or other things?

The asm of the product is:

    # 86 "..\eigen\main.cpp" 1
    #it begins here!
    # 0 "" 2
    /NO_APP
        xorl %eax, %eax
        .p2align 4,,10
    L3:
        movaps (%esi,%eax,4), %xmm0
        mulps (%ebx,%eax,4), %xmm0
        movaps %xmm0, (%edx,%eax,4)
        addl $4, %eax
        cmpl $32768, %eax
        jne L3
    /APP
    # 88 "..\eigen\main.cpp" 1
    #it ends here!

I wonder if it could be more efficient with more than just one xmm register, or with prefetching?

With my best regards for this great work,
michel pacilli
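To make the "more than one xmm register" idea concrete, here is a minimal hand-written sketch of the same coefficient-wise product, unrolled by two so that each iteration keeps two independent xmm registers in flight. It is not Eigen code and not what Eigen generates: the function name `cwise_product_unrolled` is hypothetical, and the sketch assumes 16-byte-aligned pointers and a length that is a multiple of 8. Whether it actually beats the single-register loop above is an open question that only a measurement on the target machine can answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics: _mm_load_ps, _mm_mul_ps, _mm_store_ps
#endif

// Hypothetical helper (not part of Eigen): computes u[i] = v[i] * w[i].
// Assumptions: n is a multiple of 8 and all three pointers are 16-byte aligned.
inline void cwise_product_unrolled(float* u, const float* v, const float* w,
                                   std::size_t n)
{
    assert(n % 8 == 0);
    assert(reinterpret_cast<std::uintptr_t>(u) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(v) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(w) % 16 == 0);
#if defined(__SSE__)
    // Two packets of four floats per iteration, held in separate registers.
    for (std::size_t i = 0; i < n; i += 8) {
        __m128 a0 = _mm_load_ps(v + i);
        __m128 a1 = _mm_load_ps(v + i + 4);
        a0 = _mm_mul_ps(a0, _mm_load_ps(w + i));
        a1 = _mm_mul_ps(a1, _mm_load_ps(w + i + 4));
        _mm_store_ps(u + i, a0);
        _mm_store_ps(u + i + 4, a1);
    }
#else
    // Scalar fallback when SSE is not enabled at compile time.
    for (std::size_t i = 0; i < n; ++i)
        u[i] = v[i] * w[i];
#endif
}
```

Timing this against the Eigen expression and the plain loop, with the same flags and the same NLOOP, would show whether the extra register helps or whether the loop is limited by memory traffic instead.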

**Follow-Ups**: **Re: [eigen] optimization question** *From:* Benoit Jacob

**References**: **AW: [eigen] New release?** *From:* Schmidt, Michael

