[eigen] optimization question

To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: [eigen] optimization question
From: Michel <michel.pacilli@xxxxxxx>
Date: Tue, 04 Oct 2011 02:32:22 +0200

Hi,

I'm not sure this is the right place to ask a user question; tell me if it isn't.

I'm trying to get the best performance out of a simple Eigen example, and I'm not sure I'm getting the most out of it:

#define N 32768

Matrix<float,N,1> u;
Matrix<float,N,1> v;
Matrix<float,N,1> w;

for(int k=0; k < NLOOP; ++k)
   u = v.array() * w.array();

This is compiled with gcc with SSE2 enabled.

Compared to a simple for loop over aligned arrays, I get around a 17% speed-up with Eigen ;)
But is it possible to give some hints at compile time to go further, with unrolling, SSE3/SSE4, or other things?
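For reference, the "simple for loop" baseline is not shown in this message. A minimal sketch of what such a baseline could look like follows; the function name multiply_scalar and its signature are hypothetical, and the caller is assumed to pass arrays of at least n floats.

```cpp
#include <cstddef>

// Hypothetical baseline (not the code that was actually timed):
// element-wise product u[i] = v[i] * w[i] over n floats.
// The caller is assumed to provide arrays of at least n elements,
// e.g. declared with alignas(16) so the compiler can use aligned loads.
void multiply_scalar(const float* v, const float* w, float* u, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        u[i] = v[i] * w[i];
    }
}
```

Whether gcc vectorizes this loop on its own depends on the optimization level and flags in use, so the 17% figure above applies only to the build that was actually measured.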

The asm of the product is:

 # 86 "..\eigen\main.cpp" 1
	#it begins here!
 # 0 "" 2
/NO_APP
	xorl	%eax, %eax
	.p2align 4,,10
L3:
	movaps	(%esi,%eax,4), %xmm0
	mulps	(%ebx,%eax,4), %xmm0
	movaps	%xmm0, (%edx,%eax,4)
	addl	$4, %eax
	cmpl	$32768, %eax
	jne	L3
/APP
 # 88 "..\eigen\main.cpp" 1
	#it ends here!

I wonder if it could be more efficient with more than just one xmm register, or with prefetching?
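To make the "more than one xmm register" idea concrete, here is a minimal sketch of a hand-unrolled variant written with SSE intrinsics. It is not Eigen's implementation; the function name multiply_sse_unrolled is hypothetical, and the code assumes that n is a multiple of 8 and that all three pointers are 16-byte aligned.

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE: _mm_load_ps, _mm_mul_ps, _mm_store_ps

// Hypothetical sketch: element-wise product processed 8 floats per
// iteration, so two independent xmm registers are in flight at once.
// Assumptions: n is a multiple of 8; v, w and u are 16-byte aligned.
void multiply_sse_unrolled(const float* v, const float* w, float* u, std::size_t n) {
    for (std::size_t i = 0; i < n; i += 8) {
        // Two aligned loads from v, each holding 4 packed floats.
        __m128 a0 = _mm_load_ps(v + i);
        __m128 a1 = _mm_load_ps(v + i + 4);
        // Multiply by the matching packed floats from w.
        __m128 p0 = _mm_mul_ps(a0, _mm_load_ps(w + i));
        __m128 p1 = _mm_mul_ps(a1, _mm_load_ps(w + i + 4));
        // Aligned stores of both results.
        _mm_store_ps(u + i, p0);
        _mm_store_ps(u + i + 4, p1);
    }
}
```

Each iteration still performs one multiply per two loads and one store, so the loop may be limited by memory traffic rather than arithmetic. Whether this unrolling (or an added prefetch) helps would have to be measured on the target machine.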

With my best regards, and thanks for this great work,

michel pacilli

Follow-Ups:
- Re: [eigen] optimization question
  - From: Benoit Jacob

References:
- AW: [eigen] New release?
  - From: Schmidt, Michael
