[eigen] need help with assembly

Hi,

Since today, a Product is by default evaluated immediately when it is nested in a
larger expression. We found that this seems to improve performance. In other
words, an expression like

	m + m*m

is now equivalent to

	m + (m*m).eval()

That was the theory; in practice, the second form still gives 10% better 
performance than the first one. I have tried the following two benchmarks 
(see benchmark.cpp):

1)

Matrix3d m;
....
for(int a = 0; a < REPEAT; a++)
{
  m = Matrix3d::ones() + 0.00005 * (m + m*m);
}

Assembly output: see attached file "b.s"


2)

Matrix3d m;
....
for(int a = 0; a < REPEAT; a++)
{
  m = Matrix3d::ones() + 0.00005 * (m + (m*m).eval());
}

Assembly output: see attached file "b-eval.s"


I don't see much difference between the two files! In b-eval.s the three "movl"
instructions are grouped together, while in b.s they are scattered over the
whole loop. Can that alone explain the 10% speed difference?

Cheers,

Benoit

Attachment: b-eval.s
Description: Binary data

Attachment: b.s
Description: Binary data
