On Mon, Nov 30, 2009 at 4:04 PM, Mathieu Gautier
<mathieu.gautier@xxxxxx> wrote:
Hi,
I think I have the beginning of an answer for the bad inlining with VS 2008 (and VS 2010 beta 2). I have a small class:
class Test{
public:
    double data[2];
    inline Test() {data[0] = 0; data[1] = 0;}
    inline Test(double x, double y) {data[0] = x; data[1] = y;}
    inline Test add42(){
        return Test(data[0]+42, data[1]+42);
    }
    inline ~Test(){}
    void print(){cout << data[0] << " : " << data[1] << endl;}
};

__declspec(noinline) void unWin2()
{
    Test t;
    Test t2 = t.add42();
    __asm{
        nop
        nop
        nop
    }
    t.print();
    t2.print();
    return;
}
The assembly generated for Test t2 = t.add42() is:
004010A3 lea eax,[esp+10h]
004010A7 lea ecx,[esp]
004010AA call Test::add42 (401080h)
Test::add42
00401080 fld qword ptr [ecx]
00401082 fld qword ptr [__real@4045000000000000 (402138h)]
00401088 fadd st(1),st
0040108A fxch st(1)
0040108C fstp qword ptr [eax]
0040108E fadd qword ptr [ecx+8]
00401091 fstp qword ptr [eax+8]
00401094 ret
Using __forceinline (EIGEN_STRONG_INLINE) does not improve the generated assembly. I also tried this both with the default (compiler-generated) copy constructor and copy assignment operator and with my own copy constructor and copy assignment operator; there is no difference.
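For reference, here is a minimal sketch of what the __forceinline variant of the class looks like. DEMO_STRONG_INLINE and sumAfterAdd42 are hypothetical names introduced only for this illustration; the macro imitates the idea behind EIGEN_STRONG_INLINE but is not Eigen's actual definition, and the non-MSVC branches are assumptions added so the snippet also builds elsewhere.

```cpp
#include <cassert>  // only needed for the checks mentioned in the usage note

// Hypothetical macro for this sketch (not Eigen's definition):
// request forced inlining where the compiler offers a way to do so.
#if defined(_MSC_VER)
#  define DEMO_STRONG_INLINE __forceinline
#elif defined(__GNUC__)
#  define DEMO_STRONG_INLINE inline __attribute__((always_inline))
#else
#  define DEMO_STRONG_INLINE inline
#endif

class Test {
public:
    double data[2];
    DEMO_STRONG_INLINE Test() { data[0] = 0; data[1] = 0; }
    DEMO_STRONG_INLINE Test(double x, double y) { data[0] = x; data[1] = y; }
    DEMO_STRONG_INLINE Test add42() {
        return Test(data[0] + 42, data[1] + 42);
    }
    // The empty user-provided destructor is still present in this variant.
    DEMO_STRONG_INLINE ~Test() {}
};

// Hypothetical helper: sum of both components after one add42() call.
double sumAfterAdd42(double x, double y) {
    Test t(x, y);
    Test t2 = t.add42();
    return t2.data[0] + t2.data[1];
}
```

For example, sumAfterAdd42(1.0, 2.0) returns 87. The forced-inline request only changes how the compiler is asked to treat the calls, not the results.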
The add42() call is inlined correctly in either of two ways:
* disabling exception handling (removing /EHsc), or
* removing the destructor from Test (inline ~Test(){}),
which gives, in both cases:
00401083 fld qword ptr [esp]
00401086 fld st(0)
00401088 fld qword ptr [__real@4045000000000000 (402138h)]
0040108E fadd st(1),st
00401090 fxch st(1)
00401092 fstp qword ptr [esp]
00401095 fld qword ptr [esp+8]
00401099 fld st(0)
0040109B faddp st(2),st
0040109D fxch st(1)
0040109F fstp qword ptr [esp+8]
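As a sketch of the second workaround, the two variants of the class can be put side by side. The names TestWithDtor, TestNoDtor and sumNoDtor are hypothetical, and the snippet uses C++11 <type_traits>, which is newer than VS 2008. It only shows one language-level difference between the variants: a user-provided destructor, even an empty one, makes the class non-trivially destructible. Whether this is what prevents inlining under /EHsc is an assumption, not something established here.

```cpp
#include <cassert>      // only needed for the checks mentioned in the usage note
#include <type_traits>

// Variant as in the original class: empty user-provided destructor.
class TestWithDtor {
public:
    double data[2];
    TestWithDtor(double x, double y) { data[0] = x; data[1] = y; }
    TestWithDtor add42() { return TestWithDtor(data[0] + 42, data[1] + 42); }
    ~TestWithDtor() {}
};

// Workaround variant: no destructor declared, the compiler generates it.
class TestNoDtor {
public:
    double data[2];
    TestNoDtor(double x, double y) { data[0] = x; data[1] = y; }
    TestNoDtor add42() { return TestNoDtor(data[0] + 42, data[1] + 42); }
};

// Compile-time view of the difference between the two variants.
static_assert(!std::is_trivially_destructible<TestWithDtor>::value,
              "a user-provided destructor is never trivial");
static_assert(std::is_trivially_destructible<TestNoDtor>::value,
              "the implicitly generated destructor is trivial here");

// Hypothetical helper: same computation as before, using the workaround class.
double sumNoDtor(double x, double y) {
    TestNoDtor t(x, y);
    TestNoDtor t2 = t.add42();
    return t2.data[0] + t2.data[1];
}
```

Both variants compute the same values (sumNoDtor(1.0, 2.0) returns 87); only the destructor differs.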
I don't understand the logic behind this behavior. The problem is exactly the same for the Quaternion class: if the destructor
inline ~Matrix(){} (line 529)
is removed from Matrix.h, all functions returning a Quaternion by value are correctly inlined (such as operator*(), conjugate(), etc.).
--
Mathieu