Re: [eigen] Quaternion and expression template



Hi,

I think I have the beginning of an answer for the bad inlining with VS 2008 (and VS 2010 beta 2). Consider this small class:

#include <iostream>
using namespace std;

class Test{
public:
  double data[2];

  inline Test() {data[0] = 0; data[1] = 0;}
  inline Test(double x, double y) {data[0] = x; data[1] = y;}

  inline Test add42(){
    return Test(data[0]+42, data[1]+42);
  }

  inline ~Test(){}


  void print(){cout << data[0] << " : " << data[1] << endl;}
};


__declspec(noinline) void unWin2()
{
  Test t;
  Test t2 = t.add42();

  __asm{
    nop
    nop
    nop
  }


  t.print();
  t2.print();

  return;
}

The assembly generated for Test t2 = t.add42() is:

004010A3  lea         eax,[esp+10h]
004010A7  lea         ecx,[esp]
004010AA  call        Test::add42 (401080h)

	Test::add42
00401080  fld         qword ptr [ecx]
00401082  fld         qword ptr [__real@4045000000000000 (402138h)]
00401088  fadd        st(1),st
0040108A  fxch        st(1)
0040108C  fstp        qword ptr [eax]
0040108E  fadd        qword ptr [ecx+8]
00401091  fstp        qword ptr [eax+8]
00401094  ret

Using __forceinline (EIGEN_STRONG_INLINE) does not improve the generated assembly. I also tried this both with the compiler-generated copy constructor and copy assignment operator and with my own copy constructor and copy assignment operator; there is no difference.
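For reference, here is a minimal sketch of how such a force-inline trial might be written. The macro name SKETCH_STRONG_INLINE and the helper checkAdd42() are hypothetical and are not Eigen's actual definitions; the sketch only assumes that __forceinline (MSVC) and __attribute__((always_inline)) (GCC/Clang) are available on their respective compilers. Both are strong hints, so the compiler may still decline to inline.

```cpp
#include <cassert>

// Hypothetical macro, for illustration only (not Eigen's EIGEN_STRONG_INLINE).
#if defined(_MSC_VER)
#  define SKETCH_STRONG_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define SKETCH_STRONG_INLINE inline __attribute__((always_inline))
#else
#  define SKETCH_STRONG_INLINE inline
#endif

class Test {
public:
  double data[2];

  inline Test() { data[0] = 0; data[1] = 0; }
  inline Test(double x, double y) { data[0] = x; data[1] = y; }

  // Same body as in the message above, but with the stronger inlining hint.
  SKETCH_STRONG_INLINE Test add42() const {
    return Test(data[0] + 42, data[1] + 42);
  }

  inline ~Test() {}  // the empty, user-provided destructor is kept on purpose
};

// Hypothetical self-check: the macro must not change what add42() computes.
inline void checkAdd42() {
  Test t;
  Test t2 = t.add42();
  assert(t2.data[0] == 42 && t2.data[1] == 42);
}
```

The macro only changes the hint given to the compiler, so whether the call is actually inlined still has to be checked in the generated assembly.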

This code can be inlined correctly in two ways:

   * disabling exception handling (removing /EHsc), or
   * removing the destructor from Test (inline ~Test(){})

which give, in both cases:

00401083  fld         qword ptr [esp]
00401086  fld         st(0)
00401088  fld         qword ptr [__real@4045000000000000 (402138h)]
0040108E  fadd        st(1),st
00401090  fxch        st(1)
00401092  fstp        qword ptr [esp]
00401095  fld         qword ptr [esp+8]
00401099  fld         st(0)
0040109B  faddp       st(2),st
0040109D  fxch        st(1)
0040109F  fstp        qword ptr [esp+8]
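One difference between the two variants of Test can be checked directly: an empty but user-provided destructor makes the class non-trivially destructible, while omitting it leaves the destructor trivial. The sketch below shows this with std::is_trivially_destructible, which needs C++11 and so postdates VS 2008. The names TestWithDtor and TestNoDtor are hypothetical. Whether this property is what drives the inliner's decision under /EHsc is an assumption, not something established here.

```cpp
#include <type_traits>

// Hypothetical name: same layout as Test, with the empty destructor.
struct TestWithDtor {
  double data[2];
  TestWithDtor() { data[0] = 0; data[1] = 0; }
  TestWithDtor(double x, double y) { data[0] = x; data[1] = y; }
  TestWithDtor add42() const { return TestWithDtor(data[0] + 42, data[1] + 42); }
  ~TestWithDtor() {}  // empty, but user-provided
};

// Hypothetical name: identical, except that no destructor is declared.
struct TestNoDtor {
  double data[2];
  TestNoDtor() { data[0] = 0; data[1] = 0; }
  TestNoDtor(double x, double y) { data[0] = x; data[1] = y; }
  TestNoDtor add42() const { return TestNoDtor(data[0] + 42, data[1] + 42); }
};

// A user-provided destructor is never trivial, even with an empty body.
static_assert(!std::is_trivially_destructible<TestWithDtor>::value,
              "user-provided destructor is non-trivial");
// The implicitly declared destructor of a class holding only doubles is trivial.
static_assert(std::is_trivially_destructible<TestNoDtor>::value,
              "implicit destructor is trivial");
```

Both classes have the same size and compute the same result; only the destructor's triviality differs.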

I don't understand the logic behind this behavior. The problem is exactly the same for the Quaternion class: if the destructor

	 inline ~Matrix(){} (line 529)

is removed from Matrix.h, all functions returning a Quaternion by value (such as operator*(), conjugate(), etc.) are correctly inlined.

--
Mathieu



