Re: [eigen] Quaternion and expression template
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Quaternion and expression template
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Mon, 30 Nov 2009 10:38:25 -0500
2009/11/30 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
> Great that you found that. Looking at this W4 description
>
> http://msdn.microsoft.com/en-us/library/a98sb923%28VS.80%29.aspx
>
> you will see that exception handling in particular cases can be a cause for
> not inlining. Actually, according to what is written on the linked page, there
> are no heuristics involved in ignoring inlining in the presence of
> __forceinline, i.e. we should be able to identify and probably even fix
> those cases. Unfortunately the compiler is not helping us and the logic (the
> non-heuristic part) is not really explained on msdn.com...
>
> Maybe, we could learn from your tests that we should avoid implementing
> destructors ourselves when the compiler can generate them.
I'm a bit afraid of inferring such rules from experiments because, for
example, I remember adding a default ctor to SVD in the 2.0 branch
because, for some user, MSVC failed to generate a default constructor
automatically. Just saying that it's scary because compilers can have
very erratic behavior.
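
For what it's worth, one thing that does not depend on compiler heuristics is what the language itself says about the two variants of Mathieu's test class. A minimal sketch, assuming a C++11 compiler (for <type_traits>) and the hypothetical struct names WithDtor and WithoutDtor: an empty user-written destructor already makes the class non-trivially destructible, while omitting it leaves the destructor trivial. Whether this is what actually drives MSVC's inlining decision is not something the sketch shows.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the two variants of the quoted Test class.
struct WithDtor {
    double data[2];
    WithDtor() { data[0] = 0; data[1] = 0; }
    ~WithDtor() {}  // user-provided, even though the body is empty
};

struct WithoutDtor {
    double data[2];
    WithoutDtor() { data[0] = 0; data[1] = 0; }
    // no destructor declared: the implicit one is trivial
};

// Compile-time facts guaranteed by the language, not by any optimizer.
constexpr bool with_dtor_is_trivial =
    std::is_trivially_destructible<WithDtor>::value;
constexpr bool without_dtor_is_trivial =
    std::is_trivially_destructible<WithoutDtor>::value;

static_assert(!with_dtor_is_trivial,
              "an empty user-provided destructor is still non-trivial");
static_assert(without_dtor_is_trivial,
              "the implicitly generated destructor is trivial");
```
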
How about instead saying that all of this is a good reason to actually do
xpr templates for Quaternion? After Hauke and Gael merge their
branches, the small performance argument should be gone. Also, Gael's
idea for reusable Xpr classes should apply there too, allowing us to do
that without writing any new xpr class.
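
To make the idea concrete, here is a rough sketch of what an expression template for the conjugate could look like. Everything in it (Quat, ConjExpr, conj, assign, example) is a hypothetical name made up for illustration; it is not Eigen's Quaternion API, and it is a hand-written node rather than the reusable Xpr classes mentioned above. The point is only that conj() returns a tiny proxy holding a reference, so no quaternion is returned by value, and the coefficients are computed when the proxy is assigned to a destination.

```cpp
#include <cassert>

// Hypothetical minimal quaternion, stored as (w, x, y, z).
struct Quat {
    double c[4];
    double w() const { return c[0]; }
    double x() const { return c[1]; }
    double y() const { return c[2]; }
    double z() const { return c[3]; }
};

// Expression node standing for conj(q). It only holds a reference and has
// no user-written destructor, so nothing needs to be unwound for it.
struct ConjExpr {
    const Quat& q;
    explicit ConjExpr(const Quat& q_) : q(q_) {}
    double w() const { return  q.w(); }
    double x() const { return -q.x(); }
    double y() const { return -q.y(); }
    double z() const { return -q.z(); }
};

// Builds the expression; no coefficient is computed here.
inline ConjExpr conj(const Quat& q) { return ConjExpr(q); }

// Evaluates any expression exposing w(), x(), y(), z() into dst.
// All coefficients are read before writing, so assign(q, conj(q)) is safe.
template <typename Expr>
inline void assign(Quat& dst, const Expr& e) {
    const double w = e.w(), x = e.x(), y = e.y(), z = e.z();
    dst.c[0] = w; dst.c[1] = x; dst.c[2] = y; dst.c[3] = z;
}

// Usage: evaluate conj(a) straight into r, without a Quat temporary.
inline void example() {
    Quat a = {{1.0, 2.0, 3.0, 4.0}};
    Quat r = {{0.0, 0.0, 0.0, 0.0}};
    assign(r, conj(a));
    assert(r.w() == 1.0 && r.x() == -2.0);
}
```

The proxy must not outlive the quaternion it refers to, which is the usual caveat with this kind of design.
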
Benoit
>
> - Hauke
>
> On Mon, Nov 30, 2009 at 4:04 PM, Mathieu Gautier <mathieu.gautier@xxxxxx>
> wrote:
>>
>> Hi,
>>
>> I think I have the beginning of an answer for the bad inlining with VS 2008
>> (and VS 2010 beta2). I have a little class:
>>
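>> #include <iostream>
>> using namespace std;
>>
>> class Test{
>> public:
>>     double data[2];
>>
>>     inline Test() {data[0] = 0; data[1] = 0;}
>>     inline Test(double x, double y) {data[0] = x; data[1] = y;}
>>
>>     // returns a Test by value; this is the call that fails to inline
>>     inline Test add42(){
>>         return Test(data[0]+42, data[1]+42);
>>     }
>>
>>     // empty user-written destructor; removing it restores inlining
>>     inline ~Test(){}
>>
>>     void print(){cout << data[0] << " : " << data[1] << endl;}
>> };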
>>
>>
>> __declspec(noinline) void unWin2()
>> {
>>     Test t;
>>     Test t2 = t.add42();
>>
>>     __asm{
>>         nop
>>         nop
>>         nop
>>     }
>>
>>     t.print();
>>     t2.print();
>>
>>     return;
>> }
>>
>> The assembly generated for Test t2 = t.add42() is:
>>
>> 004010A3 lea eax,[esp+10h]
>> 004010A7 lea ecx,[esp]
>> 004010AA call Test::add42 (401080h)
>>
>> Test::add42
>> 00401080 fld qword ptr [ecx]
>> 00401082 fld qword ptr [__real@4045000000000000 (402138h)]
>> 00401088 fadd st(1),st
>> 0040108A fxch st(1)
>> 0040108C fstp qword ptr [eax]
>> 0040108E fadd qword ptr [ecx+8]
>> 00401091 fstp qword ptr [eax+8]
>> 00401094 ret
>>
>> Using __forceinline (EIGEN_STRONG_INLINE) does not improve the generated
>> assembly. I have also done this trial with the default constructor and copy
>> assignment and with my own copy constructor and copy assignment operator;
>> there are no differences.
>>
>> This code can be inlined correctly in two ways:
>>
>> * disabling exception handling (removing /EHsc)
>> * or removing the destructor in Test (inline ~Test(){};)
>>
>> which give, in both cases:
>>
>> 00401083 fld qword ptr [esp]
>> 00401086 fld st(0)
>> 00401088 fld qword ptr [__real@4045000000000000 (402138h)]
>> 0040108E fadd st(1),st
>> 00401090 fxch st(1)
>> 00401092 fstp qword ptr [esp]
>> 00401095 fld qword ptr [esp+8]
>> 00401099 fld st(0)
>> 0040109B faddp st(2),st
>> 0040109D fxch st(1)
>> 0040109F fstp qword ptr [esp+8]
>>
>> I don't understand the logic behind this behavior. The problem is exactly
>> the same for the Quaternion class: if the destructor
>>
>> inline ~Matrix(){} (line 529)
>>
>> is removed from Matrix.h, all functions returning a Quaternion by value are
>> correctly inlined (such as operator*(), conjugate(), etc.)
>>
>> --
>> Mathieu
>>
>>
>>
>
>