Re: [eigen] Matrix product crashes when compiled with MSVC 2010 in release |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Matrix product crashes when compiled with MSVC 2010 in release
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Sat, 14 Aug 2010 10:47:36 +0200
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:received:in-reply-to :references:from:date:message-id:subject:to:content-type :content-transfer-encoding; bh=W18td5gjxMtOD1fsFX8saZWRKT0w+SyCeTxgtg86ddg=; b=SYThVQZF50Y2JlTC0VKvi15esMlPbBwtvh3NuQpyZY6fJ8X9GIEy3CsskmYcOQdPoh nt7Hd/vcJHsSDnPv7A3XazxpfXOVPkxf4ZyLni0Vn+5rIV6ao/Gi6YW64o08Yo+7WAYq 6ctCQZMuSXkvd9Uxib980An0fKona6ZjUXqXc=
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:content-transfer-encoding; b=QXdoYYjGV2lHXozBEvtf1gEp5BePmVgW7E716rDgPmNXCSpMiZUgEgh91kCxWhPbft cJ0EjRH+0Cl5HXMioSDijcsN5hv5p+e2FpMK2KTVrVkt2FH2hJAeDnkV//Cj5I0S6Xm6 /tBUfHSx1OeqNn6Ocv+OvgDo8XBmLmrMJTcX8=
On Fri, Aug 13, 2010 at 7:59 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/8/13 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>> 2010/8/13 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>>> Ok, once again. Since Ilya mentioned that CMake does create projects
>>> which have defaults set to "no whole program optimization" I needed to
>>> start digging again because it implies, that our unit tests should
>>> have captured this!
>>
>> So, are you saying that they are not capturing this in this setup?
>> In that case, please add Ilya's test case to our tests.
>>
>> I think I see where the problem is, and it might be on our side.
>>
>> First of all, let's repeat that this code is really meant to use only
>> registers, not memory, so having an access violation there is really
>> strange.
>>
>> Gael's code there is meant to adapt to the number of registers
>> available: these are these if(nr==4) where nr is a compile-time
>> constant set to
>>
>> nr = NumberOfRegisters/4,
>>
>> in the same file GeneralBlockPanelKernel.h.
>>
>> Now we can see in the assemby that you pasted,
>>
>> if(nr==4) traits.acc(C3, alphav, R3);
>> 00123465 movapd xmm6,xmm0
>> 00123469 mulpd xmm6,xmmword ptr [C3]
>> 0012346E addpd xmm6,xmm7
>> 00123472 add dword ptr [ebp-3Ch],20h
>> 00123476 add eax,dword ptr [k]
>>
>> that the condition nr==4 evaluated to true.
>>
>> This means that NumberOfRegisters==16.
>>
>> This is wrong on 32bit x86 !!!!! This constant should be set to 8. The
>> value 16 is correct for x86-64.
>>
>> So:
>> 1) can you confirm that NumberOfRegisters==16 in your setup that
>> reproduces the crash?
>> 2) let's find out why this NumberOfRegisters is miscomputed.
>>
>> If this is confirmed to be the cause of the crash, then we can be very
>> thankful that this crash occurred as having a wrong NumberOfRegisters
>> means inefficient code (accessing memory in inner loop).
>
> Fixed.
>
> The faulty code was:
>
> #ifndef EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS
> #if (defined __i386__)
> #define EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS 8
> #else
> #define EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS 16
> #endif
>
> apparently MSVC 2010 (and perhaps other versions of MSVC) doesn't
> define __i386__ so this was miscomputed as 16.
>
> Fixed by using sizeof(void*) as a much more universal way of telling
> whether we have a 32bit or 64bit arch.
>
> It would be useful now to scan Eigen's sources for other places where
> we'd be misusing __i386__. I think I've known for a long time that it
> wasn't universally supported, but this example shows how tricky this
> can be!
Great job Benoit!
it is still unclear why it crashed....
gael
> Benoit
>
>>
>> Cheers,
>> Benoit
>>
>>
>>
>>
>> Benoit
>>
>>>
>>> So, we see a regression and that is happening since
>>>
>>> changeset: 3260:431547872cf7
>>> user: Gael Guennebaud <g.gael@xxxxxxx>
>>> date: Mon Jul 12 16:31:46 2010 +0200
>>> summary: matrix product: move the alpha factor to gebp instead of
>>> the packing
>>>
>>> Since here, we are hitting a compiler bug and considering that
>>> everybody using CMake will run into it, it is definitely a show
>>> stopper for 3.0.
>>>
>>> I have still no idea for workarounds and just did not myself hit this
>>> bug earlier because I am working with 64bit builds only.
>>>
>>> - Hauke
>>>
>>>
>>>
>>
>
>
>