Re: [eigen] Matrix product crashes when compiled with MSVC 2010 in release |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Matrix product crashes when compiled with MSVC 2010 in release
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 13 Aug 2010 13:59:35 -0400
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=cvs6Y7ZiU74p0d1qPdUqVU2rQRRuUzqEkMovBvqIAFE=; b=e7or8pzefGGW/wkh/JeRXSk9M0jJdQTV1pknpEG6hJgAefNRKSD7oT19oOnds+8sVP Q4Aq30hxlTFhz3WJRiaY0t1PQV2ZsCtSV2K5BzxLTro7f9r50rUtTCXQdt8Nk4gaNY1A SNpeVPRsQJlGQYnduIFje7LiMG0kNhhNBjB3Q=
- Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=rrCS0bZP1HoUsLQYjoJfhvcO/gDzWwio5740WyrRB3ZxQATtaIpOdOdSeBmbTYKiir onEp3Oqfdoi9YOlQ4vj4ZVIB5wuuGYmrbtjArGa4tpfvN5wC07alpT5EV6bmUXjb5O4A j/+msTxcLiUh9swrJTYTqP3fdcLLpwqq3Thbk=
2010/8/13 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> 2010/8/13 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>> Ok, once again. Since Ilya mentioned that CMake does create projects
>> which have defaults set to "no whole program optimization" I needed to
>> start digging again because it implies, that our unit tests should
>> have captured this!
>
> So, are you saying that they are not capturing this in this setup?
> In that case, please add Ilya's test case to our tests.
>
> I think I see where the problem is, and it might be on our side.
>
> First of all, let's repeat that this code is really meant to use only
> registers, not memory, so having an access violation there is really
> strange.
>
> Gael's code there is meant to adapt to the number of registers
> available: these are these if(nr==4) where nr is a compile-time
> constant set to
>
> nr = NumberOfRegisters/4,
>
> in the same file GeneralBlockPanelKernel.h.
>
> Now we can see in the assemby that you pasted,
>
> if(nr==4) traits.acc(C3, alphav, R3);
> 00123465 movapd xmm6,xmm0
> 00123469 mulpd xmm6,xmmword ptr [C3]
> 0012346E addpd xmm6,xmm7
> 00123472 add dword ptr [ebp-3Ch],20h
> 00123476 add eax,dword ptr [k]
>
> that the condition nr==4 evaluated to true.
>
> This means that NumberOfRegisters==16.
>
> This is wrong on 32bit x86 !!!!! This constant should be set to 8. The
> value 16 is correct for x86-64.
>
> So:
> 1) can you confirm that NumberOfRegisters==16 in your setup that
> reproduces the crash?
> 2) let's find out why this NumberOfRegisters is miscomputed.
>
> If this is confirmed to be the cause of the crash, then we can be very
> thankful that this crash occurred as having a wrong NumberOfRegisters
> means inefficient code (accessing memory in inner loop).
Fixed.
The faulty code was:
#ifndef EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS
#if (defined __i386__)
#define EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS 8
#else
#define EIGEN_ARCH_DEFAULT_NUMBER_OF_REGISTERS 16
#endif
apparently MSVC 2010 (and perhaps other versions of MSVC) doesn't
define __i386__ so this was miscomputed as 16.
Fixed by using sizeof(void*) as a much more universal way of telling
whether we have a 32bit or 64bit arch.
It would be useful now to scan Eigen's sources for other places where
we'd be misusing __i386__. I think I've known for a long time that it
wasn't universally supported, but this example shows how tricky this
can be!
Benoit
>
> Cheers,
> Benoit
>
>
>
>
> Benoit
>
>>
>> So, we see a regression and that is happening since
>>
>> changeset: 3260:431547872cf7
>> user: Gael Guennebaud <g.gael@xxxxxxx>
>> date: Mon Jul 12 16:31:46 2010 +0200
>> summary: matrix product: move the alpha factor to gebp instead of
>> the packing
>>
>> Since here, we are hitting a compiler bug and considering that
>> everybody using CMake will run into it, it is definitely a show
>> stopper for 3.0.
>>
>> I have still no idea for workarounds and just did not myself hit this
>> bug earlier because I am working with 64bit builds only.
>>
>> - Hauke
>>
>>
>>
>