Re: [eigen] Matrix product crashes when compiled with MSVC 2010 in release



2010/8/13 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
> I already tried all the simple things, such as using the SSE
> intrinsics directly and introducing temporaries. It is just a nasty
> optimizer bug in MSVC.

ok. Then go the asm way? Note that SSE/PacketMath.h still contains
commented-out asm for gcc, for ei_pmadd.

Benoit

>
> - Hauke
>
> On Fri, Aug 13, 2010 at 4:12 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> 2010/8/13 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
>>> 2010/8/13 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>>>> On Fri, Aug 13, 2010 at 4:00 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>>>>> 2010/8/13 Hauke Heibel <hauke.heibel@xxxxxxxxxxxxxx>:
>>>>> Unfortunately, the patch creates a wrong result:
>>>>>
>>>>> +#if defined(EIGEN_VECTORIZE) && defined(_MSC_VER) && !defined(_WIN64)
>>>>> +    const AccPacket tmp1 = ei_pmul(alpha,c);
>>>>> +    const AccPacket tmp2 = ei_pmul(alpha,r);
>>>>> +    r = ei_padd(tmp1,tmp2);
>>>>> +#else
>>>>>     r = ei_pmadd(c,alpha,r);
>>>>> +#endif
>>>>>
>>>>> Here, the current code r=pmadd(c,alpha,r) does (in pseudocode):
>>>>>
>>>>>    r = alpha*c+r.
>>>>>
>>>>> But your variant does instead:
>>>>>
>>>>>    r = alpha*c + alpha*r
>>>>
>>>> Oops, too hasty.
>>>
>>> Here's a fun counter-proposal: if the issue is really that MSVC
>>> mis-compiles ei_pmadd, how about implementing it with asm volatile?
>>
>> Ah, something to try first: in this acc() method, try replacing this
>> ei_pmadd(x,y,z) by
>>
>>    ei_padd(ei_pmul(x,y), z)
>>
>> i.e. just don't use our ei_pmadd function. My suspicion is that
>> ei_pmadd might be mis-compiled because it is defined (in
>> GenericPacketMath.h) before ei_padd and ei_pmul receive their
>> specializations (under arch/SSE/), i.e. that would again be just a
>> problem with template support in MSVC.
>>
>> Benoit
>>
>>>
>>> Benoit
>>>
>>
>>
>>
>
>
>
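
For readers following the quoted exchange, here is a minimal sketch of why
the rejected patch gives a wrong result and why the suggested workaround
does not. It is portable C++ rather than real SSE code: Packet, pmul, padd,
pmadd, rejected_patch and suggested_workaround are hypothetical stand-ins
chosen for illustration, not Eigen's ei_* functions or packet types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 4-float SSE packet; not Eigen's actual type.
using Packet = std::array<float, 4>;

// Lane-wise product, playing the role of ei_pmul.
Packet pmul(const Packet& a, const Packet& b) {
    Packet out{};
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = a[i] * b[i];
    return out;
}

// Lane-wise sum, playing the role of ei_padd.
Packet padd(const Packet& a, const Packet& b) {
    Packet out{};
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// Multiply-add with the semantics described in the thread: a*b + c.
Packet pmadd(const Packet& a, const Packet& b, const Packet& c) {
    return padd(pmul(a, b), c);
}

// The rejected patch: it also scales the accumulator,
// so it computes alpha*c + alpha*r.
Packet rejected_patch(const Packet& alpha, const Packet& c, const Packet& r) {
    const Packet tmp1 = pmul(alpha, c);
    const Packet tmp2 = pmul(alpha, r);
    return padd(tmp1, tmp2);
}

// The suggested workaround: alpha*c + r, spelled out without calling pmadd.
Packet suggested_workaround(const Packet& alpha, const Packet& c, const Packet& r) {
    return padd(pmul(c, alpha), r);
}
```

With alpha = 2 in every lane, the two variants differ by exactly one extra
copy of r per lane, which is the discrepancy pointed out in the thread.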

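The suspicion about definition order can be sketched as follows. A generic
pmadd template is defined before the packet specializations of pmul and padd
are declared. A conforming compiler resolves the calls when pmadd is
instantiated, so the specializations are used. Whether MSVC 2010 deviates
from this in the real Eigen code is the open question in the thread, and
this sketch does not settle it. All names here (Packet4, specialized_calls,
run_pmadd_demo) are hypothetical.

```cpp
#include <cassert>

// Hypothetical 4-float packet type; not Eigen's actual SSE packet.
struct Packet4 { float v[4]; };

// Counts how many calls reach the Packet4 specializations below.
int specialized_calls = 0;

// Generic primary templates; these work for scalars via operators.
template <typename T> T pmul(const T& a, const T& b) { return a * b; }
template <typename T> T padd(const T& a, const T& b) { return a + b; }

// Generic pmadd, defined BEFORE any packet specialization is declared.
template <typename T> T pmadd(const T& a, const T& b, const T& c) {
    return padd(pmul(a, b), c);
}

// Specializations declared later, as an arch-specific header would do.
template <> Packet4 pmul<Packet4>(const Packet4& a, const Packet4& b) {
    ++specialized_calls;
    Packet4 out;
    for (int i = 0; i < 4; ++i) out.v[i] = a.v[i] * b.v[i];
    return out;
}

template <> Packet4 padd<Packet4>(const Packet4& a, const Packet4& b) {
    ++specialized_calls;
    Packet4 out;
    for (int i = 0; i < 4; ++i) out.v[i] = a.v[i] + b.v[i];
    return out;
}

// pmadd<Packet4> is first instantiated here, after the specializations.
Packet4 run_pmadd_demo() {
    const Packet4 c = {{1.0f, 2.0f, 3.0f, 4.0f}};
    const Packet4 alpha = {{2.0f, 2.0f, 2.0f, 2.0f}};
    const Packet4 r = {{10.0f, 20.0f, 30.0f, 40.0f}};
    return pmadd(c, alpha, r);
}
```

The counter makes it easy to check that both specializations were reached
through the generic pmadd rather than the primary templates, which would
not even compile for Packet4 since it has no operator* or operator+.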

