Re: [eigen] Status of AVX support
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Status of AVX support
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Wed, 7 Dec 2011 18:33:17 +0100
Note that AVX instructions can also operate on half of a register.
So actually, the AVX backend will be completely separate from, and
mutually exclusive with, the SSE one. The main real issue is the
required 32-byte alignment.
An idea would be to replace (extend) the (Auto)Aligned keywords with
(Auto)Aligned16 and (Auto)Aligned32 keywords that could be used with
Map and Matrix. The default for Matrix would still be Aligned16, for ABI
compatibility and to limit memory overhead.
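
To make the ABI and memory-overhead points concrete, here is a minimal
standalone sketch in plain C++11. It is not Eigen code: Aligned16 and
Aligned32 are only named after the keywords proposed above, and
Vec4dStorage, Holder and is_aligned are hypothetical helpers made up for
illustration. It assumes 8-byte doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical alignment options, named after the proposed keywords.
// Illustration only; this is not Eigen's API.
constexpr std::size_t Aligned16 = 16;
constexpr std::size_t Aligned32 = 32;

// Storage for 4 doubles (32 bytes) with the alignment as a template
// parameter. The default stays at 16 bytes, as suggested for Matrix.
template <std::size_t Align = Aligned16>
struct alignas(Align) Vec4dStorage {
    double data[4];
};

// A user type embedding the vector. Its layout depends on the vector's
// alignment, which is why changing the default would break the ABI.
template <std::size_t Align>
struct Holder {
    char tag;
    Vec4dStorage<Align> value;
};

// Returns true if p is aligned to `bytes` (which must be a power of two).
inline bool is_aligned(const void* p, std::size_t bytes) {
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) % bytes) == 0;
}

// Same payload size, different alignment requirement.
static_assert(sizeof(Vec4dStorage<Aligned16>) == sizeof(Vec4dStorage<Aligned32>),
              "payload size does not depend on the alignment option");
static_assert(alignof(Vec4dStorage<Aligned16>) == 16, "16-byte aligned");
static_assert(alignof(Vec4dStorage<Aligned32>) == 32, "32-byte aligned");

// The enclosing struct grows because of the extra padding after `tag`.
static_assert(sizeof(Holder<Aligned16>) < sizeof(Holder<Aligned32>),
              "stricter alignment costs padding in enclosing types");
```

With this sketch, a struct that already embeds the 16-byte-aligned
variant changes size and member offsets as soon as the 32-byte variant
is substituted, so 32-byte alignment would have to be opt-in.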
gael
On Wed, Dec 7, 2011 at 5:31 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2011/12/7 Rhys Ulerich <rhys.ulerich@xxxxxxxxx>:
>>> W.r.t porting to AVX: Be aware that there might be some pitfalls with
>>> AVX-performance:
>>> http://www.agner.org/optimize/blog/read.php?i=142
>>
>> Interesting tidbit from that link "If the programmer inadvertently
>> mixes AVX and non-AVX vector instructions in the same code then there
>> is a penalty of 70 clock cycles for each transition between the two
>> forms."
>
> Between this, and the fact that we can't 32-byte-align Vector4d
> without breaking the ABI, I'm starting to wonder if maybe we should
> treat AVX as a dynamic-size-only thing and completely give up on AVX
> for fixed-size objects? For dynamic-size objects, the situation is
> much simpler, we can increase the alignment without breaking the ABI
> and we can assume that objects are large so that AVX is always better
> than SSE.
>
> In any case, I think we should start by doing AVX for dynamic-size
> objects only, it will be time to think about fixed-size later.
>
> Benoit
>
>>
>> Thank you for the pointer to the blog,
>> Rhys
>>
>>
>
>