Re: [eigen] Eigen AVX support

Re: [eigen] Eigen AVX support - first steps

To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [eigen] Eigen AVX support - first steps
From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
Date: Mon, 15 Apr 2013 17:18:07 +0200

questions.txt:


1. what does loaddup do?

It is needed for compatibility with complexes: it loads PacketSize/2
scalars and copies them into a packet where each scalar is duplicated,
e.g., for Packet8f:

A,B,C,D -> A,A,B,B,C,C,D,D.
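
The duplication pattern above can be written as a plain scalar reference model, which may be handy for checking a vectorized version in a unit test. This is only a sketch: `loaddup_reference` is a hypothetical helper name, and a `std::array` stands in for the real packet type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar reference model of the loaddup behaviour described above.
// NOTE: loaddup_reference is a hypothetical helper for illustration,
// not a function from Eigen; std::array stands in for the packet type.
template <typename Scalar, std::size_t PacketSize>
std::array<Scalar, PacketSize> loaddup_reference(const Scalar* from) {
    static_assert(PacketSize % 2 == 0, "PacketSize must be even");
    assert(from != nullptr);
    std::array<Scalar, PacketSize> packet{};
    // Read PacketSize/2 scalars and write each one twice, side by side.
    for (std::size_t i = 0; i < PacketSize / 2; ++i) {
        packet[2 * i]     = from[i];
        packet[2 * i + 1] = from[i];
    }
    return packet;
}
```

Calling `loaddup_reference<float, 8>(ptr)` reads only the first 4 floats at `ptr`, which matches the Packet8f case.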

2. what is the purpose of palign_impl?

That's the trickiest one. It is only used to optimize matrix-vector
products on unaligned matrices. It takes 2 packets that represent a
contiguous memory array, and returns a packet starting at the position
Offset, e.g., for Packet4d:

Inputs:
{A0,B0,C0,D0} ; {A1,B1,C1,D1}

if Offset==0 => {A0,B0,C0,D0}
if Offset==1 => {B0,C0,D0,A1}
if Offset==2 => {C0,D0,A1,B1}
if Offset==3 => {D0,A1,B1,C1}

For Packet8f... well, I have to think about it, as considering all
possibilities might be overkill. We can easily discard this
optimization for PacketSize>4.
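
The expected result of palign for any packet size can be stated as a scalar reference model: view the two packets as one contiguous array of 2*PacketSize scalars and read PacketSize of them starting at Offset. The sketch below assumes that reading. `palign_reference` is a hypothetical name, `std::array` stands in for the packet type, and the offset is a runtime argument only to keep the sketch short (in Eigen, Offset is a template parameter of palign_impl).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar reference model of the palign behaviour described above.
// NOTE: palign_reference is a hypothetical helper for illustration.
// A vectorized version would use shuffle/permute intrinsics rather
// than a scalar loop.
template <typename Scalar, std::size_t PacketSize>
std::array<Scalar, PacketSize> palign_reference(
    const std::array<Scalar, PacketSize>& first,
    const std::array<Scalar, PacketSize>& second,
    std::size_t offset) {
    assert(offset < PacketSize);
    std::array<Scalar, PacketSize> result{};
    for (std::size_t i = 0; i < PacketSize; ++i) {
        const std::size_t k = i + offset;  // index into the joined array
        result[i] = (k < PacketSize) ? first[k] : second[k - PacketSize];
    }
    return result;
}
```

Since the model is generic in PacketSize, it could also serve as ground truth for Packet8f if that case is ever implemented.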


3. How are we defining the __m256d type?

+ I am assuming this type will be wrapped up in a union with an array
of doubles.
+ Where is that definition going to go? In the PacketMathDouble.h or
in some other
+ file? The name of the array will decide what we use in the pfirst
function. GCC
+ does not appear to have wrapped the __m256d type as a union.

hm... is it our job to define __m256d? Isn't it already defined in the
AVX intrinsics header files?


Cheers,
Gael.


On Mon, Apr 15, 2013 at 5:06 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> Hi Rohit,
>
> thank you for the hard work.
>
>
> On Sun, Apr 14, 2013 at 6:47 AM, Rohit Garg <rpg.314@xxxxxxxxx> wrote:
>> I have pushed some code to the eigen-avx repository at bitbucket.
>>
>> a) All the additions reside in the AVX folder, along with altivec, neon and
>> SSE. I have split up the single and double precision code into two files as
>> one file was getting too big.
>
> I'm not sure decoupling float and double really helps readability
> since in 99% of the cases they should be extremely similar, but ok.
>
>> b) The integer code has been removed as AVX does not have int support. Once
>> real numbers are done, we can move on to complex number support.
>
> indeed, integer support is for AVX2, which should be available in upcoming CPUs.
>
>> c) I had a few questions about some of the intrinsic functions, I have
>> written them in the questions.txt file in the AVX folder.
>
> I'll answer them in a second email.
>
>> d) So far, I have just migrated the intrinsic functions from SSE over to
>> AVX. All my changes are so far limited to the AVX folder in the arch folder.
>> I have not run any tests and this code is not hooked up to the rest of the
>> eigen code base as yet. The reduction functions have been tested separately,
>> so they should be fine.
>
> Look at the Eigen/Core header file. Before testing for SSE, if __AVX__
> is defined then we should define an EIGEN_VECTORIZE_AVX token that will
> be used later to include your files instead of the ones in SSE. Then,
> in CMakeLists.txt, you can add an option to enable AVX in unit tests,
> and start with the packet_math unit tests.
>
> I guess we will also have to move the alignment requirement to the
> packet_traits instead of the somewhat hardcoded 16 bytes. For initial
> testing though, you can make sure that pload and pstore also work on
> 16-byte-aligned data.
>
>> e) I have made no attempt for micro-optimization so far. Once this works we
>> can move to optimization.
>
> sure!
>
> gael
>
>> f) Code review welcome. :)
>>
>> Cheers,
>>
>> --
>> Rohit Garg
>>
>> http://rpg-314.blogspot.com/
>>
>> Graduate Student
>> Applied and Engineering Physics
>> Cornell University
