Re: [eigen] Speed issues, array min,max


To: eigen@xxxxxxxxxxxxxxxxxxx
Subject: Re: [eigen] Speed issues, array min,max
From: Gabriel <gnuetzi@xxxxxxxxx>
Date: Mon, 5 Oct 2015 18:08:17 +0200

Thanks a lot!
I suspected that something was flawed.

I compared the assembler output, marked with comments, of the following example (hopefully unflawed now: random numbers, everything else the same):


http://pastebin.com/qDvLGBfU

I get close timings, but Eigen3 is still slower, which can also be seen from the assembler output below. Is my example still flawed, or is the Eigen internal dense assignment loop not inlined?



(SLOWER)
#BEGIN1
# 0 "" 2
#NO_APP
    leaq    128(%rsp), %r13
    leaq    32(%rsp), %r12
    leaq    64(%rsp), %rbp
    movl    $100000000, %ebx
    .p2align 4,,10
    .p2align 3
.L367:
    call    rand
    subl    $1073741824, %eax
    cltq
    movq    %rax, 64(%rsp)
    call    rand
    subl    $1073741824, %eax
    cltq
    movq    %rax, 72(%rsp)
    call    rand
    leal    -1073741824(%rax), %edx
    leaq    160(%rsp), %rsi
    leaq    96(%rsp), %rdi
    movq    %r13, 160(%rsp)
    movq    %r12, 168(%rsp)
    movslq    %edx, %rdx
    movq    %rbp, 176(%rsp)
    movq    %rdx, 80(%rsp)
    leaq    31(%rsp), %rdx

    call    _ZN5Eigen8internal26call_dense_assignment_loopINS_5ArrayIxLi3ELi1ELi0ELi3ELi1EEENS_13CwiseBinaryOpINS0_13scalar_max_opIxEEKS3_KNS4_INS0_13scalar_min_opIxEES7_S7_EEEENS0_13add_assign_opIxEEEEvRKT_RKT0_RKT1_

    subl    $1, %ebx
    jne    .L367
#APP

# 123 "/home/zfmgpu/Desktop/Repository/SimulationFramework/SourceCode/Projects/TestBench/Projects/Test/src/main.cpp" 1

    #END1

(FASTER)
    #BEGIN2
# 0 "" 2
#NO_APP
    movl    $100000000, %ebx
    xorl    %r12d, %r12d
    .p2align 4,,10
    .p2align 3
.L368:
    call    rand
    subl    $1073741824, %eax
    cltq
    movq    %rax, 64(%rsp)
    call    rand
    subl    $1073741824, %eax
    cltq
    movq    %rax, 72(%rsp)
    call    rand
    leal    -1073741824(%rax), %edx
    movq    64(%rsp), %rax
    cmpq    %rax, 32(%rsp)
    cmovle    32(%rsp), %rax
    movslq    %edx, %rdx
    movq    %rdx, 80(%rsp)
    testq    %rax, %rax
    cmovs    %r12, %rax
    addq    %rax, 96(%rsp)
    movq    72(%rsp), %rax
    cmpq    %rax, 40(%rsp)
    cmovle    40(%rsp), %rax
    testq    %rax, %rax
    cmovs    %r12, %rax
    addq    %rax, 104(%rsp)
    movq    48(%rsp), %rax
    cmpq    %rax, %rdx
    cmovg    %rax, %rdx
    testq    %rdx, %rdx
    cmovs    %r12, %rdx
    addq    %rdx, 112(%rsp)
    subl    $1, %ebx
    jne    .L368
#APP

# 138 "/home/zfmgpu/Desktop/Repository/SimulationFramework/SourceCode/Projects/TestBench/Projects/Test/src/main.cpp" 1

    #END2
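
For readers without access to the pastebin, the (FASTER) listing can be read back into source form. The following is only a sketch reconstructed from the assembler, not the original code: per component it takes the minimum of the random value and a stored bound (cmpq plus cmovle/cmovg), clamps negative results to zero (testq plus cmovs) and adds the result to an accumulator (addq). The names Vec3 and accumulate_clamped are hypothetical. The mangled symbol in the (SLOWER) listing suggests the Eigen version is the same computation written as one expression: a scalar_max_op over a scalar_min_op, applied with add_assign_op to an Array of three long long values.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Three 64-bit components, matching the three addq instructions in the listing.
using Vec3 = std::array<long long, 3>;

// Hypothetical helper (name not from the original code):
// acc[i] += max(0, min(v[i], bound[i])) for each component.
inline void accumulate_clamped(Vec3& acc, const Vec3& bound, const Vec3& v) {
    for (std::size_t i = 0; i < acc.size(); ++i) {
        const long long lower = std::min(v[i], bound[i]);  // cmpq + cmovle / cmovg
        acc[i] += std::max(0LL, lower);                    // testq + cmovs, then addq
    }
}
```

With the fixed trip count of three, an optimizing compiler will typically unroll this loop fully, which is consistent with the straight-line code between .L368 and the jne.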



On 10/05/2015 04:32 PM, Christoph Hertzberg wrote:

Your example is flawed, since it is trivial enough for the compiler to optimize away (almost) entirely. Add EIGEN_ASM_COMMENT("begin/end..."); lines and have a look at the generated assembler to see what I mean.
If the values of the vectors are not known at compile-time (and not the same for each iteration), you should get essentially the same assembler with Eigen and your hand-coded version -- but with fewer lines of code.
Christoph
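
The two points in this reply (mark the region of interest in the assembler, and feed the kernel values that are unknown at compile time and differ per iteration) can be sketched without Eigen as follows. This is an illustration under stated assumptions, not the code from the thread: ASM_MARKER is a hypothetical stand-in for a macro such as EIGEN_ASM_COMMENT and is only assumed to work the same way, the name clamped_sum_benchmark and the value range are made up, and the kernel is the min/max accumulation discussed in this thread.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <random>

// Hypothetical marker macro: on GCC/Clang for x86 it emits an assembler
// comment that can be searched for in the -S output; elsewhere it does nothing.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#define ASM_MARKER(text) __asm__ __volatile__("# " text)
#else
#define ASM_MARKER(text) ((void)0)
#endif

using Triple = std::array<long long, 3>;

// Runs `iterations` rounds of acc += max(0, min(v, bound)). All inputs are
// drawn at run time, so the compiler cannot fold the loop into a constant.
long long clamped_sum_benchmark(unsigned seed, int iterations) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<long long> dist(-1000, 1000);

    const Triple bound{dist(rng), dist(rng), dist(rng)};
    Triple acc{0, 0, 0};

    ASM_MARKER("BEGIN kernel");
    for (int it = 0; it < iterations; ++it) {
        Triple v;
        for (auto& x : v) x = dist(rng);  // fresh values every iteration
        for (std::size_t i = 0; i < v.size(); ++i)
            acc[i] += std::max(0LL, std::min(v[i], bound[i]));
    }
    ASM_MARKER("END kernel");

    // Returning a value derived from acc keeps the loop from being discarded.
    return acc[0] + acc[1] + acc[2];
}
```

Compiling with something like g++ -O3 -S and searching the output for "BEGIN kernel" should locate the loop. Note that the random number generation sits inside the marked region here, as the rand calls do in the listings above.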

On 05.10.2015 15:21, Gabriel wrote:
Why is this test code so slow with Eigen3?

(see simple main)
http://pastebin.com/11XzzNFs


Output with gcc 4.9.2, full optimization turned on:

Eigen3: time: 0.150045 ms
Seperate: time: 0.000131 ms

So it seems that it is not beneficial to use Eigen for these simple index
calculations, but why?


Thanks for the help! :-)
