Re: [eigen] Indexes: why signed instead of unsigned?
On 5/17/10, leon zadorin <leonleon77@xxxxxxxxx> wrote:
> On 5/17/10, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
>> ok so I dumped the assembly for my loop in the mul case.
>>
>>
>> For type int:
>>
>> .L76:
>> #APP
>> # 28 "a.cpp" 1
>>         #begin
>> # 0 "" 2
>> #NO_APP
>>         leal    (%rdx,%rax), %eax
>>         addl    $1, %ebx
>>         imull   %edx, %eax
>>         imull   %eax, %ebx
>> #APP
>> # 33 "a.cpp" 1
>>         #end
>> # 0 "" 2
>> #NO_APP
>>         addl    $1, %edx
>>         cmpl    $500000000, %edx
>>         jne     .L76
>>
>>
>> For type std::ptrdiff_t:
>>
>> .L91:
>> #APP
>> # 28 "a.cpp" 1
>>         #begin
>> # 0 "" 2
>> #NO_APP
>>         leaq    (%rdx,%rax), %rax
>>         addq    $1, %rbx
>>         imulq   %rdx, %rax
>>         imulq   %rax, %rbx
>> #APP
>> # 33 "a.cpp" 1
>>         #end
>> # 0 "" 2
>> #NO_APP
>>         addq    $1, %rdx
>>         cmpq    $500000000, %rdx
>>         jne     .L91
>>
>>
>> I don't see anything here biasing the results, and the two versions
>> are similar, but i'm no assembly language expert, perhaps you see
>> something i don't. The two versions run in the same amount of time
>> here, so I'm tempted to believe that imulq is as fast as imull.
>>
>> Could you dump your assembly to see if some optimization happened on your
>> side?
>
> Sure thing. In fact -- I should learn to get some sleep before
> posting, I think my int64_t part of code was insane, or at the very
> least inappropriate for the test anyway -- I shall readjust the code
> and dump the assembly as soon as I get a chance. Sorry for the noise
> in the meantime.

OK, I've just had a spare 20 minutes at my work desk, so I thought I might as well spit out some numbers for you... I just hope I haven't rushed this one (let me know if you see something obviously stupid).

Attached are 2 files. The first is a.cpp (modified with CLOCK_PROF et al), as it was compiled for the logs. The second is a.log, which contains the timings and the assembly dump (generated by 'c++ -S -OX a.cpp') for 2 compilation targets: one with -O0 and another with -O3.

I haven't had the time to actually peruse the assembly for -O0 and -O3 (sorry).
I can say from previous tests that -O2 produces similar results w.r.t. 'mul' being many times slower for 64-bit ints. However, irrespective of what the compiler does in terms of its optimizations (such optimizations may not be disabled, because of the benefits they may bring in other places of one's code/project), I feel that there may also not be enough "anti-static/anti-compile-time" mechanisms in my test code... I shall try to look into it a bit more later on, if I find some time.

Kind regards
Leon.
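
For what it's worth, here is a minimal sketch of what such an "anti-compile-time" mechanism could look like. It is not the attached a.cpp: the names mul_loop and time_mul_loop are hypothetical, the loop body is only loosely modelled on the assembly quoted above, and std::chrono::steady_clock stands in for CLOCK_PROF. The inputs are read through volatile objects so the compiler cannot fold the loop at compile time, and the arithmetic is done in the unsigned counterpart of the index type so that wrap-around stays well defined.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: a multiply-heavy loop parameterised on the signed
// index type T (e.g. int or std::ptrdiff_t).
template <typename T>
T mul_loop(T seed, T iterations) {
    static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
                  "T is expected to be a signed integer type");
    assert(iterations >= 0);

    // Same width as T, but overflow wraps instead of being undefined.
    using U = typename std::make_unsigned<T>::type;

    // Reading the inputs through volatile objects hides their values from
    // the optimizer, so the result cannot be computed at compile time.
    volatile T hidden_seed = seed;
    volatile T hidden_count = iterations;

    U a = static_cast<U>(hidden_seed);
    U b = 1;
    const T n = hidden_count;
    for (T i = 0; i < n; ++i) {
        const U k = static_cast<U>(i) + 1;  // k runs over 1..n
        a = (a + k) * k;                    // one add, one multiply
        b = (b + 1) * a;                    // second, dependent multiply
    }
    // b depends on every iteration, so returning it keeps the loop alive.
    return static_cast<T>(b);
}

// Hypothetical helper: wall-clock seconds spent in mul_loop<T>.
template <typename T>
double time_mul_loop(T seed, T iterations) {
    const auto start = std::chrono::steady_clock::now();
    volatile T result = mul_loop<T>(seed, iterations);  // store = observable use
    (void)result;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_mul_loop<int>(1, n) and time_mul_loop<std::ptrdiff_t>(1, n) with the same n would then give two timings to compare. It is still worth checking the generated assembly (c++ -S), since the volatile reads only hide the inputs and do not stop other transformations of the loop.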