[eigen] Performance tuning


I'm writing to ask about people's favorite techniques for tuning the 
performance of their code, and in particular whether there exists a HOWTO (or 
a desire to write one). 

Here's one example: for the kinds of problems I'm studying, I've found that 
one of the most helpful steps is to avoid the creation of temporaries. Because 
temporaries are usually created "silently," I am sometimes surprised by which 
lines are triggering temporaries. Once I discovered that this was an issue, 
here's how I went about solving the problem at first:
1. Compile the program with debugging symbols and optimization
2. Profile my code in valgrind
3. See how many times posix_memalign gets called.
4. If it's too many, comment out a suspicious-looking block of code.
5. Go back to step 1 until the problem is localized, then fix it.
But after growing tired of this process, I finally realized that a better way 
to discover the problematic line(s) is to set a breakpoint in 
Core/Util/Memory.h::aligned_realloc. And, of course, maybe there's even a 
better way I haven't yet stumbled across, like setting some compiler flag to 
throw an exception upon the creation of a temporary?
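
Another way to localize silent temporaries is to count heap allocations 
around a single suspect expression rather than across the whole program. The 
following is only a minimal sketch of that idea and does not use Eigen: Vec, 
count_allocations, allocations_naive_sum and allocations_inplace_sum are 
hypothetical names made up for illustration. It assumes the temporaries 
allocate through the global operator new. Eigen's allocations show up as 
posix_memalign calls in the profile described above, so replacing operator 
new would likely miss them and the breakpoint remains the tool for Eigen 
itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Global counter bumped by every call to the replaced operator new.
static std::size_t g_alloc_count = 0;

// Replace the global allocation function so each heap allocation is counted.
void* operator new(std::size_t size) {
    ++g_alloc_count;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical toy vector type (NOT Eigen). operator+ returns by value,
// so every '+' heap-allocates storage for a fresh result.
struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
};

Vec operator+(const Vec& a, const Vec& b) {
    assert(a.data.size() == b.data.size());
    Vec r(a.data.size());
    for (std::size_t i = 0; i < r.data.size(); ++i)
        r.data[i] = a.data[i] + b.data[i];
    return r;
}

// Runs f and returns how many heap allocations happened while it ran.
template <typename F>
std::size_t count_allocations(F&& f) {
    const std::size_t before = g_alloc_count;
    f();
    return g_alloc_count - before;
}

// Innocent-looking expression: each '+' silently creates a temporary.
std::size_t allocations_naive_sum() {
    Vec a(1000, 1.0), b(1000, 2.0), c(1000, 3.0), out(1000);
    return count_allocations([&] { out = a + b + c; });
}

// Same arithmetic written into preallocated storage: no temporaries.
std::size_t allocations_inplace_sum() {
    Vec a(1000, 1.0), b(1000, 2.0), c(1000, 3.0), out(1000);
    return count_allocations([&] {
        for (std::size_t i = 0; i < out.data.size(); ++i)
            out.data[i] = a.data[i] + b.data[i] + c.data[i];
    });
}
```

Wrapping one statement at a time in count_allocations narrows the search 
without the comment-out-and-reprofile loop: a nonzero count for a line that 
was expected to work in place points directly at the hidden temporary.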

But there are other issues I don't know how to solve. For example, in 
valgrind, I frequently wish that the cost of inlined code could be "folded in" 
to the cost computed for the individual lines of my source code. Does anyone 
know of a way of doing that? Or is there a different tool that can do this?

A natural gathering place for useful tips might be 
but currently that section is fairly short.

