Re: [eigen] Benchmarking


2010/9/7 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> 2010/9/7 Eamon Nerbonne <eamon.nerbonne@xxxxxxxxx>:
>> Do you have experience timing things at that level (nanoseconds, that is)?
>> If you're timing things at the microsecond level, you'll get interference
>> from cache effects
> Ah true, the cost of a single RAM access is non-negligible compared to
> 1 microsecond... making it forever irrelevant to benchmark at that
> level! At least as long as RAM is involved.
>> and possibly the scheduler, though you tried to prevent
>> that (does that prevent I/O kernel time too?).  It is odd that you
>> consistently see lower performance for several loop iterations, however,
>> since that's not normal cache behavior.  Another factor you might be running
>> into: Power-saving cpu speed reduction.  If your clock speed is throttled,
>> it may well take a while before the heuristics decide load is high enough to
>> unthrottle - or maybe your CPU is hyperthreaded and sharing a core with
>> another expensive task initially.
> All of that should be taken care of by using
> clock_gettime(CLOCK_PROCESS_CPUTIME_ID).
>>  And of course, depending on the details,
>> you might be running into other weirdness too, such as denormalized
>> floating-point numbers and NaN/Inf values.
> Right --- but that isn't specific to timing on a small scale. Can ruin
> a day-long benchmark, too.
>> Generally, I make my loops long enough to reach the millisecond range, and
>> then re-run them several times; even then you see some possibly
>> scheduler-related variability.
> Yes, being in the millisecond range is needed to get something
> 'statistically significant' wrt RAM accesses.
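
A minimal sketch of the approach discussed above, assuming a POSIX system:
read CPU time with clock_gettime(CLOCK_PROCESS_CPUTIME_ID), grow the batch
until one timed batch lasts about a millisecond, then repeat the batch and
keep the best per-product time. Mat4, multiply() and the helper names are
hypothetical stand-ins for illustration, not Eigen types or anyone's actual
benchmark code.

```cpp
#include <time.h>     // clock_gettime, CLOCK_PROCESS_CPUTIME_ID (POSIX)
#include <algorithm>  // std::min
#include <cstdint>

// CPU time consumed so far by this process, in nanoseconds.
inline std::int64_t cpu_time_ns() {
    timespec ts;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Hypothetical stand-in workload: a plain 4x4 matrix product.
struct Mat4 { double m[4][4]; };

inline Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};  // zero-initialized accumulator
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            for (int j = 0; j < 4; ++j)
                c.m[i][j] += a.m[i][k] * b.m[k][j];
    return c;
}

// Time `reps` products as one batch; returns the elapsed CPU time in ns.
inline std::int64_t time_batch_ns(std::int64_t reps) {
    Mat4 a{}, b{};
    for (int i = 0; i < 4; ++i) { a.m[i][i] = 1.0; b.m[i][i] = 1.0; }
    volatile double sink = 0.0;  // keeps the optimizer from deleting the loop
    const std::int64_t start = cpu_time_ns();
    for (std::int64_t r = 0; r < reps; ++r) {
        a.m[0][1] = static_cast<double>(r & 7);  // vary the input slightly
        sink = sink + multiply(a, b).m[0][1];
    }
    return cpu_time_ns() - start;
}

// Double the batch size until one batch lasts at least `min_batch_ns`
// (1 ms by default), then return the best ns-per-product over `runs` batches.
inline double best_ns_per_product(std::int64_t min_batch_ns = 1000000,
                                  int runs = 5) {
    std::int64_t reps = 1;
    while (time_batch_ns(reps) < min_batch_ns) reps *= 2;
    std::int64_t best = time_batch_ns(reps);
    for (int i = 1; i < runs; ++i) best = std::min(best, time_batch_ns(reps));
    return static_cast<double>(best) / static_cast<double>(reps);
}
```

Calling best_ns_per_product() from main() and printing the result gives one
number per run of the program. Taking the minimum over several batches is a
design choice of this sketch; a median would be a reasonable alternative.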

For the record: yes, running in the millisecond range is needed wrt RAM
accesses, but no, I don't think that 'scheduler variability' is a
potential problem, as that should be completely taken care of by


> My other 'trick' is to just use a good profiler that uses the CPU's
> performance counters. It lets you benchmark any code without having to
> modify it... On recent Linux kernels, use 'perf'.
> Benoit
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>> On Tue, Sep 7, 2010 at 08:15, Daniel Stonier <d.stonier@xxxxxxxxx> wrote:
>>> Hi lads,
>>> I've been trying to benchmark eigen2 and eigen3's geometry modules
>>> recently just to get an idea of the speed we can run various
>>> structures at, but I'm having a hard time getting consistent results
>>> and thought you might be able to lend some advice.
>>> Typically, I do things in the following order on a Linux platform with
>>> rt timers (i.e. clock_gettime(CLOCK_MONOTONIC,...))
>>> ###########################################
>>> set the process as a real time priority posix process
>>> select transform type
>>> begin_loop
>>>  - fill transform with random data
>>>  - timestamp
>>>  - do a transform product
>>>  - timestamp again
>>>  - push time diff onto a queue
>>> repeat
>>> do some statistics
>>> ###########################################
>>> The times I have coming out are extremely inconsistent though:
>>> - if repeating only 100 times, the product might come out with times
>>> of ~840-846ns one run, then sometimes 300-310ns on another run.
>>> - if repeating 10000 times, it will run at ~840ns for a long time,
>>> then jump down and run at 300-310ns for the remainder.
>>> - running other tests in the loop as well (taking separate timestamps
>>> and using multiple queues) can cause the calculation time to be very
>>> different.
>>>  - e.g. this test alone produces results of ~600ns, mingled with
>>> other tests it is usually ~840ns.
>>> Some troubleshooting:
>>> - it is not effects from multi-core as the same problems happen when
>>> using taskset to lock it onto a single core.
>>> - it shouldn't be from the scheduler either because it is an elevated
>>> posix real time process.
>>> I'm baffled. Would really love to know more about how my computer
>>> processes in such a humanly erratic fashion and what's a good way of
>>> testing that.
>>> Cheers,
>>> Daniel Stonier.
>>> --
>>> Phone : +82-10-5400-3296 (010-5400-3296)
>>> Home:
>>> Yujin Robot:
>>> Embedded Control Libraries:
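
The original procedure takes a timestamp before and after every single
product. One way to judge whether that is meaningful at the 300-800 ns
scale is to measure what the timestamps themselves cost. The sketch below,
assuming a POSIX system, takes pairs of back-to-back
clock_gettime(CLOCK_MONOTONIC) readings with nothing in between, discards a
warm-up phase, and returns the sorted differences. All function names are
hypothetical, and this is a diagnostic aid rather than an explanation of
the numbers reported in the thread.

```cpp
#include <time.h>     // clock_gettime, clock_getres, CLOCK_MONOTONIC (POSIX)
#include <algorithm>  // std::sort
#include <cstdint>
#include <vector>

// Current CLOCK_MONOTONIC reading in nanoseconds.
inline std::int64_t monotonic_ns() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Resolution the system reports for CLOCK_MONOTONIC, in nanoseconds.
inline std::int64_t monotonic_resolution_ns() {
    timespec res;
    clock_getres(CLOCK_MONOTONIC, &res);
    return static_cast<std::int64_t>(res.tv_sec) * 1000000000LL + res.tv_nsec;
}

// Take `warmup + samples` pairs of back-to-back timestamps, keep the last
// `samples` differences, and return them sorted in ascending order.
inline std::vector<std::int64_t> empty_interval_samples_ns(int samples,
                                                           int warmup) {
    std::vector<std::int64_t> diffs;
    diffs.reserve(static_cast<std::size_t>(samples));
    for (int i = 0; i < warmup + samples; ++i) {
        const std::int64_t t0 = monotonic_ns();
        const std::int64_t t1 = monotonic_ns();
        if (i >= warmup) diffs.push_back(t1 - t0);  // skip warm-up pairs
    }
    std::sort(diffs.begin(), diffs.end());
    return diffs;
}

// Median of an already sorted, non-empty sample vector (upper middle
// element when the size is even).
inline std::int64_t median_of_sorted(const std::vector<std::int64_t>& sorted) {
    return sorted[sorted.size() / 2];
}
```

If the median empty interval is a noticeable fraction of the time being
measured, per-iteration timestamps mostly report timer overhead, and timing
a whole batch of products between two timestamps is the safer option.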
