Re: [eigen] Benchmarking



Do you have experience timing things at that level (nanoseconds, that is)?

If you're timing things at the microsecond level, you'll get interference from cache effects and possibly the scheduler, though you tried to prevent the latter (does real-time priority prevent I/O kernel time too?). It is odd that you consistently see lower performance for several loop iterations, however, since that's not normal cache behavior.

Another factor you might be running into: power-saving CPU speed reduction. If your clock speed is throttled, it may well take a while before the heuristics decide load is high enough to unthrottle - or maybe your CPU is hyperthreaded and initially sharing a core with another expensive task. And of course, depending on the details, you might be running into other weirdness too, such as denormalized floating-point numbers and NaN/Inf values.
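
The denormal and NaN/Inf possibility is cheap to rule out before timing anything. A minimal C++ sketch, assuming the randomly filled coefficients can be copied into a std::vector<double>; FpAudit and audit_values are hypothetical names for illustration, not part of Eigen:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts of values that may push floating-point code onto a slow path.
struct FpAudit {
    std::size_t subnormal = 0;
    std::size_t nan = 0;
    std::size_t inf = 0;
    bool clean() const { return subnormal == 0 && nan == 0 && inf == 0; }
};

// Hypothetical helper: classify every value in a buffer of test data.
FpAudit audit_values(const std::vector<double>& values) {
    FpAudit audit;
    for (double v : values) {
        switch (std::fpclassify(v)) {
            case FP_SUBNORMAL: ++audit.subnormal; break;
            case FP_NAN:       ++audit.nan;       break;
            case FP_INFINITE:  ++audit.inf;       break;
            default:           break;  // normal values and zeros are fine
        }
    }
    // Each value is counted in at most one category.
    assert(audit.subnormal + audit.nan + audit.inf <= values.size());
    return audit;
}
```

If clean() returns false for the inputs of a slow run but true for a fast one, the data rather than the machine is the likely culprit.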

Generally, I make my loops long enough to reach the millisecond range, and then re-run them several times; even then you see some possibly scheduler-related variability.
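
As a rough illustration of that approach, here is a minimal C++ sketch. It assumes std::chrono::steady_clock as the timer and uses a plain 4x4 matrix product as a stand-in for the Eigen transform product; Mat4, multiply, time_batch_ns and run_benchmark are hypothetical names, and the batch size has to be tuned by hand until one batch lands in the millisecond range.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Stand-in workload (assumption): a plain 4x4 matrix product.
struct Mat4 { double m[4][4]; };

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                c.m[i][j] += a.m[i][k] * b.m[k][j];
    return c;
}

// Times `batch` back-to-back calls and returns the mean cost per call in ns,
// so the two clock reads are amortized over the whole batch.
template <typename Work>
double time_batch_ns(Work&& work, std::size_t batch) {
    assert(batch > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < batch; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(batch);
}

struct BenchResult { double min_ns; double median_ns; double max_ns; };

// Runs one untimed warm-up batch, then `repeats` timed batches, and reports
// the spread so run-to-run variability stays visible.
template <typename Work>
BenchResult run_benchmark(Work&& work, std::size_t batch, std::size_t repeats) {
    assert(repeats > 0);
    time_batch_ns(work, batch);  // warm-up: caches, clock speed ramp-up
    std::vector<double> samples;
    samples.reserve(repeats);
    for (std::size_t r = 0; r < repeats; ++r)
        samples.push_back(time_batch_ns(work, batch));
    std::sort(samples.begin(), samples.end());
    return {samples.front(), samples[samples.size() / 2], samples.back()};
}
```

A call might look like `run_benchmark([&] { sink = multiply(a, b).m[0][0]; }, 100000, 10)`, where `sink` is a volatile double so the compiler cannot discard the product. The minimum is usually the most stable figure; a large gap between minimum and maximum points at interference rather than at the code under test.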

--eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163


On Tue, Sep 7, 2010 at 08:15, Daniel Stonier <d.stonier@xxxxxxxxx> wrote:
Hi lads,

I've been trying to benchmark eigen2 and eigen3's geometry modules
recently just to get an idea of the speed we can run various
structures at, but I'm having a hard time getting consistent results
and thought you might be able to lend some advice.

Typically, I do things in the following order on a Linux platform with
rt timers (i.e. clock_gettime(CLOCK_MONOTONIC,...))

###########################################
set the process as a real time priority posix process
select transform type
begin_loop
 - fill transform with random data
 - timestamp
 - do a transform product
 - timestamp again
 - push time diff onto a queue
repeat
do some statistics
###########################################
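
For reference, a rough C++ sketch of the loop above, not the actual benchmark code. It assumes a POSIX system, leaves out the real-time priority step (which needs elevated privileges), and takes the fill and product steps as callables; now_ns and time_each_iteration are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <time.h>
#include <vector>

// Current CLOCK_MONOTONIC reading in nanoseconds.
inline std::int64_t now_ns() {
    timespec ts{};
    const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Timestamps every single operation, as in the procedure described above.
// `fill` prepares fresh input data and `op` is the operation being timed.
template <typename Fill, typename Op>
std::vector<std::int64_t> time_each_iteration(Fill&& fill, Op&& op,
                                              std::size_t iterations) {
    std::vector<std::int64_t> diffs;
    diffs.reserve(iterations);  // avoid reallocation inside the timed loop
    for (std::size_t i = 0; i < iterations; ++i) {
        fill();
        const std::int64_t t0 = now_ns();
        op();
        const std::int64_t t1 = now_ns();
        diffs.push_back(t1 - t0);  // includes the cost of reading the clock
    }
    return diffs;
}
```

Each recorded difference covers one operation plus the clock read itself, so for an operation lasting only a few hundred nanoseconds the timer cost is part of every sample.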

The times I have coming out are extremely inconsistent though:

- if repeating only 100 times, the product might come out with times
of ~840-846ns one run, then sometimes 300-310ns on another run.
- if repeating 10000 times, it will run at ~840ns for a long time,
then jump down and run at 300-310ns for the remainder.
- running other tests in the loop as well (taking separate timestamps
and using multiple queues) can cause the calculation time to be very
different.
 - e.g. this test alone produces results of ~600ns, mingled with
other tests it is usually ~840ns.

Some troubleshooting:

- it is not effects from multi-core as the same problems happen when
using taskset to lock it onto a single core.
- it shouldn't be from the scheduler either because it is an elevated
posix real time process.

I'm baffled. Would really love to know more about how my computer
processes in such a humanly erratic fashion and what's a good way of
testing that.

Cheers,
Daniel Stonier.

--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl




