Re: [eigen] Help needed to run a benchmark on many machines





2015-02-18 15:28 GMT-05:00 Hauke Heibel <hauke.heibel@xxxxxxxxx>:
Hi,

I have a few questions regarding the tests...

a) Do you need more tests on other machines?

Yes please. I could use lots more. Even if the machines turn out to be close enough that results are similar, that's useful nontrivial information by itself.
 
b) Do the machines/devices really matter, or are only the sizes of the data caches important?

It's definitely not just the cache sizes, or even the cache structure. What exactly matters, I don't know. But I'm seeing lots of weird things in results that show that it's not just the cache structure. One can hazard a guess that the prefetching behavior is very important, especially after I had to tweak our prefetch instructions in Bug 958 to achieve better performance on mobile, and in 'perf annotate' I keep seeing load instructions getting blamed.
 
c) Do we actually need to test user CPU times as opposed to wall times?

Probably not - do you have an opinion on this? On Mac, where clock_gettime is not available, the test defaults to gettimeofday, i.e. wall time, and that works just as well. Should I just use gettimeofday everywhere?
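For reference, the difference between the two kinds of measurement in question c) can be sketched with standard C++11 facilities only. This is a minimal illustration, not the code from the attached benchmark: `Timings`, `time_both` and `busy_sum` are hypothetical names, and `std::chrono::steady_clock` stands in for the gettimeofday/clock_gettime calls mentioned above. One caveat: `std::clock()` is specified to return processor time, but some C runtimes (notably Microsoft's) return elapsed wall time instead, so the two numbers may not be comparable across platforms.

```cpp
#include <chrono>
#include <ctime>

// Both measurements, in seconds, for one timed call.
struct Timings {
  double wall_seconds;  // elapsed real time on a monotonic clock
  double cpu_seconds;   // value derived from std::clock()
};

// Runs `work` once and measures it with both clocks.
template <typename Work>
Timings time_both(Work work) {
  const std::chrono::steady_clock::time_point wall_start =
      std::chrono::steady_clock::now();
  const std::clock_t cpu_start = std::clock();

  work();

  const std::clock_t cpu_end = std::clock();
  const std::chrono::steady_clock::time_point wall_end =
      std::chrono::steady_clock::now();

  Timings t;
  t.wall_seconds =
      std::chrono::duration<double>(wall_end - wall_start).count();
  t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
  return t;
}

// Hypothetical workload: sums i * 0.5 for i in [0, n).
// The volatile accumulator keeps the loop from being optimized away.
double busy_sum(int n) {
  volatile double acc = 0.0;
  for (int i = 0; i < n; ++i) {
    acc = acc + i * 0.5;
  }
  return acc;
}
```

Calling `time_both([] { busy_sum(100000000); })` on an otherwise idle machine should give two similar numbers for a single-threaded, compute-bound workload like this, which is consistent with the observation that wall time "works just as well". The two diverge when the process is descheduled or blocks on I/O.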
 

I am asking about b) because of your suggestion to provide tables per device rather than tables based on the device's cache configuration.

Indeed. I've been pulling my hair out trying to come up with a reasonable heuristic that would find the optimal blocking sizes from just information about the cache structure. Then I started looking at actual measurements such as those recorded by this program, and it's clear that we won't be able to find sane heuristics that work everywhere.
 

Regarding c), boost has a special clock in its chrono implementation (see here: http://goo.gl/ETH4Bk). Maybe there is a profiling pro over here who can shed some light on this?

Please!
Benoit
 

Regards,
Hauke



On Wed Feb 18 2015 at 8:57:15 PM Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
2015-02-18 12:22 GMT-05:00 Hauke Heibel <hauke.heibel@xxxxxxxxx>:
Hi Benoit,


I am running the test right now. I needed to make a few changes (see
attached main.cpp). Classic Windows-specific stuff.

I've looked at the diff between your file and mine, but it's very big because of many whitespace/style changes.
If you can send me a minimal diff, I'll happily apply it.

Benoit
 

I also needed to change std::min<SizeType> to std::min<Index> in the
default branch.
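The message does not say what error prompted this change. One common reason for spelling out the template argument of `std::min` is that its two arguments have different integer types, in which case template argument deduction fails. A minimal sketch under that assumption follows. `SizeType` is a hypothetical stand-in typedef (its real definition is in the benchmark source, not shown here), `Index` mirrors Eigen's default index type `std::ptrdiff_t`, and `clamp_block_size` is an illustrative name only.

```cpp
#include <algorithm>
#include <cstddef>

// Stand-ins for illustration only.
typedef std::ptrdiff_t Index;  // Eigen's default index type
typedef int SizeType;          // hypothetical; not the benchmark's definition

// Returns the smaller of a requested block size and a limit.
inline Index clamp_block_size(Index requested, SizeType limit) {
  // Where Index and SizeType are distinct types, std::min(requested, limit)
  // does not compile because T cannot be deduced consistently.
  // Naming T explicitly converts both arguments to Index first.
  return std::min<Index>(requested, limit);
}
```

With these typedefs, `clamp_block_size(100, 64)` yields 64 and `clamp_block_size(10, 64)` yields 10.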

I will let you know once the test finishes.

Regards,
Hauke

On Wed, Feb 18, 2015 at 5:58 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> Hi List,
>
> I'm looking into tuning Eigen's matrix product blocking parameters on
> various machines. That involves actually measuring the impact of different
> parameter values on performance. The tracking bug for this effort is
> http://eigen.tuxfamily.org/bz/show_bug.cgi?id=937
>
> I'm attaching a benchmark to this email. It tests all sorts of matrix
> product sizes with all sorts of blocking parameters, and outputs both a
> complete log and a table that might be what we need to directly provide
> Eigen with.
>
> It would be very helpful if some people could help me run it on different
> machines. It runs at least on:
>  - GNU Linux
>  - Android
>  - Mac
>
> If you run it, please record its output into a text file and send it back to
> me together with some description of your device (e.g. "nexus 6"). The
> benchmark already records some hardware info such as /proc/cpuinfo on linux.
>
> Please also tell me your Eigen changeset, compiler and command line.
>
> Compilation instructions:
>  - most important: please use today's devel branch of Eigen - must have the
> patch from http://eigen.tuxfamily.org/bz/show_bug.cgi?id=958 .
>  - c++11 mode
>  - on intel, please compile with -mavx if available
>  - on ARM, please compile with -mfpu=neon-vfpv4 if available, otherwise
> -mfpu=neon for older devices. DO NOT pass a -march flag as that triggers
> compiler bugs spilling registers.
>  - on ARM, bonus points if you can also apply the patch from
> http://eigen.tuxfamily.org/bz/show_bug.cgi?id=955 - not necessary but
> better.
>
> Example desktop command line:
> c++ -mavx -DNDEBUG -O3 --std=c++0x benchmark-blocking-sizes.cpp -o b -I
> ../eigen && ./b | tee log-blocking-sizes-mac
>
> Example Android command line:
> $CXX $SRC -o $EXE -save-temps \
>  -O3 -DNDEBUG \
>  --std=c++0x -Wall -Wextra -pedantic \
>  -fPIE -pie -mfpu=neon-vfpv4 -mfloat-abi=softfp \
>  -I $HOME/eigen
>
> Thanks!
> Benoit
>


