Re: [eigen] Significant perf regression probably due to bug 363 patches



OK, I've taken a small portion of the overall benchmark that shows particularly severe degradation and moved it and all its dependencies into a single source file for easy use.  I've noticed that this particular bit of code uses Map<> to avoid resizing checks; perhaps that's where the inlining is going wrong?


The benchmark is a stochastic gradient descent (trying to find a discriminative linear dimension reduction).  You can control the datatype (float/double) and the dimensionality of the low-dimensional space via preprocessor defines, i.e. on the compiler command line.  It uses mt19937 to generate a sample dataset. Doing that with boost is nicest, since it yields the same dataset across compilers, but if you don't want to depend on boost, you can use the compiler's TR1/C++0x implementation instead (toggled via a preprocessor define).

Example usage: g++ GgmLvqBench.cpp -std=c++0x  -DNDEBUG  -DLVQFLOAT=float -DLVQ_LOW_DIM_SPACE=4 -O3 -march=native

When run, the best timing of 10 runs is written to standard out, and 4 error rates for each of the 10 runs are written to standard error; the 4 error rates are measured during training and should generally decrease.
You can define EIGEN_DONT_VECTORIZE to disable Eigen's vectorization.
You can define NO_BOOST to use the mt19937 implementation provided by the compiler rather than boost's; however, the generated dataset may then differ, and this may (very slightly) affect performance.
You can define LVQFLOAT as float or double; computations will use that type (default: double).
You can define LVQ_LOW_DIM_SPACE as a fixed number in the range [2..19] (default: 2), which controls the number of dimensions the algorithm works in.  Other positive numbers might work too.


--eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163


On Fri, Nov 4, 2011 at 12:57, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
Hi Eamon

This is very concerning; can you share your benchmark so that I can
try myself? I would like to understand why it's slow.

Thanks,
Benoit

2011/11/4 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
> On a benchmark I use to check the performance of a simple gradient descent
> algorithm, I've noticed a large performance degradation over the last month.
>
> NV benchmark results use eigen without vectorization, V results are with
> vectorization.  The timings represent best-of-several runs and exhibit
> essentially no variation.
>
> Before changeset 4285 (25ba289d5292) Bug 363 - check for integer overflow in
> byte-size computations:
> LvqBenchNV on GCC: 1.22s
> LvqBenchV on GCC: 0.991s
> LvqBenchNV on MSC: 2.39s
> LvqBenchV on MSC: 1.64s
>
> Locally patched with some EIGEN_STRONG_INLINE for MSC; Before changeset 4285
> (25ba289d5292) Bug 363 - check for integer overflow in byte-size
> computations:
> LvqBenchNV on GCC: 1.21s
> LvqBenchV on GCC: 0.991s
> LvqBenchNV on MSC: 1.75s
> LvqBenchV on MSC: 1.35s
>
> After changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis have
> to be normalized.:
> LvqBenchNV on GCC: 1.53s
> LvqBenchV on GCC: 1.41s
> LvqBenchNV on MSC: 2.42s
> LvqBenchV on MSC: 1.74s
>
> Locally patched with some EIGEN_STRONG_INLINE for MSC; After changeset 4309
> (93b090532ed2) Mention that the axis in AngleAxis have to be normalized.:
> LvqBenchNV on GCC: 1.52s
> LvqBenchV on GCC: 1.41s
> LvqBenchNV on MSC: 1.97s
> LvqBenchV on MSC: 1.64s
>
> This represents a 42% slowdown for the fastest gcc (4.6.2 svn) results and a
> 21% slowdown for the fastest MSC results.
>
> Since the benchmark mostly consists of simple muls and adds on matrices of
> size Nx1, 2x1, 2x2 and 2xN, each individual operation is just a few floating
> point operations and thus overhead is very relevant.  I'm guessing this is
> due to the bug 363 related checkins; the extra checking is possibly hampering
> inlining and possibly just represents a significant number of operations in
> relation to the otherwise cheap matrix ops.  The slow version has the inline
> for check_rows_cols_for_overflow already, so that's not it.
>
> Perhaps a flag disabling all checking would be nice; especially if there are
> any other such checks which might be removed...
>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>



/*#include "standard.h"
#include "diagUpdateBench.h"
#include "projectionBench.h"
#include "subtractBench.h"
#include "matmulTest.h"
#include "copyVecTest.h"
#include "prodNormTest.h"
#include "resizeTest.h"
#include "covariance.h"

#include <vector>
#include <fstream>
//from http://www.codeproject.com/KB/files/filesize.aspx
int file_size(const char* sFileName)
{
  std::ifstream f;
  f.open(sFileName, std::ios_base::binary | std::ios_base::in);
  if (!f.good() || f.eof() || !f.is_open()) { return 0; }
  f.seekg(0, std::ios_base::beg);
  std::ifstream::pos_type begin_pos = f.tellg();
  f.seekg(0, std::ios_base::end);
  return static_cast<int>(f.tellg() - begin_pos);
}


int main(int , char*argv []){ 
	//std::vector<Vector2d> stlvec;
	cout<<"EigenBench";
#ifdef EIGEN_DONT_VECTORIZE
	cout<< "NV";
#else
	cout<< "V";
#endif
#ifndef NDEBUG
	cout<< "[DEBUG]";
#endif
#ifdef _MSC_VER
	cout << " on MSC";
#else
#ifdef __GNUC__
	cout << " on GCC";
#else
	cout << " on ???";
#endif
#endif
	cout<<": ";
	docovbench();

	cout << file_size(argv[0])/1024 <<"KB\n"; //resizeTest() <<"s; "

	return 0;
}

*/

