Re: [eigen] Significant perf regression probably due to bug 363 patches



$ g++ Downloads/GgmLvqBench.cpp -o b -I eigen -O3
Downloads/GgmLvqBench.cpp:30:29: fatal error: Eigen/EigenValues: No
such file or directory
compilation terminated

The header is Eigen/Eigenvalues, not Eigen/EigenValues, in current Eigen.

Did this compile for you?

Benoit

2011/11/5 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
> Sorry, disregard the previous attachment, you need this one...
>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
> On Sat, Nov 5, 2011 at 17:51, Eamon Nerbonne <eamon@xxxxxxxxxxxx> wrote:
>>
>> OK, I've taken a small portion of the overall benchmark that shows
>> particularly severe degradation and moved it and all its dependencies into a
>> single source file for easy use.  I've noticed this particular bit of code
>> uses Map<> to avoid resizing checks; perhaps that's where the inlining is
>> going wrong?
>>
>>
>> The benchmark is a stochastic gradient descent (trying to find a
>> discriminative linear dimension reduction).  You can control the datatype
>> (float/double) and the dimensionality of the low-dimensional space via
>> preprocessor definitions, i.e. on the compiler command line.  It uses mt19937
>> to generate a sample dataset; doing that with boost is nicest since that
>> yields the same dataset across compilers, but if you don't want to, you can
>> use the compiler's TR1/C++0x implementation instead (toggled via a
>> preprocessor definition).
>>
>> Example usage: g++ GgmLvqBench.cpp -std=c++0x  -DNDEBUG  -DLVQFLOAT=float
>> -DLVQ_LOW_DIM_SPACE=4 -O3 -march=native
>>
>> When run, the best timing of 10 runs is written to standard out, and 4
>> error rates for each of the 10 runs are written to standard error; the 4
>> error rates represent accuracies during training and should generally
>> decrease.
>> You can define EIGEN_DONT_VECTORIZE to disable Eigen's vectorization.
>> You can define NO_BOOST to use the mt19937 implementation provided by the
>> compiler and not boost's; however, the generated dataset may differ and this
>> may (very slightly) affect performance.
>> You can define LVQFLOAT as float or double; computations will use that
>> type (default: double).
>> You can define LVQ_LOW_DIM_SPACE as some fixed number in range [2..19]
>> (default: 2), which controls the number of dimensions the algorithm will work
>> in.  Other positive numbers might work too.
>>
>>
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>>
>> On Fri, Nov 4, 2011 at 12:57, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> wrote:
>>>
>>> Hi Eamon
>>>
>>> This is very concerning; can you share your benchmark so that I can
>>> try myself? I would like to understand why it's slow.
>>>
>>> Thanks,
>>> Benoit
>>>
>>> 2011/11/4 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>>> > On a benchmark I use to check the performance of a simple gradient
>>> > descent
>>> > algorithm, I've noticed a large performance degradation over the last
>>> > month.
>>> >
>>> > NV benchmark results use eigen without vectorization, V results are
>>> > with
>>> > vectorization.  The timings represent best-of-several runs and exhibit
>>> > essentially no variation.
>>> >
>>> > Before changeset 4285 (25ba289d5292) Bug 363 - check for integer
>>> > overflow in
>>> > byte-size computations:
>>> > LvqBenchNV on GCC: 1.22s
>>> > LvqBenchV on GCC: 0.991s
>>> > LvqBenchNV on MSC: 2.39s
>>> > LvqBenchV on MSC: 1.64s
>>> >
>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; Before changeset
>>> > 4285
>>> > (25ba289d5292) Bug 363 - check for integer overflow in byte-size
>>> > computations:
>>> > LvqBenchNV on GCC: 1.21s
>>> > LvqBenchV on GCC: 0.991s
>>> > LvqBenchNV on MSC: 1.75s
>>> > LvqBenchV on MSC: 1.35s
>>> >
>>> > After changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis
>>> > have
>>> > to be normalized.:
>>> > LvqBenchNV on GCC: 1.53s
>>> > LvqBenchV on GCC: 1.41s
>>> > LvqBenchNV on MSC: 2.42s
>>> > LvqBenchV on MSC: 1.74s
>>> >
>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; After changeset
>>> > 4309
>>> > (93b090532ed2) Mention that the axis in AngleAxis have to be
>>> > normalized.:
>>> > LvqBenchNV on GCC: 1.52s
>>> > LvqBenchV on GCC: 1.41s
>>> > LvqBenchNV on MSC: 1.97s
>>> > LvqBenchV on MSC: 1.64s
>>> >
>>> > This represents a 42% slowdown for the fastest gcc (4.6.2 svn) results
>>> > and a
>>> > 21% slowdown for the fastest MSC results.
>>> >
>>> > Since the benchmark mostly consists of simple muls and adds on matrices
>>> > of
>>> > size Nx1, 2x1, 2x2 and 2xN, each individual operation is just a few
>>> > floating
>>> > point operations and thus overhead is very relevant.  I'm guessing this
>>> > is
>>> > due to the bug 363 related checkins; the extra checking is possibly
>>> > hampering
>>> > inlining and possibly just represents a significant number of
>>> > operations in
>>> > relation to the otherwise cheap matrix ops.  The slow version has the
>>> > inline
>>> > for check_rows_cols_for_overflow already, so that's not it.
>>> >
>>> > Perhaps a flag disabling all checking would be nice; especially if
>>> > there are
>>> > any other such checks which might be removed...
>>> >
>>> > --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>> >
>>>
>>>
>>
>
>

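For readers following the bug 363 discussion in the quoted message, the kind of check under suspicion can be sketched roughly as below. This is only an illustration of the general technique (guarding a rows * cols size computation against integer overflow), not Eigen's actual code. The function name checked_element_count and the macro SKETCH_NO_OVERFLOW_CHECK are hypothetical; the macro stands in for the sort of opt-out flag Eamon asks for.

```cpp
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical helper (not Eigen's API): returns rows * cols, throwing
// std::bad_alloc if either size is negative or the product would overflow.
inline std::ptrdiff_t checked_element_count(std::ptrdiff_t rows, std::ptrdiff_t cols)
{
#ifndef SKETCH_NO_OVERFLOW_CHECK  // hypothetical opt-out macro, assumed for illustration
    if (rows < 0 || cols < 0)
        throw std::bad_alloc();
    // The overflow test uses a division so that the test itself cannot overflow.
    if (cols != 0 && rows > std::numeric_limits<std::ptrdiff_t>::max() / cols)
        throw std::bad_alloc();
#endif
    return rows * cols;
}
```

For a 2x2 or 2x1 operand the useful work is only a handful of floating-point operations, so a few compares and an integer division per size computation could plausibly be noticeable if the compiler cannot fold them away. That would be consistent with the overhead suspected above, though the thread does not confirm it.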

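The compile-time options described in the quoted message (LVQFLOAT defaulting to double, LVQ_LOW_DIM_SPACE defaulting to 2) could be wired up along the following lines. This is a guess at the usual pattern, not an excerpt from GgmLvqBench.cpp; the names LvqScalar and kLowDim are hypothetical.

```cpp
#include <type_traits>

// Fall back to the documented defaults when nothing is passed via -D.
#ifndef LVQFLOAT
#define LVQFLOAT double
#endif
#ifndef LVQ_LOW_DIM_SPACE
#define LVQ_LOW_DIM_SPACE 2
#endif

// Hypothetical names, introduced only for this sketch.
typedef LVQFLOAT LvqScalar;
const int kLowDim = LVQ_LOW_DIM_SPACE;

// Reject obviously unusable settings at compile time.
static_assert(std::is_floating_point<LvqScalar>::value,
              "LVQFLOAT should be float or double");
static_assert(LVQ_LOW_DIM_SPACE > 0,
              "LVQ_LOW_DIM_SPACE must be positive");
```

With this pattern, passing -DLVQFLOAT=float -DLVQ_LOW_DIM_SPACE=4 as in the example command line overrides both defaults without touching the source.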

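The "best timing of 10 runs" reporting mentioned in the quoted message can be sketched as below. This shows the general best-of-N technique using std::chrono, under the assumption that the workload is a callable taking no arguments; the helper name best_of_seconds is hypothetical and the benchmark's own timing code may differ.

```cpp
#include <chrono>
#include <limits>

// Hypothetical helper: runs `work` `runs` times and returns the shortest
// wall-clock duration in seconds (infinity if runs <= 0).
template <typename F>
double best_of_seconds(int runs, F&& work)
{
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const double elapsed = std::chrono::duration<double>(stop - start).count();
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Taking the minimum rather than the mean discards runs disturbed by other activity on the machine, which fits the remark that the reported timings show essentially no variation.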