Re: [eigen] Significant perf regression probably due to bug 363 patches
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Significant perf regression probably due to bug 363 patches
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Sat, 5 Nov 2011 23:48:48 +0100
There are some issues with the dimensions of your matrices; I get the
following assertion from this line: GgmLvqBench.cpp:386
Assertion failed: (rows >= 0 && (RowsAtCompileTime == Dynamic ||
RowsAtCompileTime == rows) && cols >= 0 && (ColsAtCompileTime ==
Dynamic || ColsAtCompileTime == cols)), function _init2, file
.../../eigen3.0/Eigen/src/Core/PlainObjectBase.h, line 600
gael
On Sat, Nov 5, 2011 at 5:57 PM, Eamon Nerbonne <eamon@xxxxxxxxxxxx> wrote:
> Sorry, disregard the previous attachment, you need this one...
>
> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>
>
> On Sat, Nov 5, 2011 at 17:51, Eamon Nerbonne <eamon@xxxxxxxxxxxx> wrote:
>>
>> OK, I've taken a small portion of the overall benchmark that shows
>> particularly severe degradation and moved it, with all of its dependencies,
>> into a single source file for easy use. I've noticed this particular bit of
>> code uses Map<> to avoid resizing checks; perhaps that's where the inlining
>> is going wrong?
>>
>>
>> The benchmark is a stochastic gradient descent (trying to find a
>> discriminative linear dimension reduction). You can control the datatype
>> (float/double) and the low-dimensional space's dimensionality via
>> preprocessor directives, i.e. on the compiler command line. It uses mt19937
>> to generate a sample dataset; doing that via Boost is nicest since it
>> produces the same dataset across compilers, but if you prefer, you can use
>> the compiler's TR1/C++0x implementation instead (toggled via a preprocessor
>> directive).
>>
>> Example usage: g++ GgmLvqBench.cpp -std=c++0x -DNDEBUG -DLVQFLOAT=float
>> -DLVQ_LOW_DIM_SPACE=4 -O3 -march=native
>>
>> When run, the best timing of 10 runs is written to standard output, and 4
>> error rates for each of the 10 runs are written to standard error; the
>> error rates are measured at points during training and should generally
>> decrease.
>> You can define EIGEN_DONT_VECTORIZE to disable Eigen's vectorization.
>> You can define NO_BOOST to use the compiler's mt19937 implementation
>> rather than Boost's; however, the generated dataset may differ, and this
>> may (very slightly) affect performance.
>> You can define LVQFLOAT as float or double; computations will use that
>> type (default: double).
>> You can define LVQ_LOW_DIM_SPACE as a fixed number in the range [2..19]
>> (default: 2), which controls the number of dimensions the algorithm works
>> in. Other positive numbers might work too.
>>
>>
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>>
>> On Fri, Nov 4, 2011 at 12:57, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>> wrote:
>>>
>>> Hi Eamon
>>>
>>> This is very concerning; can you share your benchmark so that I can
>>> try myself? I would like to understand why it's slow.
>>>
>>> Thanks,
>>> Benoit
>>>
>>> 2011/11/4 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>>> > On a benchmark I use to check the performance of a simple gradient
>>> > descent algorithm, I've noticed a large performance degradation over
>>> > the last month.
>>> >
>>> > NV benchmark results use eigen without vectorization, V results are
>>> > with
>>> > vectorization. The timings represent best-of-several runs and exhibit
>>> > essentially no variation.
>>> >
>>> > Before changeset 4285 (25ba289d5292) Bug 363 - check for integer
>>> > overflow in
>>> > byte-size computations:
>>> > LvqBenchNV on GCC: 1.22s
>>> > LvqBenchV on GCC: 0.991s
>>> > LvqBenchNV on MSC: 2.39s
>>> > LvqBenchV on MSC: 1.64s
>>> >
>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; Before changeset
>>> > 4285
>>> > (25ba289d5292) Bug 363 - check for integer overflow in byte-size
>>> > computations:
>>> > LvqBenchNV on GCC: 1.21s
>>> > LvqBenchV on GCC: 0.991s
>>> > LvqBenchNV on MSC: 1.75s
>>> > LvqBenchV on MSC: 1.35s
>>> >
>>> > After changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis
>>> > have
>>> > to be normalized.:
>>> > LvqBenchNV on GCC: 1.53s
>>> > LvqBenchV on GCC: 1.41s
>>> > LvqBenchNV on MSC: 2.42s
>>> > LvqBenchV on MSC: 1.74s
>>> >
>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; After changeset
>>> > 4309
>>> > (93b090532ed2) Mention that the axis in AngleAxis have to be
>>> > normalized.:
>>> > LvqBenchNV on GCC: 1.52s
>>> > LvqBenchV on GCC: 1.41s
>>> > LvqBenchNV on MSC: 1.97s
>>> > LvqBenchV on MSC: 1.64s
>>> >
>>> > This represents a 42% slowdown for the fastest gcc (4.6.2 svn) results
>>> > and a
>>> > 21% slowdown for the fastest MSC results.
>>> >
>>> > Since the benchmark mostly consists of simple muls and adds on matrices
>>> > of size Nx1, 2x1, 2x2 and 2xN, each individual operation is just a few
>>> > floating-point operations, so overhead is very relevant. I'm guessing
>>> > this is due to the bug 363 related check-ins; the extra checking is
>>> > possibly hampering inlining, and possibly just represents a significant
>>> > number of operations relative to the otherwise cheap matrix ops. The
>>> > slow version already has the inline for check_rows_cols_for_overflow,
>>> > so that's not it.
>>> >
>>> > Perhaps a flag disabling all such checking would be nice, especially if
>>> > there are any other checks which might be removed...
>>> >
>>> > --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>> >
>>>
>>>
>>
>
>