Re: [eigen] Significant perf regression probably due to bug 363 patches
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Significant perf regression probably due to bug 363 patches
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Sat, 5 Nov 2011 23:45:49 -0400
Once I fix this, using this command line for debugging,
$ g++ Downloads/GgmLvqBench.cpp -std=c++0x -DLVQFLOAT=float
-DLVQ_LOW_DIM_SPACE=4 -g3 -march=native -I eigen -o b -lrt
I get the same assert failure as Gael:
constructing with 4 dimensions of floats(V): 0.013696s
Initial err: 0.2625, cost: -0.266823b:
eigen/Eigen/src/Core/PlainObjectBase.h:600: void
Eigen::PlainObjectBase<Derived>::_init2(Eigen::PlainObjectBase<Derived>::Index,
Eigen::PlainObjectBase<Derived>::Index, typename
Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
SizeAtCompileTime != 2), T0>::type*) [with T0 = float, T1 = float,
Derived = Eigen::Matrix<float, 4, 1>,
Eigen::PlainObjectBase<Derived>::Index = long int, typename
Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base::
SizeAtCompileTime != 2), T0>::type = float]: Assertion `rows >= 0 &&
(RowsAtCompileTime == Dynamic || RowsAtCompileTime == rows) && cols >=
0 && (ColsAtCompileTime == Dynamic || ColsAtCompileTime == cols)'
failed.
Program received signal SIGABRT, Aborted.
0x00007ffff70e1405 in *__GI_raise (sig=<optimized out>)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0 0x00007ffff70e1405 in *__GI_raise (sig=<optimized out>)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007ffff70e4680 in *__GI_abort () at abort.c:92
#2 0x00007ffff70da5b1 in *__GI___assert_fail (
assertion=0x470260 "rows >= 0 && (RowsAtCompileTime == Dynamic ||
RowsAtCompileTime == rows) && cols >= 0 && (ColsAtCompileTime ==
Dynamic || ColsAtCompileTime == cols)",
file=<optimized out>, line=600,
function=0x491620 "void
Eigen::PlainObjectBase<Derived>::_init2(Eigen::PlainObjectBase<Derived>::Index,
Eigen::PlainObjectBase<Derived>::Index, typename
Eigen::internal::enable_if<(Eigen::PlainObjectBase<Derived>::Base:"...)
at assert.c:81
#3 0x000000000041097f in Eigen::PlainObjectBase<Eigen::Matrix<float,
4, 1, 0, 4, 1> >::_init2<float, float> (this=0x7fffffffa1d0, rows=-6,
cols=-1)
at eigen/Eigen/src/Core/PlainObjectBase.h:599
#4 0x000000000040b68c in Eigen::Matrix<float, 4, 1, 0, 4,
1>::Matrix<float, float> (
this=0x7fffffffa1d0, x=@0x7fffffffa144, y=@0x7fffffffa140)
at eigen/Eigen/src/Core/Matrix.h:252
#5 0x0000000000407151 in GgmLvqModel::ClassBoundaryDiagram
(this=0x7fffffffa540,
x0=-6.54985905, x1=6.89831448, y0=-1.69166601, y1=2.17467284,
classDiagram=...)
at Downloads/GgmLvqBench.cpp:386
#6 0x00000000004023b8 in PrintModelStatus (label=0x46ff8f "Initial",
model=..., points=...,
labels=...) at Downloads/GgmLvqBench.cpp:551
#7 0x00000000004028f6 in TestModel (shuffleRand=..., points=...,
labels=..., protosPerClass=2,
iters=10) at Downloads/GgmLvqBench.cpp:592
#8 0x0000000000403071 in EasyLvqTest () at Downloads/GgmLvqBench.cpp:680
#9 0x0000000000403233 in main (argv=0x7fffffffe2e8) at
Downloads/GgmLvqBench.cpp:695
Something really evil is going on here. In frame 5, you're trying to
construct a Vector_L, which here means Vector4f, from 2 floats. Since
this is not a 2D vector type (like Vector2f) but a 4D vector type,
Eigen correctly decides NOT to interpret the 2 float parameters as x
and y coordinates of your vector, and instead decides (in frame 4) to
cast these 2 float params to int and interpret them as (rows, cols)
dimensions for the general matrix constructor.
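
The dispatch described above can be illustrated with a small stand-alone
model. This is only a sketch: FixedVec and its members are hypothetical
names, it does not use Eigen, and it selects the overload by tag dispatch
rather than the enable_if shown in the backtrace. The point is that for a
size other than 2 the two arguments are converted to integers and treated
as (rows, cols).

```cpp
#include <type_traits>

// Hypothetical stand-in for a fixed-size column vector. NOT Eigen's code.
template <int N>
struct FixedVec {
    float coeffs[N] = {};           // stored coefficients
    long rows_requested = N;        // sizes seen by the (rows, cols) path
    long cols_requested = 1;
    bool treated_as_sizes = false;  // which interpretation was chosen

    // Two-argument constructor, templated on both argument types.
    template <typename T0, typename T1>
    FixedVec(const T0& x, const T1& y) {
        init2(x, y, std::integral_constant<bool, N == 2>());
    }

    // The condition the failed assertion checks, for a fixed N x 1 type.
    bool sizes_ok() const {
        return rows_requested == N && cols_requested == 1;
    }

private:
    // N == 2: the arguments are the x and y coefficients.
    template <typename T0, typename T1>
    void init2(const T0& x, const T1& y, std::true_type) {
        coeffs[0] = static_cast<float>(x);
        coeffs[1] = static_cast<float>(y);
    }

    // N != 2: the arguments are converted to integers and read as sizes.
    template <typename T0, typename T1>
    void init2(const T0& x, const T1& y, std::false_type) {
        rows_requested = static_cast<long>(x);  // truncates toward zero
        cols_requested = static_cast<long>(y);
        treated_as_sizes = true;
    }
};
```

With this model, FixedVec<2>(1.5f, 2.5f) stores two coefficients, whereas
FixedVec<4>(-6.54985905f, -1.69166601f) records rows_requested == -6 and
cols_requested == -1, so sizes_ok() is false. Those two inputs are the x0
and y0 values from frame 5; whether they are the exact arguments passed in
frame 4 is an assumption, but the truncated values match the rows=-6,
cols=-1 shown in frame 3.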
So this is really a bug in your code; did this ever work? Meanwhile,
we should try to generate an error at compile time in this case. This
should be easy since x and y are of templated type so we get to know
(in frame 3) that they're floats which doesn't make sense for (rows,
cols) dimensions.
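
One possible shape for such a compile-time check, again as a stand-alone
sketch rather than a patch to Eigen: CheckedVec and are_valid_size_args
are hypothetical names, and the real fix in PlainObjectBase.h may look
quite different. The idea is simply to static_assert, in the (rows, cols)
path, that both templated argument types are integral.

```cpp
#include <type_traits>

// Hypothetical trait: both arguments must be integer types to be sizes.
template <typename T0, typename T1>
struct are_valid_size_args
    : std::integral_constant<bool, std::is_integral<T0>::value &&
                                   std::is_integral<T1>::value> {};

// Hypothetical stand-in for a fixed-size column vector. NOT Eigen's code.
template <int N>
struct CheckedVec {
    float coeffs[N] = {};

    template <typename T0, typename T1>
    CheckedVec(const T0& x, const T1& y) {
        init2(x, y, std::integral_constant<bool, N == 2>());
    }

private:
    // N == 2: the arguments are the x and y coefficients.
    template <typename T0, typename T1>
    void init2(const T0& x, const T1& y, std::true_type) {
        coeffs[0] = static_cast<float>(x);
        coeffs[1] = static_cast<float>(y);
    }

    // N != 2: the arguments are sizes, so reject non-integer types
    // when the constructor is instantiated instead of at run time.
    template <typename T0, typename T1>
    void init2(const T0&, const T1&, std::false_type) {
        static_assert(are_valid_size_args<T0, T1>::value,
                      "two-argument constructor of a non-2D vector "
                      "expects integer (rows, cols), not coefficients");
    }
};
```

Here CheckedVec<2>(1.0f, 2.0f) and CheckedVec<4>(4, 1) compile, while
CheckedVec<4>(1.0f, 2.0f) is rejected by the static_assert, which is the
case that hit the run-time assertion above.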
Benoit
2011/11/5 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> $ g++ Downloads/GgmLvqBench.cpp -o b -I eigen -O3
> Downloads/GgmLvqBench.cpp:30:29: fatal error: Eigen/EigenValues: No
> such file or directory
> compilation terminated
>
> Eigenvalues, not EigenValues in current Eigen.
>
> Did this compile for you?
>
> Benoit
>
> 2011/11/5 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>> Sorry, disregard the previous attachment, you need this one...
>>
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>>
>> On Sat, Nov 5, 2011 at 17:51, Eamon Nerbonne <eamon@xxxxxxxxxxxx> wrote:
>>>
>>> OK, I've taken a small portion of the overall benchmark that shows
>>> particularly severe degradation and moved it and all its dependencies into a
>>> single source file for easy use. I've noticed this particular bit of code
>>> uses Map<> to avoid resizing checks; perhaps that's where the inlining is
>>> going wrong?
>>>
>>>
>>> The benchmark is a stochastic gradient descent (trying to find a
>>> discriminative linear dimension reduction). You can control the datatype
>>> (float/double) and the low-dimension dimensionality by preprocessor
>>> directive i.e. on the compiler command line. It uses mt19937 to generate a
>>> sample dataset, and doing that using boost is nicest since that results in
>>> the same dataset across compilers, but if you don't want to, you can use the
>>> TR1/C++0x implementation by the compiler instead (toggles via preprocessor
>>> directive).
>>>
>>> Example usage: g++ GgmLvqBench.cpp -std=c++0x -DNDEBUG -DLVQFLOAT=float
>>> -DLVQ_LOW_DIM_SPACE=4 -O3 -march=native
>>>
>>> When run, the best timing of 10 runs is written to standard out, and 4
>>> error rates for each of the 10 runs are written to standard error; the 4
>>> error rates represent accuracies during training and should generally
>>> decrease.
>>> You can define EIGEN_DONT_VECTORIZE to disable Eigen's vectorization.
>>> You can define NO_BOOST to use the mt19937 implementation provided by the
>>> compiler and not boost's; however, the generated dataset may differ and this
>>> may (very slightly) affect performance.
>>> You can define LVQFLOAT as float or double; computations will use that
>>> type (default: double).
>>> You can define LVQ_LOW_DIM_SPACE as some fixed number in range [2..19]
>>> (default: 2), which controls the number of dimensions the algorithm will work
>>> in. Other positive numbers might work too.
>>>
>>>
>>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>>
>>>
>>> On Fri, Nov 4, 2011 at 12:57, Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
>>> wrote:
>>>>
>>>> Hi Eamon
>>>>
>>>> This is very concerning; can you share your benchmark so that I can
>>>> try myself? I would like to understand why it's slow.
>>>>
>>>> Thanks,
>>>> Benoit
>>>>
>>>> 2011/11/4 Eamon Nerbonne <eamon@xxxxxxxxxxxx>:
>>>> > On a benchmark I use to check the performance of a simple gradient
>>>> > descent
>>>> > algorithm, I've noticed a large performance degradation over the last
>>>> > month.
>>>> >
>>>> > NV benchmark results use eigen without vectorization, V results are
>>>> > with
>>>> > vectorization. The timings represent best-of-several runs and exhibit
>>>> > essentially no variation.
>>>> >
>>>> > Before changeset 4285 (25ba289d5292) Bug 363 - check for integer
>>>> > overflow in
>>>> > byte-size computations:
>>>> > LvqBenchNV on GCC: 1.22s
>>>> > LvqBenchV on GCC: 0.991s
>>>> > LvqBenchNV on MSC: 2.39s
>>>> > LvqBenchV on MSC: 1.64s
>>>> >
>>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; Before changeset
>>>> > 4285
>>>> > (25ba289d5292) Bug 363 - check for integer overflow in byte-size
>>>> > computations:
>>>> > LvqBenchNV on GCC: 1.21s
>>>> > LvqBenchV on GCC: 0.991s
>>>> > LvqBenchNV on MSC: 1.75s
>>>> > LvqBenchV on MSC: 1.35s
>>>> >
>>>> > After changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis
>>>> > have
>>>> > to be normalized.:
>>>> > LvqBenchNV on GCC: 1.53s
>>>> > LvqBenchV on GCC: 1.41s
>>>> > LvqBenchNV on MSC: 2.42s
>>>> > LvqBenchV on MSC: 1.74s
>>>> >
>>>> > Locally patched with some EIGEN_STRONG_INLINE for MSC; After changeset
>>>> > 4309
>>>> > (93b090532ed2) Mention that the axis in AngleAxis have to be
>>>> > normalized.:
>>>> > LvqBenchNV on GCC: 1.52s
>>>> > LvqBenchV on GCC: 1.41s
>>>> > LvqBenchNV on MSC: 1.97s
>>>> > LvqBenchV on MSC: 1.64s
>>>> >
>>>> > This represents a 42% slowdown for the fastest gcc (4.6.2 svn) results
>>>> > and a
>>>> > 21% slowdown for the fastest MSC results.
>>>> >
>>>> > Since the benchmark mostly consists of simple muls and adds on matrices
>>>> > of
>>>> > size Nx1, 2x1, 2x2 and 2xN, each individual operation is just a few
>>>> > floating
>>>> > point operations and thus overhead is very relevant. I'm guessing this
>>>> > is
>>>> > due to the bug 363 related checkins; the extra checking is possibly
>>>> > hampering
>>>> > inlining and possibly just represents a significant number of
>>>> > operations in
>>>> > relation to the otherwise cheap matrix ops. The slow version has the
>>>> > inline
>>>> > for check_rows_cols_for_overflow already, so that's not it.
>>>> >
>>>> > Perhaps a flag disabling all checking would be nice; especially if
>>>> > there are
>>>> > any other such checks which might be removed...
>>>> >
>>>> > --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>>> >
>>>>
>>>>
>>>
>>
>>
>