On a benchmark I use to check the performance of a simple gradient descent algorithm, I've noticed a large performance degradation over the last month.

NV benchmark results use Eigen without vectorization; V results are with vectorization. The timings are best-of-several runs and show essentially no variation between runs.

Before changeset 4285 (25ba289d5292) Bug 363 - check for integer overflow in byte-size computations:

LvqBenchNV on GCC: 1.22s

LvqBenchV on GCC: 0.991s

LvqBenchNV on MSC: 2.39s

LvqBenchV on MSC: 1.64s

Locally patched with some EIGEN_STRONG_INLINE for MSC, before changeset 4285 (25ba289d5292) Bug 363 - check for integer overflow in byte-size computations:

LvqBenchNV on GCC: 1.21s

LvqBenchV on GCC: 0.991s

LvqBenchNV on MSC: 1.75s

LvqBenchV on MSC: 1.35s

After changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis have to be normalized.:

LvqBenchNV on GCC: 1.53s

LvqBenchV on GCC: 1.41s

LvqBenchNV on MSC: 2.42s

LvqBenchV on MSC: 1.74s

Locally patched with some EIGEN_STRONG_INLINE for MSC, after changeset 4309 (93b090532ed2) Mention that the axis in AngleAxis have to be normalized.:

LvqBenchNV on GCC: 1.52s

LvqBenchV on GCC: 1.41s

LvqBenchNV on MSC: 1.97s

LvqBenchV on MSC: 1.64s

This represents a 42% slowdown for the fastest GCC (4.6.2 svn) results (LvqBenchV, 0.991s to 1.41s) and a 21% slowdown for the fastest MSC results (patched LvqBenchV, 1.35s to 1.64s).
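For reference, the percentages above follow from the ratio of the "after" timing to the "before" timing. A minimal sketch of that arithmetic, where the function name `slowdown_percent` is just an illustrative helper and not part of the benchmark:

```cpp
// Relative slowdown in percent: how much longer the "after" run takes
// compared with the "before" run. Illustrative helper, not benchmark code.
double slowdown_percent(double before_seconds, double after_seconds) {
    return (after_seconds / before_seconds - 1.0) * 100.0;
}
```

Called with the fastest GCC timings (0.991 and 1.41) this gives roughly 42, and with the fastest MSC timings (1.35 and 1.64) roughly 21.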

Since the benchmark mostly consists of simple muls and adds on matrices of size Nx1, 2x1, 2x2 and 2xN, each individual operation is just a few floating point operations, so any per-operation overhead is very relevant. I'm guessing the slowdown is due to the bug 363 related checkins: the extra checking is possibly hampering inlining, or it may simply represent a significant number of operations relative to the otherwise cheap matrix ops. The slow version already has the inline for check_rows_cols_for_overflow, so that's not it.

Perhaps a flag disabling all checking would be nice; especially if there are any other such checks which might be removed...
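To illustrate the kind of flag I mean, here is a rough sketch. It is not Eigen's actual code: the macro `SKETCH_NO_OVERFLOW_CHECK` and the functions `sketch_check_rows_cols` and `sketch_element_count` are hypothetical names made up for this example. The idea is only that the check compiles to nothing when the macro is defined, so tiny fixed-cost operations don't pay for a division and a branch.

```cpp
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical opt-out macro; the name is illustrative and is NOT an Eigen flag.
// Building with -DSKETCH_NO_OVERFLOW_CHECK compiles the check away entirely.
#ifdef SKETCH_NO_OVERFLOW_CHECK
inline void sketch_check_rows_cols(std::size_t, std::size_t) {}
#else
// Stand-in for an overflow check on rows * cols: one division, one compare
// and one branch, which is comparable in cost to a whole 2x1 vector add.
inline void sketch_check_rows_cols(std::size_t rows, std::size_t cols) {
    const std::size_t max_count = std::numeric_limits<std::size_t>::max();
    if (rows != 0 && cols != 0 && rows > max_count / cols) {
        throw std::bad_alloc();
    }
}
#endif

// Toy resize path: validates the dimensions, then returns the element count.
inline std::size_t sketch_element_count(std::size_t rows, std::size_t cols) {
    sketch_check_rows_cols(rows, cols);
    return rows * cols;
}
```

With the macro undefined, `sketch_element_count(2, 3)` returns 6 and an overflowing request throws `std::bad_alloc`; with it defined, the function reduces to a plain multiply.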

