[eigen] benchmarking weirdness
Hi List,

A lot of progress has happened since alpha1, more in fact than I thought was left to do. I'll write more about this later, but now I would like to discuss benchmarking.

We now have two benchmarks in doc/ : benchmark.cpp is our traditional benchmark on 3x3 fixed-size matrices, and benchmarkX.cpp is a 20x20 dynamic-size variant.

There is also a script, benchmark_suite, which runs these benchmarks several times with various compile options:
* with and without -DNDEBUG (disabling asserts)
* with matrix storage order set to RowMajor and ColumnMajor

I should stress that the matrix storage order influences not only the storage of coefficients, but also the traversal order when e.g. copying matrices. Expressions are recursively aware of the preferred traversal order.

The reason I'm writing is that benchmark_suite gives me some very unexpected results:

gaston@kiwi:~/cuisine/branches/work/eigen2/doc$ g++ --version
g++ (GCC) 4.2.1 (Ubuntu 4.2.1-5ubuntu4)
gaston@kiwi:~/cuisine/branches/work/eigen2/doc$ ./benchmark_suite
Fixed size 3x3, ColumnMajor, -DNDEBUG
real 0m19.942s
user 0m19.893s
sys 0m0.024s
Fixed size 3x3, ColumnMajor, with asserts
real 0m32.434s
user 0m32.406s
sys 0m0.008s
Fixed size 3x3, RowMajor, -DNDEBUG
real 0m21.497s
user 0m21.497s
sys 0m0.000s
Fixed size 3x3, RowMajor, with asserts
real 0m32.133s
user 0m32.122s
sys 0m0.012s
Dynamic size 20x20, ColumnMajor, -DNDEBUG
real 0m33.014s
user 0m33.006s
sys 0m0.000s
Dynamic size 20x20, ColumnMajor, with asserts
real 0m27.599s
user 0m27.554s
sys 0m0.024s
Dynamic size 20x20, RowMajor, -DNDEBUG
real 0m28.343s
user 0m28.342s
sys 0m0.000s
Dynamic size 20x20, RowMajor, with asserts
real 0m26.597s
user 0m26.562s
sys 0m0.012s

We see two strange things here, which I can't explain.

First, with dynamic-size 20x20, disabling asserts (defining NDEBUG) REDUCES speed! What's going on?

Second, the storage order has a non-negligible impact. More precisely, with fixed-size 3x3 and NDEBUG defined, ColumnMajor is almost 10% faster than RowMajor (19.9s vs 21.5s), while with dynamic-size 20x20, RowMajor is faster than ColumnMajor! Also, how can we explain the fact that RowMajor suffers less than ColumnMajor from the slowdown induced by defining NDEBUG (28.3s vs 26.6s for RowMajor, 33.0s vs 27.6s for ColumnMajor)?

All this is in SVN so please help me!

Cheers,
Benoit
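
To illustrate the point above that storage order affects traversal order as well as coefficient layout, here is a minimal sketch in plain C++. It does not use Eigen, and every name in it (Order, Matrix, index, copyMatrix) is hypothetical and chosen only for this illustration. It shows one plausible way a copy can pick its loop nesting so that the inner loop walks contiguous memory; it is not a description of how Eigen actually implements this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; this is not Eigen's API.
enum class Order { RowMajor, ColumnMajor };

template <Order O>
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;  // flat coefficient buffer

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    // Map (row, col) to a position in the flat buffer.
    std::size_t index(std::size_t i, std::size_t j) const {
        return O == Order::RowMajor ? i * cols + j : j * rows + i;
    }

    double& operator()(std::size_t i, std::size_t j) { return data[index(i, j)]; }
    double operator()(std::size_t i, std::size_t j) const { return data[index(i, j)]; }
};

// Copy coefficient by coefficient. The inner loop runs over the index
// that is contiguous in memory for the given storage order.
template <Order O>
void copyMatrix(const Matrix<O>& src, Matrix<O>& dst) {
    assert(src.rows == dst.rows && src.cols == dst.cols);
    if (O == Order::RowMajor) {
        for (std::size_t i = 0; i < src.rows; ++i)
            for (std::size_t j = 0; j < src.cols; ++j)
                dst(i, j) = src(i, j);
    } else {
        for (std::size_t j = 0; j < src.cols; ++j)
            for (std::size_t i = 0; i < src.rows; ++i)
                dst(i, j) = src(i, j);
    }
}
```

With a 2x3 matrix, coefficient (0, 1) sits at offset 1 in the row-major buffer and at offset 2 in the column-major one, so the two loop nestings above touch memory in different orders even though the result of the copy is the same.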