2009/9/16 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> wow I did not know FullPivotingHouseholderQR was so heavy to compile !
>
> So I tried to find the compilation bottleneck and fyi the most
> expensive lines are clearly the 2 calls to applyHouseholderOnTheLeft
> in compute() and matrixQ().

Thanks for checking this,

> Actually, we can speedup a bit the compilation of applyHouseholder* if
> we temporarily overwrite the first coeff of the vector by 1. This
> strategy requires that the essential part is preceded by a valid
> scalar value, that is currently always the case.

Currently we have:
 - a matrix-vector product
 - two level 1 operations
 - an outer product
I understand that one can get rid of the two level 1 ops, but I wonder
if it's worth it.

> Then applyHouseholder boils down to a matrix-vector product. Since the
> latter is currently instantiated for each different type of the result
> (Matrix / Block / VectorBlock), one possibility to further amortize
> the compilation of applyHouseholder is to make the matrix-vector
> product routine depends only on the scalar type (and storage order).

This is a great idea in general!

But I did the following. First I put EIGEN_DONT_INLINE in front of the
applyHouse* methods. Then I compiled a.cpp only with MatrixXf,
commenting out MatrixXd.

=== 23:23:11 ~$ nm a | grep applyHouse | wc -l
1

This says that only one instantiation of applyHouse* is compiled, so it
itself instantiates only one matrix-vector product anyway, so the
proposed change would not have any impact in this particular situation.

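To make the two formulations concrete, here is a minimal sketch of both
ways of applying a Householder reflector H = I - tau * v * v^T on the
left, for real scalars only. It is not Eigen's actual code: the Mat
struct and the names applyLeftImplicitOne, applyLeftExplicitOne and
maxAbsDiff are hypothetical, plain loops stand in for the expression
templates, and the scalar stored in front of the essential part is
assumed to be any valid value that can be saved and restored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix standing in for an Eigen matrix.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> a;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Variant 1: v = [1; essential], with the leading 1 kept implicit.
// Costs a matrix-vector product, two level 1 operations and an outer product.
void applyLeftImplicitOne(Mat& A, const std::vector<double>& essential, double tau) {
    assert(essential.size() + 1 == A.rows);
    std::vector<double> tmp(A.cols, 0.0);
    // Matrix-vector product: tmp = essential^T * A(1.., :)
    for (std::size_t i = 1; i < A.rows; ++i)
        for (std::size_t j = 0; j < A.cols; ++j)
            tmp[j] += essential[i - 1] * A(i, j);
    // Level 1 operation: tmp += A(0, :)
    for (std::size_t j = 0; j < A.cols; ++j) tmp[j] += A(0, j);
    // Level 1 operation: A(0, :) -= tau * tmp
    for (std::size_t j = 0; j < A.cols; ++j) A(0, j) -= tau * tmp[j];
    // Outer product: A(1.., :) -= tau * essential * tmp^T
    for (std::size_t i = 1; i < A.rows; ++i)
        for (std::size_t j = 0; j < A.cols; ++j)
            A(i, j) -= tau * essential[i - 1] * tmp[j];
}

// Variant 2: v[1..] is the essential part and v[0] holds some other valid
// scalar. v[0] is temporarily overwritten by 1 and restored afterwards,
// leaving only a matrix-vector product and an outer product.
void applyLeftExplicitOne(Mat& A, std::vector<double>& v, double tau) {
    assert(!v.empty() && v.size() == A.rows);
    const double saved = v[0];
    v[0] = 1.0;
    std::vector<double> tmp(A.cols, 0.0);
    // Matrix-vector product: tmp = v^T * A
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t j = 0; j < A.cols; ++j)
            tmp[j] += v[i] * A(i, j);
    // Outer product: A -= tau * v * tmp^T
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t j = 0; j < A.cols; ++j)
            A(i, j) -= tau * v[i] * tmp[j];
    v[0] = saved;  // put the original scalar back
}

// Largest absolute entrywise difference between two equally sized matrices.
double maxAbsDiff(const Mat& X, const Mat& Y) {
    assert(X.a.size() == Y.a.size());
    double d = 0.0;
    for (std::size_t k = 0; k < X.a.size(); ++k)
        d = std::fmax(d, std::fabs(X.a[k] - Y.a[k]));
    return d;
}
```

Both variants should give the same result up to rounding. The second one
trades the two level 1 passes for a temporary write into the vector,
which is only safe if nothing else reads that coefficient in the
meantime.
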
Trying to understand what causes the QR compilation to be so slow, I
generated symbol tables,

=== 23:36:27 ~$ g++ a.cpp -o a_square -I eigen2 -msse2 -O2 -DSVDOPTIONS=Square
=== 23:38:35 ~$ nm --print-size --size-sort a_square | grep Eigen | tee symbols_a_square | wc -l
38
=== 23:38:38 ~$ g++ a.cpp -o a -I eigen2 -msse2 -O2
=== 23:38:50 ~$ nm --print-size --size-sort a | grep Eigen | tee symbols_a | wc -l
118

So we see that doing the QR multiplies the number of Eigen symbols by 3 !!

See attached files, if you can make sense of that... I'm puzzled. The
matrix-vector product symbol itself takes 4 kilobytes, so it's not
explaining all the difference.

Benoit

