Currently, Ceres uses a QR decomposition for the dense linear solve at the heart of the Levenberg-Marquardt loop, as follows:
VectorRef(x, num_cols) = A->matrix().colPivHouseholderQr().solve(rhs_);
This is the performance-limiting step in Ceres for dense problems. The A matrix tends to be very tall and not especially wide; in other words, the system is overdetermined. You might ask why not use Cholesky on the normal equations; the answer is that forming the A'A matrix is expensive enough to make the Cholesky route slower than QR.
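For reference, here is a minimal sketch of the Cholesky alternative mentioned above, written in plain C++ rather than with Eigen so that the cost of each stage is visible. The names DenseMatrix and SolveNormalEquations are hypothetical and are not part of Ceres or Eigen. For an m x n matrix A, forming A'A takes on the order of m*n^2 operations, while the factorization itself takes only about n^3/3, so for tall matrices nearly all the time goes into the first loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major dense matrix. Hypothetical stand-in, not a Ceres or Eigen type.
struct DenseMatrix {
  std::size_t rows, cols;
  std::vector<double> data;
  double operator()(std::size_t r, std::size_t c) const {
    return data[r * cols + c];
  }
};

// Solves min ||A x - b|| through the normal equations (A'A) x = A'b.
// Returns false if A'A is not numerically positive definite.
bool SolveNormalEquations(const DenseMatrix& A, const std::vector<double>& b,
                          std::vector<double>* x) {
  const std::size_t m = A.rows, n = A.cols;
  assert(b.size() == m);

  // Form the lower triangle of G = A'A and g = A'b: O(m n^2) work.
  std::vector<double> G(n * n, 0.0), g(n, 0.0);
  for (std::size_t r = 0; r < m; ++r) {
    for (std::size_t i = 0; i < n; ++i) {
      const double a_ri = A(r, i);
      g[i] += a_ri * b[r];
      for (std::size_t j = 0; j <= i; ++j) G[i * n + j] += a_ri * A(r, j);
    }
  }

  // In-place Cholesky G = L L', with L stored in the lower triangle.
  for (std::size_t j = 0; j < n; ++j) {
    double d = G[j * n + j];
    for (std::size_t k = 0; k < j; ++k) d -= G[j * n + k] * G[j * n + k];
    if (!(d > 0.0)) return false;  // Rank deficient or badly conditioned.
    const double l_jj = std::sqrt(d);
    G[j * n + j] = l_jj;
    for (std::size_t i = j + 1; i < n; ++i) {
      double s = G[i * n + j];
      for (std::size_t k = 0; k < j; ++k) s -= G[i * n + k] * G[j * n + k];
      G[i * n + j] = s / l_jj;
    }
  }

  // Forward substitution: solve L y = g, overwriting g with y.
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t k = 0; k < i; ++k) g[i] -= G[i * n + k] * g[k];
    g[i] /= G[i * n + i];
  }
  // Back substitution: solve L' x = y, overwriting g with x.
  for (std::size_t i = n; i-- > 0;) {
    for (std::size_t k = i + 1; k < n; ++k) g[i] -= G[k * n + i] * g[k];
    g[i] /= G[i * n + i];
  }

  *x = g;
  return true;
}
```

Note that this route squares the condition number of A, so besides the cost of forming A'A it is also less numerically robust than the QR solve.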
Any ideas on how to make this solve faster than it is currently?
Thanks!
Keir