Re: [eigen] request for help: 4x4 matrix inverse
Thanks a lot for this!

What is needed now is to adapt your code to Eigen's ei_p... wrappers so that it also works with SSE (which I hope is possible at all). It is also perfectly possible to rework the ei_p... functions on this occasion.

The current 4x4 inversion code in Eigen doesn't vectorize very well, for a few reasons: 2x2 float matrix ops aren't vectorized (this would require a notion of seeing a matrix op as a vector op when possible, so here the size of the op would be 4 and 4*sizeof(float)==16, which is vectorizable) and (less importantly here) block xprs are also not vectorized. Also, for some reason I can't explain, with doubles, 4x4 matrix inversion is significantly slower with SSE enabled.

Maybe it is worth trying to fix Eigen's generic vectorization before trying to import your hand-optimized code.

> Though I haven't tested it, I believe that the matrix partitioning
> method is faster overall, since the calculations are fewer than the
> coefficients method. But as I said, I intend to compare these soon.

Following our discussion on IRC: after all, I think that you are indeed right that it can also be beneficial for large matrices. Even though it doesn't affect the order of complexity (it replaces one O(n^3) op by 8 O(k^3) ops with k=n/2), it still reduces the constant in front of the big-O. Indeed, 6 of those 8 ops become products instead of inversions, and I am willing to believe that a product is less costly than an inversion.

So yes, it would be beneficial, but personally I'll concentrate on getting 2.0 out the door before I think about this.

Cheers,
Benoit
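
For readers following the thread, the partitioning method discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: it is plain scalar C++, not Eigen's code and not the hand-optimized code from this thread, and every name in it (Mat2, Mat4, inverse4_partitioned, ...) is made up for the example. It assumes that both the top-left 2x2 block A and the Schur complement S = D - C*A^-1*B are invertible, and it simply reports failure otherwise, even when the 4x4 matrix itself is invertible.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper types for this sketch (row-major storage).
using Mat2 = std::array<double, 4>;
using Mat4 = std::array<double, 16>;

static Mat2 mul(const Mat2& a, const Mat2& b) {
    Mat2 r = {{a[0]*b[0] + a[1]*b[2], a[0]*b[1] + a[1]*b[3],
               a[2]*b[0] + a[3]*b[2], a[2]*b[1] + a[3]*b[3]}};
    return r;
}
static Mat2 add(const Mat2& a, const Mat2& b) {
    Mat2 r = {{a[0]+b[0], a[1]+b[1], a[2]+b[2], a[3]+b[3]}};
    return r;
}
static Mat2 sub(const Mat2& a, const Mat2& b) {
    Mat2 r = {{a[0]-b[0], a[1]-b[1], a[2]-b[2], a[3]-b[3]}};
    return r;
}
static Mat2 neg(const Mat2& a) {
    Mat2 r = {{-a[0], -a[1], -a[2], -a[3]}};
    return r;
}
// 2x2 inverse via the adjugate; returns false if the determinant is exactly 0.
static bool inverse2(const Mat2& m, Mat2& out) {
    const double det = m[0]*m[3] - m[1]*m[2];
    if (det == 0.0) return false;
    const double r = 1.0 / det;
    out = Mat2{{m[3]*r, -m[1]*r, -m[2]*r, m[0]*r}};
    return true;
}
// Read / write the 2x2 block at block-row r, block-column c (each 0 or 1).
static Mat2 block(const Mat4& m, int r, int c) {
    assert((r == 0 || r == 1) && (c == 0 || c == 1));
    const int o = 8*r + 2*c;
    Mat2 b = {{m[o], m[o+1], m[o+4], m[o+5]}};
    return b;
}
static void setBlock(Mat4& m, int r, int c, const Mat2& b) {
    assert((r == 0 || r == 1) && (c == 0 || c == 1));
    const int o = 8*r + 2*c;
    m[o] = b[0]; m[o+1] = b[1]; m[o+4] = b[2]; m[o+5] = b[3];
}

// Inverse of M = [A B; C D] using 2 half-size inversions and 6 products.
// Returns false if A or the Schur complement S is singular.
bool inverse4_partitioned(const Mat4& m, Mat4& out) {
    const Mat2 A = block(m, 0, 0), B = block(m, 0, 1);
    const Mat2 C = block(m, 1, 0), D = block(m, 1, 1);

    Mat2 Ai;
    if (!inverse2(A, Ai)) return false;          // inversion 1
    const Mat2 AiB = mul(Ai, B);                 // product 1
    const Mat2 CAi = mul(C, Ai);                 // product 2
    const Mat2 S   = sub(D, mul(CAi, B));        // product 3

    Mat2 Si;
    if (!inverse2(S, Si)) return false;          // inversion 2
    const Mat2 SiCAi = mul(Si, CAi);             // product 4
    const Mat2 AiBSi = mul(AiB, Si);             // product 5

    setBlock(out, 0, 0, add(Ai, mul(AiB, SiCAi)));  // product 6
    setBlock(out, 0, 1, neg(AiBSi));
    setBlock(out, 1, 0, neg(SiCAi));
    setBlock(out, 1, 1, Si);
    return true;
}
```

The operation count matches the one quoted above: 8 half-size ops, of which only 2 are inversions. A production version would need a fallback (e.g. pivoting or a different partition) for the case where A is singular.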