Re: [eigen] request for help: 4x4 matrix inverse


Thanks a lot for this!

What is needed now is to adapt your code to Eigen's ei_p... wrappers so that 
it also works with SSE (assuming that is possible at all). It would also be 
perfectly fine to rework the ei_p... functions on this occasion.

The current 4x4 inversion code in Eigen doesn't vectorize very well, for 
several reasons. First, ops on 2x2 float matrices aren't vectorized: this 
would require a notion of seeing a matrix op as a vector op when possible, so 
that here the size of the op would be 4 and 4*sizeof(float)==16, which is 
vectorizable. Second (and less importantly here), block xprs are also not 
vectorized. Finally, for some reason I can't explain, 4x4 matrix inversion 
with doubles is significantly slower with SSE enabled.
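
To illustrate the "matrix op as a vector op" idea, here is a minimal sketch in 
plain C++. It is not Eigen code: Mat2f, at() and add_flat() are hypothetical 
names made up for this example. The point is only that a coefficient-wise op 
on a 2x2 float matrix is the same as an op on a flat vector of 4 floats, which 
fits in one 16-byte packet.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a fixed-size 2x2 float matrix (not an Eigen type).
struct Mat2f {
    float coeffs[4];  // column-major storage: m00, m10, m01, m11

    // Matrix-style access on top of the flat storage.
    float& at(std::size_t row, std::size_t col) {
        assert(row < 2 && col < 2);
        return coeffs[col * 2 + row];
    }
};

// Four floats occupy 16 bytes, i.e. the size of one 128-bit SSE packet.
static_assert(4 * sizeof(float) == 16, "expected 4-byte floats");

// Coefficient-wise sum written as one flat loop over 4 floats instead of a
// nested 2x2 loop. Seen this way, the op has size 4, which is the shape a
// packet-based code path (or the compiler's vectorizer) can work with.
inline Mat2f add_flat(const Mat2f& a, const Mat2f& b) {
    Mat2f r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r.coeffs[i] = a.coeffs[i] + b.coeffs[i];
    }
    return r;
}
```

Whether a given compiler actually emits a single packed add for this loop 
depends on flags and alignment, so treat it as an illustration of the idea 
rather than a guaranteed optimization.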

Maybe it is worth trying to fix Eigen's generic vectorization before trying to 
import your hand-optimized code.

> Though I haven't tested it, I believe that the matrix partitioning
> method is faster overall, since the calculations are fewer than the
> coefficients method. But as I said, I intend to compare these soon.

Following our discussion on IRC: after all, I think you are indeed right 
that it can also be beneficial for large matrices. Even though it doesn't 
change the order of complexity (it replaces an O(n^3) op by 8 O(k^3) ops with 
k=n/2), it still reduces the constant in front of the big-O. Indeed, it 
replaces 6 inversions by products, and I am willing to believe that a product 
is less costly than an inversion.
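
To make the operation count concrete, here is a rough sketch of one common 
form of the partitioning method, based on the Schur complement 
S = D - C A^-1 B. It is an assumption that this matches the variant discussed 
in this thread, and Block2 / Mat4Blocks are hypothetical helper types, not 
Eigen classes. It also assumes that both A and S are invertible, which does 
not hold for every invertible 4x4 matrix.

```cpp
#include <cassert>

// Hypothetical minimal 2x2 block type for illustration (not an Eigen type).
struct Block2 {
    double a, b, c, d;  // [[a, b], [c, d]]
};

inline Block2 mul(const Block2& x, const Block2& y) {
    return {x.a * y.a + x.b * y.c, x.a * y.b + x.b * y.d,
            x.c * y.a + x.d * y.c, x.c * y.b + x.d * y.d};
}

inline Block2 add(const Block2& x, const Block2& y) {
    return {x.a + y.a, x.b + y.b, x.c + y.c, x.d + y.d};
}

inline Block2 sub(const Block2& x, const Block2& y) {
    return {x.a - y.a, x.b - y.b, x.c - y.c, x.d - y.d};
}

inline Block2 neg(const Block2& x) { return {-x.a, -x.b, -x.c, -x.d}; }

inline Block2 inverse(const Block2& x) {
    const double det = x.a * x.d - x.b * x.c;
    assert(det != 0.0 && "block must be invertible");
    const double s = 1.0 / det;
    return {x.d * s, -x.b * s, -x.c * s, x.a * s};
}

// 4x4 matrix stored as four 2x2 blocks: [[A, B], [C, D]].
struct Mat4Blocks {
    Block2 A, B, C, D;
};

// Partitioned inverse: 2 half-size inversions and 6 half-size products.
// Assumes A and the Schur complement S are invertible; a robust version
// would need a fallback (for example, pivoting) when they are not.
inline Mat4Blocks inverse(const Mat4Blocks& m) {
    const Block2 Ai    = inverse(m.A);             // inversion 1
    const Block2 AiB   = mul(Ai, m.B);             // product 1
    const Block2 CAi   = mul(m.C, Ai);             // product 2
    const Block2 S     = sub(m.D, mul(m.C, AiB));  // product 3
    const Block2 Si    = inverse(S);               // inversion 2
    const Block2 AiBSi = mul(AiB, Si);             // product 4
    const Block2 SiCAi = mul(Si, CAi);             // product 5
    return {add(Ai, mul(AiB, SiCAi)),              // product 6
            neg(AiBSi), neg(SiCAi), Si};
}
```

In this form, 6 of the 8 half-size operations are products and only 2 are 
inversions, which is where the smaller constant comes from.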

So yes, it would be beneficial, but personally I'll concentrate on getting 2.0 
out the door before I think about this.



