Re: [eigen] cache-friendly matrix inverse

[ Thread Index | Date Index | More Archives ]

2009/5/14 Jitse Niesen <jitse@xxxxxxxxxxxxxxxxx>:
> I tested anyway. Indeed the inverse is massively faster for small matrices.
> But it's also massively faster for small dynamic-size matrices (there is no
> loop unrolling in this case, is there?). And it's still clearly faster for
> 100 x 100 matrices.

Indeed, for dynamic sizes, there's no unrolling, but there's another
issue: cache-friendliness. Gael made a very good (actually, according
to the benchmarks on the wiki, the best all-around) cache-friendly
matrix-vector product implementation. This becomes increasingly
important as the matrix size increases. By contrast, our triangular
solver does not yet have comparable optimizations (and, according to
our benchmarks, neither do other libraries). So again the benchmarking
is skewed.

> One important caveat: I did not test for accuracy. Using the LU
> decomposition is supposed to be more stable.

I agree, and i didn't dispute that, precomputing the inverse and then
multiplying by it is accumulating more inaccuracy than just using the
LU to solve once.

> I can see two possibilities. Either the perceived wisdom in numerical
> analysis is wrong. Or there is a lot of scope to optimize LU solve.

It's the 2nd: there is a lot of scope to optimize LU solve.
However, doing so is more difficult than optimizing matrix-vector
product, and won't give quite as good results because dealing with the
triangular matrix is rather painful from the point of view of packet
alignment; so the difference may become negligible for large enough
matrices but on the other hand, for small enough matrices, the
matrix-vector product will remain faster; and i still have no idea
what the crossover point will be.


Mail converted by MHonArc 2.6.19+