Re: [eigen] GivensQR

[ Thread Index | Date Index | More Archives ]

Andrea: OK, sorry for the long delay in dealing with your branch. I
really want to merge it in the best way, the problem is that there is
a fundamental design question -- whether to compute Q simultaneously
with the decomp, or to store Givens rotations for later usage as you
do in your current code -- that I still amn't sure about.

Earlier in this thread I argued that storing the Givens rotations was
not a good idea for performance. That's true for large enough sizes,
but not for small ones. Precisely, GivensQR is going to be mostly
useful for small sizes. So I'm not sure anymore about my performance

Likewise, the argument that storing the Givens rotations takes more
memory isn't important for small sizes.

Finally, my argument that a vector solve is best done by appending the
vector to the right of the matrix, while correct (i still think it's
optimal for performance) has a little flaw: it's not very convenient
API-wise and most people won't even think about it.

For these reasons I'd like to apologize for arguing against your code
when actually it made a lot of sense. It just wasn't clear to me at
all, next time if you have good arguments don't hesitate to speak

I still need to think about various issues, but eventually i'm merging
GivensQR with the "store givens rotations" idea preserved in one form
or another.

Just one thing: this will be a class squarely aimed at small matrices,
not good at all for large matrices. I guess that was clear anyway.


2009/8/19 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> Thank you!
> I was intrigued by the option of computing/storing only the sequence
> of Givens rotations, which doesn't seem like common practice at all; i
> was expecting a GivensQR class to compute/store Q and/or R according
> to what the user asked for.
> So I wondered what was the use case for the approach of storing the
> sequence of Givens rotations, and it seems to me that the best use
> case for it should be when doing only 1 vector solve. Indeed, in that
> case, avoiding the overhead of computing Q or R should be the most
> beneficial.
> So i did a benchmark (attached).
> Result (times in seconds, lower is better):
> === 09:28:49 ~$ g++ bench.cpp -o bench -O2 -DNDEBUG -I eigen2 && ./bench
> *** size 3 ***
> GivensQR:  0.751133 sec
> PartialLU: 0.542512 sec
> LU:        0.8757 sec
> *** size 4 ***
> GivensQR:  0.689996 sec
> PartialLU: 0.452786 sec
> LU:        0.702428 sec
> *** size 20 ***
> GivensQR:  1.79185 sec
> PartialLU: 0.778664 sec
> LU:        1.0921 sec
> *** size 200 ***
> GivensQR:  11.1207 sec
> PartialLU: 2.26197 sec
> LU:        4.75969 sec
> Note that this is without vectorization, indeed there's room for
> improvement of vectorization in your code, so I didn't want to make
> that influence the result. As expected, enabling vectorization only
> widens the difference.
> The bottom line is that even on the most favorable use case, a single
> vector solve, GivensQR is never really faster than the full-pivoting
> LU (the biggest is a 15% difference for 3x3 size) and gets much slower
> for larger sizes.
> So I'm tempted not to keep that aspect of GivensQR where only the
> sequence of rotations is computed. Instead, I would like to harmonize
> GivensQR with the other decs by making it templated in parameters
> allowing the user to say if they want Q and/or R, storing Q as a plain
> matrix (a packed storage of the cosines would require taking a lot of
> sqrts for getting the sines) and, like in other decompositions, making
> the decomposition actually happen at construction time (the
> constructor GivensQR(matrix) calls a compute() method).
> Are you OK? Am I missing something?
> Thanks,
> Benoit
> 2009/8/19 Andrea Arteaga <yo.eres@xxxxxxxxx>:
>> I (re-)created the givensqr fork. So I committed the file I posted yesterday
>> and some small modifications. Link: .
>> I added bjacob as administrator.

Mail converted by MHonArc 2.6.19+