Re: [eigen] GivensQR

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


and for completeness, here is the benchmark against also HouseholderQR:

=== 09:31:15 ~$ g++ bench.cpp -o bench -O2 -DNDEBUG -I eigen2 && ./bench
*** size 3 ***
GivensQR:  0.755026 sec
HouseholderQR:0.921341 sec
PartialLU: 0.484068 sec
LU:        0.868217 sec
*** size 4 ***
GivensQR:  0.700629 sec
HouseholderQR:0.695371 sec
PartialLU: 0.452713 sec
LU:        0.672086 sec
*** size 20 ***
GivensQR:  1.80749 sec
HouseholderQR:0.77241 sec
PartialLU: 0.782401 sec
LU:        1.06504 sec
*** size 200 ***
GivensQR:  11.4972 sec
HouseholderQR:3.41593 sec
PartialLU: 2.26986 sec
LU:        4.75654 sec


2009/8/19 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> Thank you!
>
> I was intrigued by the option of computing/storing only the sequence
> of Givens rotations, which doesn't seem like common practice at all; i
> was expecting a GivensQR class to compute/store Q and/or R according
> to what the user asked for.
>
> So I wondered what was the use case for the approach of storing the
> sequence of Givens rotations, and it seems to me that the best use
> case for it should be when doing only 1 vector solve. Indeed, in that
> case, avoiding the overhead of computing Q or R should be the most
> beneficial.
>
> So i did a benchmark (attached).
>
> Result (times in seconds, lower is better):
>
> === 09:28:49 ~$ g++ bench.cpp -o bench -O2 -DNDEBUG -I eigen2 && ./bench
> *** size 3 ***
> GivensQR:  0.751133 sec
> PartialLU: 0.542512 sec
> LU:        0.8757 sec
> *** size 4 ***
> GivensQR:  0.689996 sec
> PartialLU: 0.452786 sec
> LU:        0.702428 sec
> *** size 20 ***
> GivensQR:  1.79185 sec
> PartialLU: 0.778664 sec
> LU:        1.0921 sec
> *** size 200 ***
> GivensQR:  11.1207 sec
> PartialLU: 2.26197 sec
> LU:        4.75969 sec
>
>
> Note that this is without vectorization, indeed there's room for
> improvement of vectorization in your code, so I didn't want to make
> that influence the result. As expected, enabling vectorization only
> widens the difference.
>
> The bottom line is that even on the most favorable use case, a single
> vector solve, GivensQR is never really faster than the full-pivoting
> LU (the biggest is a 15% difference for 3x3 size) and gets much slower
> for larger sizes.
>
> So I'm tempted not to keep that aspect of GivensQR where only the
> sequence of rotations is computed. Instead, I would like to harmonize
> GivensQR with the other decs by making it templated in parameters
> allowing the user to say if they want Q and/or R, storing Q as a plain
> matrix (a packed storage of the cosines would require taking a lot of
> sqrts for getting the sines) and, like in other decompositions, making
> the decomposition actually happen at construction time (the
> constructor GivensQR(matrix) calls a compute() method).
>
> Are you OK? Am I missing something?
>
> Thanks,
> Benoit
>
>
>
> 2009/8/19 Andrea Arteaga <yo.eres@xxxxxxxxx>:
>> I (re-)created the givensqr fork. So I committed the file I posted yesterday
>> and some small modifications. Link: https://bitbucket.org/spiros/givensqr/ .
>>
>> I added bjacob as administrator.
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/