[eigen] Matrix product between some MatrixType and triangularView<UnitLower> |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
Hi,
I fixed the issue I had that prevent me from getting a working
implementation of QR Householder decomposition, and apply Householder on
the left/right function.
However, Im' currently facing another issue : my implementation is slower
than the unblocked version of theses functions, because I'm relying on the
"triangularView" template.
As I understand, matrix.triangularView() does not copy the matrix object,
and provides some comfort. However, the following
applyBlockHouseholderOnTheLeft function does trigger a segfault :
template<typename Derived, typename OtherDerived = Matrix<
typename Derived::Scalar, Derived::ColsAtCompileTime, Dynamic> >
class HHutilities
{
public:
typedef Matrix<typename Derived::Scalar, Derived::ColsAtCompileTime,
Derived::ColsAtCompileTime> TriangularFactorType;
typedef typename ei_plain_diag_type<Derived>::type CoeffVect;
TriangularFactorType
static
BlockHouseholderTriangularFactor(const Derived& V,
const CoeffVect& coeffs,const Derived& VV)
{
const int k = V.cols();
ei_assert(V.cols()==k && V.rows()>=k && "Vector storing Matrix
must have same columns than coeffs size and more rows than columns");
TriangularFactorType ret = TriangularFactorType::Zero(k, k);
const int j = V.rows();
for (int i = 0; i < k; i++)
{
ret.block(0, i, i, 1) = - coeffs(i)
* V.block(i, 0, j - i, i).transpose() * V.block(i, i, j -
i, 1);
ret.block(0, i, i, 1) = ret.block(0, 0, i, i) * ret.block(0,
i, i,
1);
ret(i, i) = coeffs(i);
}
return ret;
}
static
void
applyBlockHouseholderOnTheLeft(OtherDerived& A, const Derived& V,
const CoeffVect& coeffs)
{
const TriangularView<Derived,UnitLower>& Vtmp = V.template
triangularView<UnitLower> ();
const TriangularFactorType& T =
BlockHouseholderTriangularFactor(Vtmp,
coeffs,V);
A -= Vtmp * T.transpose() * Vtmp.transpose() * A;
}
}
With OtherDerived=Block<MatrixType>, and
MatrixType=Matrix<float,Dynamic,Dynamic>.
I'm forced to make a copy of the TriangularView :
const Derived& Vtmp(V.template triangularView<UnitLower> ());
(instead of
const TriangularView<Derived,UnitLower>& Vtmp = V.template
triangularView<UnitLower> ()
)
to get my code to work. However, this is a bad workaround, as it makes an
additional copy.
The overhead is not that big (0.1 s on an old P4 3Ghz for a 256x256
matrix), but it should not be there, as theoricaly, blocked computation are
better on big matrix...So I hope to improve things if I can get rid of this
avoidable matrix copy.
Thanks,
Vincent