Re: [eigen] Matrix product between some MatrixType and triangularView<UnitLower> |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
I've tested the code on my ATOM powered netbook, I get a performance
improvement of 25% over unblocked QR decomposition even with no .noalias()
on triangularView product. (for 256x256 float matrix, 2s vs ~2.5s)
I will update my eigen ref code asap to get things to work with
triangularView.
Thanks !
On Tue, 15 Jun 2010 23:49:54 +0200, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> On Tue, Jun 15, 2010 at 11:05 PM, <vincent.lejeune@xxxxxxxxxx> wrote:
>>
>> Thanks !
>>
>>
>>
>> When looking at the new code, I have some questions though :
>
> right I forgot to comment some changes !
>
>>
>> - Is .col(i).head(i) better than .block(0,i,i-1,1) for performance, or
is
>>
>> it just for readability purpose ? I was using block() because I was
sure
>> it
>>
>> was doing computation inplace (and because I'm not used to this syntax
>> too)
>
> yes because then Eigen knows at compile time you are dealing with a
> one dimensional vector and so we can generate faster code and select
> appropriate products. The gain can be huge, especially when dealing
> with matrix products.
>
>>
>> - V.const_cast_derived() is the same as const_cast<...>(V) ?
>
> in this context yes. If you have a MatrixBase<Derived> then it return
> a Derived& type. I used it here because it avoids to explicitly write
> the type.
>
>> - Is there any difference between .transpose() and .adjoint for double
>>
>> real matrix ? I didn't try to get the routine work for complex values,
>> so I
>>
>> can't ensure that BlockHouseholderTriangularFactor is valid for complex
>>
>> value yet.
>
> For non complex number they compile to the same thing. So better use
> it. Supporting complex numbers will be easier.
>
>>
>> - What coding standard should I change if I want this file to be
included
>>
>> in the experimental/ directory of Eigen ?
>
> Actually, we want to support block Householder transformations
> directly at the level of the HouseholderSequence type and in the
> related dec like HouseholderQR, Tridiagonalization, etc.. So that's
> why I'm particularly interested in helping you to get it right. Then
> integrating it into Eigen will just be a matter of finding the right
> API.
>
> cheers,
>
> gael
>
>>
>>
>>
>> Thanks, Vincent
>>
>>
>>
>> On Tue, 15 Jun 2010 22:03:01 +0200, Gael Guennebaud
>>
>> <gael.guennebaud@xxxxxxxxx> wrote:
>>
>>> ok, the TriangularView class has been extended to trapezoidal matrices
>>
>>> but not the underlying matrix products whence your issue. Problem
>>
>>> solved. I've also attached a modified version of your file which works
>>
>>> here.
>>
>>>
>>
>>> gael
>>
>>>
>>
>>> On Tue, Jun 15, 2010 at 5:01 PM, <vincent.lejeune@xxxxxxxxxx> wrote:
>>
>>>> &On Tue, 15 Jun 2010 16:29:12 +0200, Gael Guennebaud
>>
>>>>
>>
>>>> <gael.guennebaud@xxxxxxxxx> wrote:
>>
>>>>
>>
>>>>> Actually, the pb is line 50 where you store a reference to the
>>
>>>>
>>
>>>>> temporary matrix returned by BlockHouseholderTriangularFactor.
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> I would rather write it like this:
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> const TriangularView<Derived,UnitLower>& Vtri(V);
>>
>>>>
>>
>>>>> TriangularFactorType T = BlockHouseholderTriangularFactor(Vtmp,
>>
>>>>
>>
>>>> coeffs,V);
>>
>>>>
>>
>>>>> A -= Vtri * T.transpose() * Vtri.transpose() * A;
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>> When using "const TriangularView<Derived,UnitLower>& Vtri(V)" the
>>
>> result
>>
>>>>
>>
>>>> is changed :/
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>> I've joined the result of a computation on some 8x8 matrix (with a
>>
>> block
>>
>>>>
>>
>>>> factor of 2) and,
>>
>>>>
>>
>>>> as you can the, the 4x4 right bottom corner is wrong...
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> But here there are still many many temporaries: one for T and one
for
>>
>>>>
>>
>>>>> each matrix product! Also here the matrix T is triangular right ? So
>>
>>>>
>>
>>>>> I'd suggest the following:
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> Derived::PlainObject tmp(Vtri.transpose() * A);
>>
>>>>
>>
>>>>> tmp.noalias() = T.transpose().triangularView<Upper>() * tmp;
>>
>>>>
>>
>>>>> A.noalias() -= Vtri * tmp;
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> There are still 2 temporaries, one for T and for tmp, but that's
still
>>
>>>>
>>
>>>>> better than 4. I'm also not 100% sure about the second line, because
>>
>>>>
>>
>>>>> in theory a triangular matrix times a matrix can be performed
in-place
>>
>>>>
>>
>>>>> but I have to check the implementation. If you have troubles just
>>
>>>>
>>
>>>>> remove the .noalias().
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>> What is the purpose of noalias() ?
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> But this still looks relatively expensive in term of temporaries,
>>
>>>>
>>
>>>>> there must be something smarter.
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> gael
>>
>>>>
>>
>>>>>
>>
>>>>
>>
>>>>> On Tue, Jun 15, 2010 at 4:08 PM, <vincent.lejeune@xxxxxxxxxx>
wrote:
>>
>>>>
>>
>>>>>> Here is my source code, the part of interest is at the top
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> (HHUtilities<..>).
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> The incrimined lines is the 47/48 : with line 47, everything works,
>>
>> but
>>
>>>>
>>
>>>>>> it
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> does a copy I would like to avoid.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> With line 48, I do no copy, but it crashs.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> "Derived" is Matrix<float,Dynamic,Dynamic> and OtherDerived is
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> Block<MatrixType> where MatrixType is the previous
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> Matrix<float,Dynamic,Dynamic> one.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> On Tue, 15 Jun 2010 16:02:04 +0200, Gael Guennebaud
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> <gael.guennebaud@xxxxxxxxx> wrote:
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> Hi,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> could you resend your piece of code in a text file please, because
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> inlined in the email, it is really hard to read !
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> Regarding the previous issues, what was it ?
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> cheers,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> gael
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>> On Tue, Jun 15, 2010 at 3:42 PM, <vincent.lejeune@xxxxxxxxxx>
>>
>> wrote:
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> Hi,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> I fixed the issue I had that prevent me from getting a working
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> implementation of QR Householder decomposition, and apply
>>
>> Householder
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> on
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> the left/right function.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> However, Im' currently facing another issue : my implementation
is
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> slower
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> than the unblocked version of theses functions, because I'm
relying
>>
>>>>
>>
>>>> on
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> the
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> "triangularView" template.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> As I understand, matrix.triangularView() does not copy the matrix
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> object,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> and provides some comfort. However, the following
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> applyBlockHouseholderOnTheLeft function does trigger a segfault :
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> template<typename Derived, typename OtherDerived = Matrix<
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> typename Derived::Scalar, Derived::ColsAtCompileTime,
>>>>>>>> Dynamic>
>>
>>>>>>>> >
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> class HHutilities
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> {
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> public:
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> typedef Matrix<typename Derived::Scalar,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> Derived::ColsAtCompileTime,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> Derived::ColsAtCompileTime> TriangularFactorType;
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> typedef typename ei_plain_diag_type<Derived>::type
CoeffVect;
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> TriangularFactorType
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> static
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> BlockHouseholderTriangularFactor(const Derived& V,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const CoeffVect& coeffs,const Derived& VV)
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> {
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const int k = V.cols();
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> ei_assert(V.cols()==k && V.rows()>=k && "Vector storing
>>
>>>>>>>> Matrix
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> must have same columns than coeffs size and more rows than
>>
>> columns");
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> TriangularFactorType ret = TriangularFactorType::Zero(k,
>>>>>>>> k);
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const int j = V.rows();
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> for (int i = 0; i < k; i++)
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> {
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> ret.block(0, i, i, 1) = - coeffs(i)
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> * V.block(i, 0, j - i, i).transpose() * V.block(i,
>>
>> i,
>>
>>>>
>>
>>>> j
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> -
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>> <script type="text/javascript"
>>
>>>>
>>
>>>>
>>
>>
src="https://webmail.scilab.org/roundcube/program/js/tiny_mce/themes/advanced/langs/fr.js?s=1240817786"></script>gt;>
>>
>>>>
>>
>>>>>>>> i, 1);
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> ret.block(0, i, i, 1) = ret.block(0, 0, i, i) *
>>
>>>>
>>
>>>>>>>> ret.block(0,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> i, i,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> 1);
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> ret(i, i) = coeffs(i);
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> }
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> return ret;
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> }
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> static
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> void
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> applyBlockHouseholderOnTheLeft(OtherDerived& A, const
>>>>>>>> Derived&
>>
>>>>
>>
>>>> V,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const CoeffVect& coeffs)
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> {
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const TriangularView<Derived,UnitLower>& Vtmp =
>>>>>>>> V.template
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> triangularView<UnitLower> ();
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const TriangularFactorType& T =
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> BlockHouseholderTriangularFactor(Vtmp,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> coeffs,V);
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> A -= Vtmp * T.transpose() * Vtmp.transpose() * A;
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> }
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> }
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> With OtherDerived=Block<MatrixType>, and
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> MatrixType=Matrix<float,Dynamic,Dynamic>.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> I'm forced to make a copy of the TriangularView :
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const Derived& Vtmp(V.template triangularView<UnitLower> ());
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> (instead of
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> const TriangularView<Derived,UnitLower>& Vtmp = V.template
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> triangularView<UnitLower> ()
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> )
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> to get my code to work. However, this is a bad workaround, as it
>>
>>>>
>>
>>>> makes
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>> an
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> additional copy.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> The overhead is not that big (0.1 s on an old P4 3Ghz for a
256x256
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> matrix), but it should not be there, as theoricaly, blocked
>>
>>>>
>>
>>>> computation
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> are
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> better on big matrix...So I hope to improve things if I can get
rid
>>
>>>>
>>
>>>> of
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> this
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> avoidable matrix copy.
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> Thanks,
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>> Vincent
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>>>
>>
>>>>>>
>>
>>>>
>>
>>>>>>>>
>>
>>
>>