Re: [eigen] Matrix product between some MatrixType and triangularView<UnitLower>

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


I've tested the code on my ATOM powered netbook, I get a performance

improvement of 25% over unblocked QR decomposition even with no .noalias()

on triangularView product. (for 256x256 float matrix, 2s vs ~2.5s)

I will update my eigen ref code asap to get things to work with

triangularView.



Thanks !



On Tue, 15 Jun 2010 23:49:54 +0200, Gael Guennebaud

<gael.guennebaud@xxxxxxxxx> wrote:

> On Tue, Jun 15, 2010 at 11:05 PM,  <vincent.lejeune@xxxxxxxxxx> wrote:

>>

>> Thanks !

>>

>>

>>

>> When looking at the new code, I have some questions though :

> 

> right I forgot to comment some changes !

> 

>>

>> - Is .col(i).head(i) better than .block(0,i,i-1,1) for performance, or

is

>>

>> it just for readability purpose ? I was using block() because I was

sure

>> it

>>

>> was doing computation inplace (and because I'm not used to this syntax

>> too)

> 

> yes because then Eigen knows at compile time you are dealing with a

> one dimensional vector and so we can generate faster code and select

> appropriate products. The gain can be huge, especially when dealing

> with matrix products.

> 

>>

>> - V.const_cast_derived() is the same as const_cast<...>(V) ?

> 

> in this context yes. If you have a MatrixBase<Derived> then it return

> a Derived& type. I used it here because it avoids to explicitly write

> the type.

> 

>> - Is there any difference between .transpose() and .adjoint for double

>>

>> real matrix ? I didn't try to get the routine work for complex values,

>> so I

>>

>> can't ensure that BlockHouseholderTriangularFactor is valid for complex

>>

>> value yet.

> 

> For non complex number they compile to the same thing. So better use

> it. Supporting complex numbers will be easier.

> 

>>

>> - What coding standard should I change if I want this file to be

included

>>

>> in the experimental/ directory of Eigen ?

> 

> Actually, we want to support block Householder transformations

> directly at the level of the HouseholderSequence type and in the

> related dec like HouseholderQR, Tridiagonalization, etc.. So that's

> why I'm particularly interested in helping you to get it right. Then

> integrating it into Eigen will just be a matter of finding the right

> API.

> 

> cheers,

> 

> gael

> 

>>

>>

>>

>> Thanks, Vincent

>>

>>

>>

>> On Tue, 15 Jun 2010 22:03:01 +0200, Gael Guennebaud

>>

>> <gael.guennebaud@xxxxxxxxx> wrote:

>>

>>> ok, the TriangularView class has been extended to trapezoidal matrices

>>

>>> but not the underlying matrix products whence your issue. Problem

>>

>>> solved. I've also attached a modified version of your file which works

>>

>>> here.

>>

>>>

>>

>>> gael

>>

>>>

>>

>>> On Tue, Jun 15, 2010 at 5:01 PM,  <vincent.lejeune@xxxxxxxxxx> wrote:

>>

>>>> &On Tue, 15 Jun 2010 16:29:12 +0200, Gael Guennebaud

>>

>>>>

>>

>>>> <gael.guennebaud@xxxxxxxxx> wrote:

>>

>>>>

>>

>>>>> Actually, the pb is line 50 where you store a reference to the

>>

>>>>

>>

>>>>> temporary  matrix returned by BlockHouseholderTriangularFactor.

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> I would rather write it like this:

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> const TriangularView<Derived,UnitLower>& Vtri(V);

>>

>>>>

>>

>>>>> TriangularFactorType T = BlockHouseholderTriangularFactor(Vtmp,

>>

>>>>

>>

>>>> coeffs,V);

>>

>>>>

>>

>>>>> A -= Vtri * T.transpose() * Vtri.transpose() * A;

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>> When using "const TriangularView<Derived,UnitLower>& Vtri(V)" the

>>

>> result

>>

>>>>

>>

>>>> is changed :/

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>> I've joined the result of a computation on some 8x8 matrix (with a

>>

>> block

>>

>>>>

>>

>>>> factor of 2) and,

>>

>>>>

>>

>>>> as you can the, the 4x4 right bottom corner is wrong...

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> But here there are still many many temporaries: one for T and one

for

>>

>>>>

>>

>>>>> each matrix product! Also here the matrix T is triangular right ? So

>>

>>>>

>>

>>>>> I'd suggest the following:

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> Derived::PlainObject tmp(Vtri.transpose() * A);

>>

>>>>

>>

>>>>> tmp.noalias() = T.transpose().triangularView<Upper>() * tmp;

>>

>>>>

>>

>>>>> A.noalias() -= Vtri * tmp;

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> There are still 2 temporaries, one for T and for tmp, but that's

still

>>

>>>>

>>

>>>>> better than 4. I'm also not 100% sure about the second line, because

>>

>>>>

>>

>>>>> in theory a triangular matrix times a matrix can be performed

in-place

>>

>>>>

>>

>>>>> but I have to check the implementation. If you have troubles just

>>

>>>>

>>

>>>>> remove the .noalias().

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>> What is the purpose of noalias() ?

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> But this still looks relatively expensive in term of temporaries,

>>

>>>>

>>

>>>>> there must be something smarter.

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> gael

>>

>>>>

>>

>>>>>

>>

>>>>

>>

>>>>> On Tue, Jun 15, 2010 at 4:08 PM,  <vincent.lejeune@xxxxxxxxxx>

wrote:

>>

>>>>

>>

>>>>>> Here is my source code, the part of interest is at the top

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> (HHUtilities<..>).

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> The incrimined lines is the 47/48 : with line 47, everything works,

>>

>> but

>>

>>>>

>>

>>>>>> it

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> does a copy I would like to avoid.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> With line 48, I do no copy, but it crashs.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> "Derived" is Matrix<float,Dynamic,Dynamic> and OtherDerived is

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> Block<MatrixType> where MatrixType is the previous

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> Matrix<float,Dynamic,Dynamic> one.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> On Tue, 15 Jun 2010 16:02:04 +0200, Gael Guennebaud

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> <gael.guennebaud@xxxxxxxxx> wrote:

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> Hi,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> could you resend your piece of code in a text file please, because

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> inlined in the email, it is really hard to read !

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> Regarding the previous issues, what was it ?

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> cheers,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> gael

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>> On Tue, Jun 15, 2010 at 3:42 PM,  <vincent.lejeune@xxxxxxxxxx>

>>

>> wrote:

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> Hi,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> I fixed the issue I had that prevent me from getting a working

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> implementation of QR Householder decomposition, and apply

>>

>> Householder

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> on

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> the left/right function.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> However, Im' currently facing another issue : my implementation

is

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> slower

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> than the unblocked version of theses functions, because I'm

relying

>>

>>>>

>>

>>>> on

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> the

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> "triangularView" template.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> As I understand, matrix.triangularView() does not copy the matrix

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> object,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> and provides some comfort. However, the following

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> applyBlockHouseholderOnTheLeft function does trigger a segfault :

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>  template<typename Derived, typename OtherDerived = Matrix<

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      typename Derived::Scalar, Derived::ColsAtCompileTime,

>>>>>>>> Dynamic>

>>

>>>>>>>> >

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>    class HHutilities

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>    {

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>    public:

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      typedef Matrix<typename Derived::Scalar,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> Derived::ColsAtCompileTime,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          Derived::ColsAtCompileTime> TriangularFactorType;

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      typedef typename ei_plain_diag_type<Derived>::type

CoeffVect;

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      TriangularFactorType

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      static

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      BlockHouseholderTriangularFactor(const Derived& V,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          const CoeffVect& coeffs,const Derived& VV)

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      {

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        const int k = V.cols();

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        ei_assert(V.cols()==k && V.rows()>=k && "Vector storing

>>

>>>>>>>> Matrix

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> must have same columns than coeffs size and more rows than

>>

>> columns");

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        TriangularFactorType ret = TriangularFactorType::Zero(k,

>>>>>>>> k);

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        const int j = V.rows();

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        for (int i = 0; i < k; i++)

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          {

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>            ret.block(0, i, i, 1) = -  coeffs(i)

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>                * V.block(i, 0, j - i, i).transpose() * V.block(i,

>>

>> i,

>>

>>>>

>>

>>>> j

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> -

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>> <script type="text/javascript"

>>

>>>>

>>

>>>>

>>

>>

src="https://webmail.scilab.org/roundcube/program/js/tiny_mce/themes/advanced/langs/fr.js?s=1240817786";></script>gt;>

>>

>>>>

>>

>>>>>>>> i, 1);

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>            ret.block(0, i, i, 1) = ret.block(0, 0, i, i) *

>>

>>>>

>>

>>>>>>>> ret.block(0,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> i, i,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>                1);

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>            ret(i, i) = coeffs(i);

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          }

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        return ret;

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      }

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      static

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      void

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      applyBlockHouseholderOnTheLeft(OtherDerived& A, const

>>>>>>>> Derived&

>>

>>>>

>>

>>>> V,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          const CoeffVect& coeffs)

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      {

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>          const TriangularView<Derived,UnitLower>& Vtmp =

>>>>>>>> V.template

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> triangularView<UnitLower> ();

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        const TriangularFactorType& T =

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> BlockHouseholderTriangularFactor(Vtmp,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>            coeffs,V);

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>        A -= Vtmp * T.transpose() * Vtmp.transpose() * A;

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>      }

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> }

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> With OtherDerived=Block<MatrixType>, and

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> MatrixType=Matrix<float,Dynamic,Dynamic>.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> I'm forced to make a copy of the TriangularView :

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> const Derived& Vtmp(V.template triangularView<UnitLower> ());

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> (instead of

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> const TriangularView<Derived,UnitLower>& Vtmp = V.template

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> triangularView<UnitLower> ()

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> )

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> to get my code to work. However, this is a bad workaround, as it

>>

>>>>

>>

>>>> makes

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>> an

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> additional copy.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> The overhead is not that big (0.1 s on an old P4 3Ghz for a

256x256

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> matrix), but it should not be there, as theoricaly, blocked

>>

>>>>

>>

>>>> computation

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> are

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> better on big matrix...So I hope to improve things if I can get

rid

>>

>>>>

>>

>>>> of

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> this

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> avoidable matrix copy.

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> Thanks,

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>> Vincent

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>>>

>>

>>>>>>

>>

>>>>

>>

>>>>>>>>

>>

>>

>>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/