Re: [eigen] Implementation of M.transpose()*M & M*M.transpose()



> Without doing experiments, I do not know whether it would be terribly slow.

Jinglei, once the matrices no longer fit in the L2 cache you will be
hitting main memory on every reference to v2d[i][j].  Furthermore, the
double-dereferencing of pointers makes it almost impossible for the
compiler to optimize your code in any meaningful way.  I guarantee that
your code example would be terribly slow, as Gael describes.  Writing
efficient linear algebra code is very hard; a lot of care and attention
has gone into details like this in Eigen so that you don't have to worry
about them.  As Gael mentioned earlier, you can use the following code
from the development branch of Eigen to do what you want:

>> A.setZero();
>> A.selfadjointView<Upper>().rankUpdate(M);  // A += M.transpose()*M
>>
>> and you can compute A * B as follows:
>>
>> A.selfadjointView<Upper>() * B
>>
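
For completeness, here is a minimal self-contained sketch of that usage
(sizes are illustrative; it assumes rankUpdate(u) computes
A += u * u.adjoint(), as in the documented API, hence the M.transpose()
argument to form M.transpose()*M):

  #include <Eigen/Dense>
  using namespace Eigen;

  int main() {
    MatrixXd M = MatrixXd::Random(1000, 50);  // tall data matrix
    MatrixXd A = MatrixXd::Zero(50, 50);
    // accumulate A += M.transpose()*M, touching only the upper half
    A.selfadjointView<Upper>().rankUpdate(M.transpose());
    MatrixXd B = MatrixXd::Random(50, 20);
    MatrixXd C = A.selfadjointView<Upper>() * B;  // uses the stored half only
    A.triangularView<Lower>() = A.transpose();    // mirror if full A is needed
    return 0;
  }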

Warm Regards,
Aron

On Sat, Feb 20, 2010 at 6:55 PM, Jinglei Hu <jingleihu@xxxxxxxxx> wrote:
> Without doing experiments, I do not know whether it would be terribly slow.
>
> I doubt there would be much difference, but I believe using different
> compilers with different optimization options does lead to big differences
> for the same code.
>
> ----- Original Message -----
>
> From: Gael Guennebaud
> To: eigen@xxxxxxxxxxxxxxxxxxx
> Sent: February 20, 2010 18:25
> Subject: Re: [eigen] Implementation of M.transpose()*M & M*M.transpose()
>
>
> On Sat, Feb 20, 2010 at 5:38 PM, Jinglei Hu <jingleihu@xxxxxxxxx> wrote:
>>
>> Well, by trivial I mean one can write something as follows (not using
>> Eigen):
>>
>> #include <vector>
>>
>> Matrix M_Mult_MT(const Matrix& Rhs) {
>>   // [m][n] * [n][m] --> [m][m]
>>   const int N = Rhs.num_rows();
>>   std::vector<double> v(N * N);  // heap storage; a stack array would
>>                                  // overflow for large N
>>   double** v2d = Rhs;            // assumes Matrix converts to double**
>>   for (int i = 0; i < N; ++i) {
>>     double* vi = &v[i * N];
>>     const double* pi = v2d[i];
>>     for (int j = i; j < N; ++j) {  // compute only the upper half
>>       double sum = 0.0;
>>       const double* pj = v2d[j];
>>       for (int k = 0; k < Rhs.num_cols(); ++k)
>>         sum += pi[k] * pj[k];
>>       vi[j] = v[j * N + i] = sum;  // mirror into the lower half
>>     }
>>   }
>>   return Matrix(N, N, &v[0]);
>> }
>
> yes, but that would be terribly slow, because here you do not take care of
> the caches, vectorization, etc.
>
> gael
>
>>
>>
>>
>> ----- Original Message -----
>> From: Gael Guennebaud
>> To: eigen@xxxxxxxxxxxxxxxxxxx
>> Sent: February 20, 2010 16:27
>> Subject: Re: [eigen] Implementation of M.transpose()*M & M*M.transpose()
>>
>>
>> 2010/2/20 Jinglei Hu <jingleihu@xxxxxxxxx>
>>>
>>> Hi,
>>>
>>> I'm curious how M.transpose()*M and M*M.transpose() are implemented in
>>> Eigen using expression templates. Or is it just the same as the usual
>>> multiplication of two matrices?
>>
>> Currently A = M.transpose()*M involves a full matrix multiplication. If
>> you want to compute only one half, with the devel branch you can do:
>>
>> A.setZero();
>> A.selfadjointView<Upper>().rankUpdate(M);  // A += M.transpose()*M
>>
>> and you can compute A * B as follows:
>>
>> A.selfadjointView<Upper>() * B
>>
>> This is the preferred way if you only need one half (in most cases it is
>> enough). If you really need the full result, then first check that you
>> really, really need it ;) OK, so if you really, really need it, then do:
>> A.triangularView<Lower>() = A.transpose();
>>
>> Actually, it is quite easy to detect this case in the matrix product, so
>> in the future you can expect A = M.transpose()*M to be optimized
>> automatically, as is done in ATLAS for instance.
>>
>>
>> gael
>>
>>>
>>> The fact that the resulting matrix is symmetric can be used to save
>>> computation cost, especially for large matrices. For example, the Hessian
>>> matrix, often used in nonlinear optimization/minimization, is actually
>>> obtained as Jacobian.transpose() * Jacobian. It is trivial to write an
>>> efficient function to do this by exploiting the symmetry.
>>>
>>
>> not that trivial when you consider cache performance as we do... have a
>> look at the file Eigen/src/Core/products/SelfadjointProduct.h.
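>>
>> To give a flavour of what "taking care of the caches" means, here is a
>> rough sketch of a blocked rank-k update (a toy illustration only, not
>> Eigen's actual kernel, which additionally packs operands into contiguous
>> buffers and vectorizes the inner loop):
>>
>> #include <algorithm>
>>
>> // Toy sketch: A += M^T * M on the upper triangle, with naive cache
>> // blocking. M is m x n and A is n x n, both row-major flat arrays.
>> void syrk_blocked(const double* M, double* A, int m, int n) {
>>   const int B = 64;  // block size, roughly tuned to the cache
>>   for (int ib = 0; ib < n; ib += B)
>>     for (int jb = ib; jb < n; jb += B)         // upper-triangular blocks
>>       for (int kb = 0; kb < m; kb += B)
>>         for (int i = ib; i < std::min(ib + B, n); ++i)
>>           for (int j = std::max(jb, i); j < std::min(jb + B, n); ++j) {
>>             double sum = 0.0;
>>             for (int k = kb; k < std::min(kb + B, m); ++k)
>>               sum += M[k * n + i] * M[k * n + j];  // (M^T M)(i,j)
>>             A[i * n + j] += sum;
>>           }
>> }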
>>
>>>
>>> But anyway, it is less elegant in the sense that the code does not look
>>> nice, e.g. Matrix hessian = MT_Mult_M(Jacobian).
>>>
>>> Any idea how to do this efficiently while sticking to expression
>>> templates? A different but somewhat similar case is the multiplication
>>> of a normal matrix by a diagonal matrix.
>>
>> this one is already optimized.
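>>
>> For instance (a small illustrative snippet with made-up sizes):
>>
>> #include <Eigen/Dense>
>> using namespace Eigen;
>>
>> int main() {
>>   MatrixXd M = MatrixXd::Random(4, 4);
>>   VectorXd d = VectorXd::Random(4);
>>   MatrixXd P = M * d.asDiagonal();  // column-wise scaling; no dense
>>                                     // diagonal matrix is ever formed
>>   MatrixXd Q = d.asDiagonal() * M;  // row-wise scaling
>>   return 0;
>> }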
>>
>>>
>>>
>>> Cheers,
>>> Jinglei Hu
>
>


