Re: [eigen] Some trouble with triangularView

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Hi,

since such covariance computation is a very common task, I wrote a
small benchmark (see attached file) which compares 5 different methods
for various problem sizes. Here are the results:

f 2 2 20 => 0.0817518s , 0.113241s , 0.011873s , 1.03113s , 0.010354s
f 3 3 20 => 0.0501951s , 0.0669986s , 0.0116151s , 0.497608s , 0.00898168s
f 4 4 20 => 0.0236093s , 0.0331182s , 0.0138215s , 0.267377s , 0.00719802s
f -1 16 20 => 0.00588922s , 0.00762224s , 0.0155039s , 0.0437174s , 0.0380409s
f -1 128 20 => 0.00214637s , 0.00170324s , 0.00736009s , 0.0079626s , 0.0219131s
f 2 2 200 => 0.0456701s , 0.0557956s , 0.0118452s , 1.03082s , 0.0100666s
f 3 3 200 => 0.0349574s , 0.0413799s , 0.0113998s , 0.496733s , 0.00871355s
f 4 4 200 => 0.0144712s , 0.0188109s , 0.0134291s , 0.267343s , 0.00712597s
f -1 16 200 => 0.00474368s , 0.00555303s , 0.0162087s , 0.0427719s , 0.0369833s
f -1 128 200 => 0.00207239s , 0.00158966s , 0.00727162s , 0.00787104s
, 0.0219088s
f 2 2 2000 => 0.0437734s , 0.05367s , 0.011591s , 1.03008s , 0.00983182s
f 3 3 2000 => 0.0341709s , 0.0404823s , 0.0112701s , 0.49679s , 0.00871608s
f 4 4 2000 => 0.0151509s , 0.0194259s , 0.0133588s , 0.26493s , 0.00705146s
f -1 16 2000 => 0.00467992s , 0.00575335s , 0.0149987s , 0.0415939s , 0.0368721s
f -1 128 2000 => 0.00606985s , 0.00446139s , 0.024506s , 0.0265974s , 0.0735676s

f means float, then we have the covariance matrix size at compile time
(-1==Dynamic), then the actual covariance size, then the number of
input "points" and the times in second for each of the 5 methods.

For small fixed sizes, the best method seems to consistently be the
one looping through the points using the triangularView() syntax:

cov.setZero();
  for(int i=0; i<data.cols(); ++i)
  {
    Matrix<Scalar,Size,1> tmp = data.col(i) - data.col(0);
    cov.template triangularView<Lower>() += tmp * tmp.adjoint();
  }

while for large enough problems it is better to assemble the data
matrix into a temporary and then do cov = D * D^T at once as a rank
update operation:

cov.setZero();
cov.template selfadjointView<Lower>().rankUpdate( (data.colwise() -
data.col(0)).eval() );

currently the selfadjointView::rankUpdate function is tailored for
large matrices for the right hand side,but in the future it should be
the preferred method for all situations.

I used gcc 4.5.


Finally note that you usually don't need to compute the full matrix if
you use algorithms specialized for selfadjoint matrices (i.e.,
Cholesky, aka LLT, or SelfAdjointEigenSolver).
gael


On Fri, Sep 3, 2010 at 5:23 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:
> 2010/9/3 Eamon Nerbonne <eamon.nerbonne@xxxxxxxxx>:
>> To speed up covariance computations, I tried to use triangularView to avoid
>> computing unnecessary coefficients.  However, that didn't work as expected.
>>
>> To start with, there seems to be a g++ bug that prevents compiling the
>> following template:
>>
>> #include <Eigen/Core>
>> using namespace Eigen;
>> template <typename TPoints>
>>     inline static Matrix<typename
>> TPoints::Scalar,TPoints::RowsAtCompileTime,TPoints::RowsAtCompileTime>
>>     CovarianceX(Eigen::MatrixBase<TPoints>const & points) {
>>         typedef typename TPoints::Scalar Scalar;
>>         typedef Eigen::Matrix<Scalar,TPoints::RowsAtCompileTime,1> TPoint;
>>         typedef Matrix<typename
>> TPoints::Scalar,TPoints::RowsAtCompileTime,TPoints::RowsAtCompileTime>
>> TMatrix;
>>         TPoint mean = points.rowwise().sum() * (1.0/points.cols());
>>         TPoint diff = TPoint::Zero(points.rows());
>>         TMatrix cov = TMatrix::Zero(points.rows(),points.rows());
>>
>>         for(int i=0;i<points.cols();++i) {
>>             diff.noalias() = points.col(i) - mean;
>>             cov.triangularView<Eigen::Upper>() += diff * diff.transpose();
>> //line A
>>         }
>>
>>         cov.triangularView<Eigen::StrictlyLower>() = cov.transpose(); //line
>> B
>>         return cov * (1.0/(points.cols()-1..0));
>>     }
>>
>> int main(){return 0;}
>>
>> In particular, on lines A&B the compiler complains of "expected
>> primary-expression before ')' token".  If the function isn't templated it
>> compiles (and works).  In MSC, it works.  It looks like specifying a
>> template argument here breaks something - anyone know of a workaround?  I
>> tried using a templated class with a static function, but that fails
>> equally.
>
> This is just the usual c++ syntax issue, argh, we've answered this
> question dozens of times, I wonder where we could document it once and
> for all so users find it.
>
> You have to add a template keyword:
>
> cov.template triangularView<Eigen::Upper>() += ...
>
> GCC is right to reject this code without template.
>
>
>>
>> Now, even when this works, it's a lot slower than without triangularView,
>> which is odd.  At first I thought that's because I can't mix .noalias and
>> triangularView, but as far as I can tell, line A is using a lazy product
>> despite possible aliasing (isn't that a bug?) so the perf degredation may
>> simply be due to the greater code complexity.
>
> no idea... sorry. (no time to investigate now).
>
> Benoit
>
>>
>> --eamon@xxxxxxxxxxxx - Tel#:+31-6-15142163
>>
>
>
>
#include <bench/BenchTimer.h>
#include <iostream>
#include <typeinfo>
#include <Eigen/Eigen>
using namespace Eigen;

// #ifndef SCALAR
// #define SCALAR float
// #endif

// typedef SCALAR Scalar;

// static const int Size;
// typedef Matrix<Scalar,Size,Dynamic,RowMajor> Data;

template<typename Scalar,int Size, int StorageOrder>
EIGEN_DONT_INLINE void cov1(const Matrix<Scalar,Size,Dynamic,StorageOrder>& data, Matrix<Scalar,Size,Size,StorageOrder>& cov)
{
  Matrix<Scalar,Size,Dynamic,StorageOrder> tmp = data.colwise() - data.col(0);
  cov.noalias() = tmp * tmp.adjoint();
}

template<typename Scalar,int Size, int StorageOrder>
EIGEN_DONT_INLINE void cov2(const Matrix<Scalar,Size,Dynamic,StorageOrder>& data, Matrix<Scalar,Size,Size,StorageOrder>& cov)
{
  cov.setZero();
  Matrix<Scalar,Size,Dynamic,StorageOrder> tmp = data.colwise() - data.col(0);
  cov.template selfadjointView<Lower>().rankUpdate(data);
}

template<typename Scalar,int Size, int StorageOrder>
EIGEN_DONT_INLINE void cov3(const Matrix<Scalar,Size,Dynamic,StorageOrder>& data, Matrix<Scalar,Size,Size,StorageOrder>& cov)
{
  cov.setZero();
  for(int i=0; i<data.cols(); ++i)
  {
    Matrix<Scalar,Size,1> tmp = data.col(i) - data.col(0);
    cov.noalias() += tmp * tmp.adjoint();
  }
}

template<typename Scalar,int Size, int StorageOrder>
EIGEN_DONT_INLINE void cov4(const Matrix<Scalar,Size,Dynamic,StorageOrder>& data, Matrix<Scalar,Size,Size,StorageOrder>& cov)
{
  cov.setZero();
  for(int i=0; i<data.cols(); ++i)
  {
    Matrix<Scalar,Size,1> tmp = data.col(i) - data.col(0);
    cov.template selfadjointView<Lower>().rankUpdate(tmp);
  }
}

template<typename Scalar,int Size, int StorageOrder>
EIGEN_DONT_INLINE void cov5(const Matrix<Scalar,Size,Dynamic,StorageOrder>& data, Matrix<Scalar,Size,Size,StorageOrder>& cov)
{
  cov.setZero();
  for(int i=0; i<data.cols(); ++i)
  {
    Matrix<Scalar,Size,1> tmp = data.col(i) - data.col(0);
    cov.template triangularView<Lower>() += tmp * tmp.adjoint();
  }
}



template<typename Scalar,int Size, int StorageOrder>
void benchcov(int n, int size = Size)
{
  Matrix<Scalar,Size,Dynamic,StorageOrder> data(size,n);
  data.setRandom();
  Matrix<Scalar,Size,Size,StorageOrder> cov(size,size);
  cov.setZero();

  int tries = 4;
  int repeats = 10000000/(double(n)*size*size);
  repeats = repeats==0 ? 1 : repeats;

  BenchTimer t1, t2, t3, t4, t5;
  BENCH(t1, tries, repeats, cov1(data,cov));
  BENCH(t2, tries, repeats, cov2(data,cov));
  BENCH(t3, tries, repeats, cov3(data,cov));
  BENCH(t4, tries, repeats, cov4(data,cov));
  BENCH(t5, tries, repeats, cov5(data,cov));
  
  std::cout << typeid(Scalar).name() << " " << Size << " " << size << " " << n
            << " => " << t1.best()
            << "s , " << t2.best()
            << "s , " << t3.best()
            << "s , " << t4.best()
            << "s , " << t5.best()
            << "s\n";
}

int main()
{
  int n = 20;
  benchcov<float,2,ColMajor>(n);
  benchcov<float,3,ColMajor>(n);
  benchcov<float,4,ColMajor>(n);
  benchcov<float,Dynamic,ColMajor>(n,16);
  benchcov<float,Dynamic,ColMajor>(n,128);
  n = 200;
  benchcov<float,2,ColMajor>(n);
  benchcov<float,3,ColMajor>(n);
  benchcov<float,4,ColMajor>(n);
  benchcov<float,Dynamic,ColMajor>(n,16);
  benchcov<float,Dynamic,ColMajor>(n,128);
  n = 2000;
  benchcov<float,2,ColMajor>(n);
  benchcov<float,3,ColMajor>(n);
  benchcov<float,4,ColMajor>(n);
  benchcov<float,Dynamic,ColMajor>(n,16);
  benchcov<float,Dynamic,ColMajor>(n,128);
  return 0;
}


Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/