Re: [eigen] RcppEigen, providing bindings from R to Eigen through Rcpp, has been released

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Thank you.  This will be incorporated in the next release.

On Tue, Jun 28, 2011 at 3:51 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> Hi,
>
> thanks for sharing this.
>
> After a very very brief look at the source code I noticed:
>
> VectorXd   D = UDV.singularValues();
> m_r          = std::count_if(D.data(), D.data() + m_p,
>                     std::bind2nd(std::greater<double>(), threshold() * D[0]));
>
> which can be written:
>
> m_r = (UDV.singularValues().array() > threshold() * D[0]).count();
>
> gael
>
> On Tue, Jun 28, 2011 at 7:33 PM, Douglas Bates <bates@xxxxxxxxxxxxx> wrote:
>> An initial release of the R package RcppEigen, which provides bindings
>> to Eigen from R using the Rcpp template package for R objects in C++,
>> is now available on CRAN, the Comprehensive R Archive Network.  See
>> http://cran.r-project.org/web/packages/RcppEigen or the equivalent
>> location on one of the mirror sites such as
>> http://cran.us.r-project.org/
>>
>> R is a language and system for statistical analysis and graphics.  See
>> http://www.R-project.org or http://en.wikipedia.org/wiki/R_language
>> for more details.  It is a interpreted, functional programming
>> language based on self-describing objects, and provides for extension
>> through user-contributed R packages.  The Rcpp package provides
>> templated C++ classes representing the underlying R object types.  The
>> RcppEigen package provides linkage to Eigen classes through the Rcpp
>> classes.  Several examples of using Eigen classes to fit a linear
>> model and provide the standard errors of the coefficient estimates are
>> given in the source package file RcppEigen/src/fastLm.cpp.  The source
>> package is provided as a .tar.gz file on the CRAN site mentioned
>> above.  Comments on and enhancements to this example code are welcome.
>>  I'm still a relative neophyte in Eigen.
>>
>> There is a simple benchmark of these methods that can be run in the
>> following way.
>>
>> 1. Install R
>> 2. Install the RcppEigen package by starting R and entering
>>
>> install.packages("RcppEigen")
>>
>>   This will also install the Rcpp package.
>>
>> 3. In the current R session or in any other R session enter
>>
>> library(RcppEigen)  # attaches the RcppEigen package
>> source(system.file("examples", "lmBenchmark.R", package="RcppEigen"), echo=TRUE)
>>
>>   This runs 20 replications of fitting a linear model with model
>> matrix of size 100000 by 40.  Other problem sizes and numbers of
>> replications can be specified with the optional arguments n, p and
>> nrep to the function do_bench().  If the RcppArmadillo and/or the
>> RcppGSL packages are available methods based on them are also used.
>> On my desktop (Athlon II X4 2.9 GHz, 8 GB PC1333 memory,
>> single-threaded atlas compiled on this machine) the results are
>>
>>> do_bench()
>> lm benchmark for n = 100000 and p = 40: nrep = 20
>>     test   relative elapsed user.self sys.self
>> 7     LLt   1.000000   1.132      1.14     0.00
>> 3    LDLt   1.019435   1.154      1.16     0.00
>> 5 SymmEig   2.932862   3.320      2.56     0.75
>> 2   PivQR   7.891343   8.933      8.26     0.67
>> 8    arma   7.897527   8.940      8.64     0.02
>> 6      QR   8.679329   9.825      9.07     0.74
>> 1  lm.fit  13.082155  14.809     13.38     1.40
>> 4     SVD  64.440813  72.947     71.86     1.01
>> 9     GSL 156.208481 176.828    175.86     0.80
>>
>> The lm.fit timing is from the native R implementation based on a
>> modified QR decomposition from Linpack (hence using only level-1
>> BLAS).  The reason for using Linpack is because we want a specific
>> column pivoting scheme, which is to retain the original column order
>> except in the case or apparent singularity when the current pivot
>> column is moved to the right hand side via a cyclic shift.
>>
>> Strictly speaking the lm.fit results should only be compared to
>> rank-revealing decompositions of which, at present, only the PivQR
>> method based on the colPivHouseholderQr method is a rank-revealing
>> implementation. However, the SymmEig and SVD methods can and will be
>> adapted.  I think that the LDLt method can also be adapted even though
>> the pivoting scheme doesn't guarantee decreasing magnitudes of the
>> elements of the diagonal vector D.
>>
>> The RcppGSL implementation uses the gsl_multifit_linear function that
>> is based on a singular value decomposition.  The RcppArmadillo
>> implementation uses the arma::solve function for overdetermined
>> systems.  It is based on the Lapack subroutine dgels.
>>
>> On another machine using the reference BLAS I get
>>
>> lm benchmark for n = 100000 and p = 40: nrep = 20
>>     test   relative elapsed user.self sys.self
>> 3    LDLt   1.000000   1.593     1.591    0.000
>> 7     LLt   1.001255   1.595     1.592    0.001
>> 5 SymmEig   2.787822   4.441     3.674    0.758
>> 6      QR   6.760829  10.770    10.010    0.752
>> 2   PivQR   6.914626  11.015    10.225    0.777
>> 1  lm.fit   9.943503  15.840    14.288    1.537
>> 8    arma  13.971751  22.257    22.202    0.010
>> 4     SVD  41.958569  66.840    65.609    1.158
>> 9     GSL 162.653484 259.107   257.944    0.932
>>
>> Bumping up the size to 1 000 000 by 100 I get, on the machine with atlas,
>>
>>> do_bench(1000000, 100, 10)
>> lm benchmark for n = 1e+06 and p = 100: nrep = 10
>>     test   relative  elapsed user.self sys.self
>> 3    LDLt   1.000000   25.605     25.49     0.10
>> 7     LLt   1.047491   26.821     26.70     0.10
>> 5 SymmEig   1.882054   48.190     66.61    10.05
>> 6      QR   7.443937  190.602    181.92     8.56
>> 8    arma   8.933138  228.733    224.29     4.22
>> 2   PivQR   8.981644  229.975    221.70     8.11
>> 1  lm.fit  12.003163  307.341    272.16    34.91
>> 4     SVD  78.411873 2007.736   2012.31    21.72
>> 9     GSL 210.777465 5396.957   5382.61    10.36
>>
>> and on the machine with the reference BLAS
>>
>>
>>> do_bench(1000000L, 100L, 20L, TRUE)
>> lm benchmark for n = 1000000 and p = 100: nrep = 20
>>     test  relative  elapsed user.self sys.self
>> 3    LDLt  1.000000   69.297    69.096    0.174
>> 6     LLt  1.039295   72.020    71.750    0.203
>> 4 SymmEig  2.828102  195.979   176.954   18.961
>> 5      QR  8.990534  623.017   603.931   18.530
>> 2   PivQR 12.731807  882.276   863.076   18.474
>> 1  lm.fit 17.585884 1218.649  1133.541   84.174
>> 7    arma 30.593186 2120.016  2108.481   10.087
>>
>>
>>
>
>
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/