Re: [eigen] RcppEigen, providing bindings from R to Eigen through Rcpp, has been released |
[ Thread Index |
Date Index
| More lists.tuxfamily.org/eigen Archives
]
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] RcppEigen, providing bindings from R to Eigen through Rcpp, has been released
- From: Douglas Bates <bates@xxxxxxxxxxxxx>
- Date: Tue, 28 Jun 2011 16:07:07 -0500
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type :content-transfer-encoding; bh=XpQQ3EOhSkD4Ud4OsnKzJnW0De4sc+mehSPG3lCclYY=; b=WAfAXoYtAREgONXV0tWB8muj7IRfPQoUcqX/0qrrkkRbgir/Yp0F+MnS8YrIzu0S9/ WyzlUm16wj8K56dWLkjwulP3cCU8PB8RZHOlkG83hvUHtrEWLiXKutU1wPWYHDKWa4x7 lWEWJAy7742hvXZt+1JhsrIYmwL3rLbTC/5OE=
Thank you. This will be incorporated in the next release.
On Tue, Jun 28, 2011 at 3:51 PM, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:
> Hi,
>
> thanks for sharing this.
>
> After a very very brief look at the source code I noticed:
>
> VectorXd D = UDV.singularValues();
> m_r = std::count_if(D.data(), D.data() + m_p,
> std::bind2nd(std::greater<double>(), threshold() * D[0]));
>
> which can be written:
>
> m_r = (UDV.singularValues().array() > threshold() * D[0]).count();
>
> gael
>
> On Tue, Jun 28, 2011 at 7:33 PM, Douglas Bates <bates@xxxxxxxxxxxxx> wrote:
>> An initial release of the R package RcppEigen, which provides bindings
>> to Eigen from R using the Rcpp template package for R objects in C++,
>> is now available on CRAN, the Comprehensive R Archive Network. See
>> http://cran.r-project.org/web/packages/RcppEigen or the equivalent
>> location on one of the mirror sites such as
>> http://cran.us.r-project.org/
>>
>> R is a language and system for statistical analysis and graphics. See
>> http://www.R-project.org or http://en.wikipedia.org/wiki/R_language
>> for more details. It is a interpreted, functional programming
>> language based on self-describing objects, and provides for extension
>> through user-contributed R packages. The Rcpp package provides
>> templated C++ classes representing the underlying R object types. The
>> RcppEigen package provides linkage to Eigen classes through the Rcpp
>> classes. Several examples of using Eigen classes to fit a linear
>> model and provide the standard errors of the coefficient estimates are
>> given in the source package file RcppEigen/src/fastLm.cpp. The source
>> package is provided as a .tar.gz file on the CRAN site mentioned
>> above. Comments on and enhancements to this example code are welcome.
>> I'm still a relative neophyte in Eigen.
>>
>> There is a simple benchmark of these methods that can be run in the
>> following way.
>>
>> 1. Install R
>> 2. Install the RcppEigen package by starting R and entering
>>
>> install.packages("RcppEigen")
>>
>> This will also install the Rcpp package.
>>
>> 3. In the current R session or in any other R session enter
>>
>> library(RcppEigen) # attaches the RcppEigen package
>> source(system.file("examples", "lmBenchmark.R", package="RcppEigen"), echo=TRUE)
>>
>> This runs 20 replications of fitting a linear model with model
>> matrix of size 100000 by 40. Other problem sizes and numbers of
>> replications can be specified with the optional arguments n, p and
>> nrep to the function do_bench(). If the RcppArmadillo and/or the
>> RcppGSL packages are available methods based on them are also used.
>> On my desktop (Athlon II X4 2.9 GHz, 8 GB PC1333 memory,
>> single-threaded atlas compiled on this machine) the results are
>>
>>> do_bench()
>> lm benchmark for n = 100000 and p = 40: nrep = 20
>> test relative elapsed user.self sys.self
>> 7 LLt 1.000000 1.132 1.14 0.00
>> 3 LDLt 1.019435 1.154 1.16 0.00
>> 5 SymmEig 2.932862 3.320 2.56 0.75
>> 2 PivQR 7.891343 8.933 8.26 0.67
>> 8 arma 7.897527 8.940 8.64 0.02
>> 6 QR 8.679329 9.825 9.07 0.74
>> 1 lm.fit 13.082155 14.809 13.38 1.40
>> 4 SVD 64.440813 72.947 71.86 1.01
>> 9 GSL 156.208481 176.828 175.86 0.80
>>
>> The lm.fit timing is from the native R implementation based on a
>> modified QR decomposition from Linpack (hence using only level-1
>> BLAS). The reason for using Linpack is because we want a specific
>> column pivoting scheme, which is to retain the original column order
>> except in the case or apparent singularity when the current pivot
>> column is moved to the right hand side via a cyclic shift.
>>
>> Strictly speaking the lm.fit results should only be compared to
>> rank-revealing decompositions of which, at present, only the PivQR
>> method based on the colPivHouseholderQr method is a rank-revealing
>> implementation. However, the SymmEig and SVD methods can and will be
>> adapted. I think that the LDLt method can also be adapted even though
>> the pivoting scheme doesn't guarantee decreasing magnitudes of the
>> elements of the diagonal vector D.
>>
>> The RcppGSL implementation uses the gsl_multifit_linear function that
>> is based on a singular value decomposition. The RcppArmadillo
>> implementation uses the arma::solve function for overdetermined
>> systems. It is based on the Lapack subroutine dgels.
>>
>> On another machine using the reference BLAS I get
>>
>> lm benchmark for n = 100000 and p = 40: nrep = 20
>> test relative elapsed user.self sys.self
>> 3 LDLt 1.000000 1.593 1.591 0.000
>> 7 LLt 1.001255 1.595 1.592 0.001
>> 5 SymmEig 2.787822 4.441 3.674 0.758
>> 6 QR 6.760829 10.770 10.010 0.752
>> 2 PivQR 6.914626 11.015 10.225 0.777
>> 1 lm.fit 9.943503 15.840 14.288 1.537
>> 8 arma 13.971751 22.257 22.202 0.010
>> 4 SVD 41.958569 66.840 65.609 1.158
>> 9 GSL 162.653484 259.107 257.944 0.932
>>
>> Bumping up the size to 1 000 000 by 100 I get, on the machine with atlas,
>>
>>> do_bench(1000000, 100, 10)
>> lm benchmark for n = 1e+06 and p = 100: nrep = 10
>> test relative elapsed user.self sys.self
>> 3 LDLt 1.000000 25.605 25.49 0.10
>> 7 LLt 1.047491 26.821 26.70 0.10
>> 5 SymmEig 1.882054 48.190 66.61 10.05
>> 6 QR 7.443937 190.602 181.92 8.56
>> 8 arma 8.933138 228.733 224.29 4.22
>> 2 PivQR 8.981644 229.975 221.70 8.11
>> 1 lm.fit 12.003163 307.341 272.16 34.91
>> 4 SVD 78.411873 2007.736 2012.31 21.72
>> 9 GSL 210.777465 5396.957 5382.61 10.36
>>
>> and on the machine with the reference BLAS
>>
>>
>>> do_bench(1000000L, 100L, 20L, TRUE)
>> lm benchmark for n = 1000000 and p = 100: nrep = 20
>> test relative elapsed user.self sys.self
>> 3 LDLt 1.000000 69.297 69.096 0.174
>> 6 LLt 1.039295 72.020 71.750 0.203
>> 4 SymmEig 2.828102 195.979 176.954 18.961
>> 5 QR 8.990534 623.017 603.931 18.530
>> 2 PivQR 12.731807 882.276 863.076 18.474
>> 1 lm.fit 17.585884 1218.649 1133.541 84.174
>> 7 arma 30.593186 2120.016 2108.481 10.087
>>
>>
>>
>
>
>