Re: [eigen] overriding scalar_X_op for auto-dif?
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] overriding scalar_X_op for auto-dif?
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Thu, 19 May 2011 09:24:32 +0200
Hi,
my first question is: do you really have to overload the matrix
operators? I don't know much about reverse-mode autodiff, but with
forward mode we can simply work at the level of scalars. Of course,
for some high-level operations such as matrix inversion or eigenvalue
decomposition it does make sense to overload at the matrix level, but
for additions or products I'm not sure it is needed.
For mixed scalar type products, you have to specialize
scalar_product_traits, e.g.:

  template<> struct scalar_product_traits<Var,double> {
    typedef XXXX ReturnType;
  };

and provide an operator* between Var and double.
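
To make that concrete, here is a minimal self-contained sketch using
a toy forward-mode Var (a value plus a single derivative). Your real
Var will of course look different; note that in Eigen 3.0
scalar_product_traits lives in namespace Eigen::internal:

  #include <Eigen/Core>

  // toy forward-mode scalar: a value and one derivative
  struct Var {
    double val, der;
    Var(double v = 0, double d = 0) : val(v), der(d) {}
  };

  // mixed-scalar products at the scalar level
  inline Var operator*(const Var& a, double b) { return Var(a.val * b, a.der * b); }
  inline Var operator*(double a, const Var& b) { return Var(a * b.val, a * b.der); }

  namespace Eigen { namespace internal {
    // Var * double and double * Var both yield a Var
    template<> struct scalar_product_traits<Var,double> { typedef Var ReturnType; };
    template<> struct scalar_product_traits<double,Var> { typedef Var ReturnType; };
  }}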
Now, if you really want to specialize scalar_product_op then you can do:

  template<> struct scalar_product_op<Var,double> {
    typedef XXX result_type;
    inline const result_type operator() (const Var& a, const double& b) const {
      return a * b;
    }
  };

but if you only change the result_type, then specializing
scalar_product_traits is enough.
Regarding the specialization of sqrt and the like, you can specialize
our template XXX_impl<scalar_type> structure, where, e.g., XXX is sqrt:

  template<>
  struct sqrt_impl<Var>
  {
    static inline const Var run(const Var& x) {
      return mysqrt(x);
    }
  };
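
If your sqrt has to return a different type than its argument, there
is also a companion XXX_retval trait you would specialize in the same
way (I am going from memory on the 3.0 layout of MathFunctions.h):

  template<>
  struct sqrt_retval<Var>
  {
    typedef Var type;
  };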
You might have a look at unsupported/Eigen/src/Core/AutoDiffScalar.h
for some real-world examples. Note that this AutoDiff module works in
forward mode and leverages expression templates, hence the relative
complexity of the code. I plan to slightly redesign it for better
performance with sparse derivatives.
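
For reference, here is a minimal example of driving that module; I am
assuming the unsupported/Eigen/AutoDiff header and the
value()/derivatives() accessors, so adapt to the actual sources:

  #include <unsupported/Eigen/AutoDiff>
  #include <iostream>

  int main() {
    // scalar carrying a value and a 2-vector of partial derivatives
    typedef Eigen::AutoDiffScalar<Eigen::Vector2d> AD;

    AD x(1.0, 2, 0);  // value 1.0, seeded as derivative 0 of 2
    AD y(2.0, 2, 1);  // value 2.0, seeded as derivative 1 of 2

    AD f = x * y + sin(x);  // f(x,y) = x*y + sin(x), sin found via ADL

    std::cout << "f  = " << f.value() << "\n";
    // expect [y + cos(x), x] = [2 + cos(1), 1]
    std::cout << "df = " << f.derivatives().transpose() << "\n";
  }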
I'm very interested in your reverse-mode autodiff work and I'd be
glad to give you more specific answers to your issues, but to that
end it would be much easier if you could share your code.
gael
On Thu, May 19, 2011 at 12:51 AM, Bob Carpenter <carp@xxxxxxxxxxx> wrote:
> We (Matt Hoffman, Michael Malecki, Andrew Gelman and I,
> in the statistics department at Columbia Uni.) have been
> integrating Eigen and our own reverse-mode
> automatic differentiation and have run into a bit
> of a puzzler. It might be solved if I could
> provide a template override for
>
> Eigen::internal::scalar_product_op(T1 a, T2 b);
>
> and its brethren and have this work out properly
> with higher-level calls to matrix ops. For instance,
> T1 might be double and T2 an auto-dif variable, and
> I'd want to be able to perform matrix multiplications
> when one matrix held doubles and the other auto-dif vars.
>
> Here's the longer story to provide some context.
>
> With only auto-dif vars in the matrices, Eigen 3
> actually runs pretty much out of the box.
>
> I did wind up modifying the Eigen code itself to remove
> qualified references such as std::sqrt(), because I couldn't
> override them otherwise: I'm overloading sqrt() for our type,
> but I can't extend std (if I understand C++ properly -- I've
> only been at it since January).
>
> I've unit tested all the arithmetic ops, determinants, inverses,
> eigenvalues, and the LU and LLT decompositions, and all the
> gradients and values pass. (I'm going to send some comments on
> your doc for NumTraits in a separate message.)
>
> Where I'm running into problems is when I want to
> mix double values and auto-dif vars. For convenience,
> let var be our auto-dif var. We override arithmetic
> for it in the usual way, such as:
>
> var operator+(var a, double b);
>
> For a first pass, I did the same thing with Eigen matrices
> (and vectors):
>
> Matrix<var,Dynamic,Dynamic>
> operator*(const Matrix<var,Dynamic,Dynamic>& a,
>           const Matrix<double,Dynamic,Dynamic>& b);
>
> This also works fine. My implementation for now just
> promotes the double values to auto-dif variables and
> then leans on Eigen's templating. This isn't the most
> efficient way to do things, but I can optimize later
> with the right interface in place by propagating my
> own derivatives for many of the matrix ops.
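>
> A minimal sketch of that promotion (assuming var is constructible
> from double, so that cast<var>() compiles):
>
> Matrix<var,Dynamic,Dynamic>
> operator*(const Matrix<var,Dynamic,Dynamic>& a,
>           const Matrix<double,Dynamic,Dynamic>& b) {
>   // promote b's entries to var, then reuse Eigen's var*var product
>   return a * b.cast<var>();
> }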
>
> Perhaps not surprisingly for the Eigen developers, I run
> into problems with transpose:
>
> Matrix<var,Dynamic,Dynamic> a = ...;
> Matrix<double,Dynamic,Dynamic> b = ...;
> Matrix<var,Dynamic,Dynamic> c = a.transpose() * b;
>
> The problem is neatly summarized in the last two error
> messages from the compiler (gcc 4.2 on Mac OSX
> for what it's worth):
>
> ----------------------
> ...
> lib/Eigen/src/Core/Product.h:590: instantiated from ‘const typename
> Eigen::ProductReturnType<Derived, OtherDerived,
> Eigen::internal::product_type<Derived,OtherDerived>::value>::Type
> Eigen::MatrixBase<Derived>::operator*(const
> Eigen::MatrixBase<OtherDerived>&) const [with OtherDerived =
> Eigen::Matrix<stan::agrad::var, -0x00000000000000001, 1, 0,
> -0x00000000000000001, 1>, Derived = Eigen::Transpose<Eigen::Matrix<double,
> -0x00000000000000001, 1, 0, -0x00000000000000001, 1> >]’
>
> src/test/agrad/agrad_eigen_test.cpp:178: instantiated from here
> lib/Eigen/src/Core/CwiseBinaryOp.h:188: error: no match for call to ‘(const
> Eigen::internal::scalar_product_op<double, stan::agrad::var>) (const double,
> const stan::agrad::var&)’
> ---------------------
>
> Eigen's use of expression templates and views introduces new
> classes, and hence new operations, such as multiplying a
> transposed double matrix by an auto-dif variable matrix. These
> I haven't overridden.
>
> What I was wondering is whether I could get away with overriding
> the scalar_product_op() and related functions instead of the
> matrix operations. This would be ideal from both an
> efficiency and simplicity point of view.
>
> If it makes matters easier, we're happy to restrict ourselves
> to dynamic dimensions and matrices containing either auto-dif
> variables or double-precision floating point (but not both).
>
> - Bob Carpenter
> Columbia University, Dept. of Statistics
>
> PS: Our project is an implementation of Hamiltonian Monte
> Carlo sampling from unnormalized probability densities.
> It overcomes issues with correlated parameters nicely,
> but requires gradient calcs. In our case, these are
> typically Bayesian posteriors from multilevel regression
> models.
>
> PPS: Thanks for the LGPL license! We're planning on a BSD license
> and will release our code as soon as we have a working and
> tested first version (the project has two years of funding,
> but we hope to release a first version soon, as in a month
> or two). I'm happy to share real code if that'd help.
>
> PPPS: The auto-dif is standalone (in the no-dependencies sense),
> and we'd be happy to contribute whatever integration work we do
> to your experimental pages.
>