Re: [eigen] When to mark EIGEN_DEVICE_FUNC


On Tue, Jan 13, 2015 at 6:12 PM, Chen-Pang He <jdh8@xxxxxxxxxxxxxx> wrote:
By "dynamic memory allocation", do you mean something like
dynamic matrix multiplication which produces temp?

I mean functions that will hit a malloc once compiled. Matrix products are safe as long as the sizes are known at compile time. We already took care of bypassing the heavy matrix product implementation code when CUDA is enabled.

By the way, can EIGEN_DEVICE_FUNC functions call
non-EIGEN_DEVICE_FUNC functions?  (I guess no.)

Indeed, if you do so, then the result will be undefined. Sometimes, the compiler (nvcc) does not even complain! Very strange.
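
To make the calling rule concrete, here is a minimal sketch that does not use Eigen at all. MY_DEVICE_FUNC is a hypothetical stand-in for EIGEN_DEVICE_FUNC, not Eigen's actual definition: it assumes the macro expands to __host__ __device__ under nvcc and to nothing otherwise. mul2x2 and trace_of_product are made-up names. The point is only that the annotated caller calls nothing but annotated, allocation-free functions.

```cpp
// Hypothetical stand-in for EIGEN_DEVICE_FUNC (an assumption, not Eigen's code):
// under nvcc the function is compiled for both host and device,
// under a plain C++ compiler the macro expands to nothing.
#if defined(__CUDACC__)
#define MY_DEVICE_FUNC __host__ __device__
#else
#define MY_DEVICE_FUNC
#endif

// Fixed-size 2x2 product (row-major): sizes are known at compile time,
// so no dynamic memory allocation is involved.
MY_DEVICE_FUNC inline void mul2x2(const double* a, const double* b, double* out) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      out[i * 2 + j] = a[i * 2 + 0] * b[0 * 2 + j] + a[i * 2 + 1] * b[1 * 2 + j];
}

// Annotated caller: it only calls another annotated function and
// keeps its temporary on the stack rather than on the heap.
MY_DEVICE_FUNC inline double trace_of_product(const double* a, const double* b) {
  double tmp[4];
  mul2x2(a, b, tmp);
  return tmp[0] + tmp[3];
}
```

If mul2x2 lost its annotation, trace_of_product would be calling a host-only function from code that may run on the device, which is exactly the undefined situation described above.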

For example,

MatrixXd foo(const MatrixXd& A, const MatrixXd& B)
{
    return A * A.transpose() + B * B.transpose();
}

Can this function be marked as EIGEN_DEVICE_FUNC?

No, because by returning a MatrixXd you explicitly request a dynamic memory allocation. On the other hand, the following should be fine:

void foo(const Ref<const MatrixXd>& A, const Ref<const MatrixXd>& B)
{
  // A * A.transpose() is A.rows() x A.rows(), so the output map uses that shape.
  Map<MatrixXd>(data, A.rows(), A.rows()).noalias() = A * A.transpose();
  Map<MatrixXd>(data, A.rows(), A.rows()).noalias() += B * B.transpose();
}

where data references a preallocated device buffer, and foo will typically be called on Map<> objects or fixed-size matrices.
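
For readers who want to see the "write into a preallocated buffer" idea in isolation, here is a rough Eigen-free sketch. MatView and gram_sum are hypothetical names, not Eigen API: MatView plays the role of a Map<> (a non-owning view over storage the caller already allocated), and gram_sum writes A*A^T + B*B^T into that storage instead of returning a freshly allocated matrix.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning, row-major view standing in for Eigen's Map<>.
struct MatView {
  double* data;
  std::size_t rows, cols;
  double& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes out = A*A^T + B*B^T without allocating anything:
// 'out' must already reference an A.rows x A.rows buffer owned by the caller.
inline void gram_sum(const MatView& A, const MatView& B, const MatView& out) {
  assert(A.rows == B.rows && out.rows == A.rows && out.cols == A.rows);
  for (std::size_t i = 0; i < out.rows; ++i) {
    for (std::size_t j = 0; j < out.cols; ++j) {
      double acc = 0.0;
      for (std::size_t k = 0; k < A.cols; ++k) acc += A(i, k) * A(j, k);  // (A*A^T)(i,j)
      for (std::size_t k = 0; k < B.cols; ++k) acc += B(i, k) * B(j, k);  // (B*B^T)(i,j)
      out(i, j) = acc;
    }
  }
}
```

Because the function never owns or resizes memory, the decision about where the buffer lives (host, device, stack) stays with the caller.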

I hope that helps!


Thanks again,

On Tue, Jan 13, 2015 at 09:29:38, Gael Guennebaud wrote:
> Hi,
> the idea is to declare all functions/methods with EIGEN_DEVICE_FUNC except
> the ones that potentially lead to non CUDA compatible code. Typically, this
> includes dynamic memory allocation and paths designed for large matrices.
> cheers,
> gael
> On Tue, Jan 13, 2015 at 6:28 AM, Chen-Pang He <jdh8@xxxxxxxxxxxxxx> wrote:
> > Recently, I got 2 nvidia cards to write some CUDA.  I need help
> > to determine what functions should be qualified with
> > EIGEN_DEVICE_FUNC and what should not.  Is there a rule of thumb?
> >
> > Thanks,
> > Chen-Pang
