Hi list,

Good news: NVCC (the CUDA compiler from NVIDIA) version 5 is now able
to parse and compile Eigen :) That means using Eigen in our CUDA
kernels is "just" a matter of qualifying every function that must be
callable from CUDA with __device__ when compiling with nvcc. As a
proof of concept, I've created a fork here:


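To make the "qualify with __device__" idea concrete, here is a minimal
sketch of how it can be done with a macro. It is not code from the
fork: the macro name SKETCH_DEVICE_FUNC, the helper max_abs_coeff and
the kernel max_abs_kernel are all made up for illustration. I use
__host__ __device__ rather than plain __device__ so the same function
stays callable from host code, and the macro expands to nothing when
the file is built with a regular C++ compiler.

```cpp
// Hypothetical macro (name is an assumption, not taken from the fork):
// expands to the CUDA qualifiers only when nvcc is compiling the file.
#if defined(__CUDACC__)
  #define SKETCH_DEVICE_FUNC __host__ __device__
#else
  #define SKETCH_DEVICE_FUNC
#endif

// Stand-in for a small fixed-size library routine: the largest absolute
// value among N coefficients. Once qualified, it can be called from a
// kernel as well as from ordinary host code.
template <typename Scalar, int N>
SKETCH_DEVICE_FUNC Scalar max_abs_coeff(const Scalar* v) {
  Scalar best = v[0] < 0 ? -v[0] : v[0];
  for (int i = 1; i < N; ++i) {
    const Scalar a = v[i] < 0 ? -v[i] : v[i];
    if (a > best) best = a;
  }
  return best;
}

#if defined(__CUDACC__)
// Illustrative kernel: each thread reduces one 3-vector of a device buffer.
__global__ void max_abs_kernel(const float* data, float* output, int count) {
  const int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid < count) output[tid] = max_abs_coeff<float, 3>(data + tid * 3);
}
#endif
```

With a plain C++ compiler the kernel is simply skipped and
max_abs_coeff behaves like any other template function, so the
non-CUDA build is left unaffected.
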
This port is far from complete, but I've been able to do stupid stuff like:

Matrix3f M(data1+tid*9);
Vector3f v = M.inverse() * Map<Vector3f>(data+tid*3);
output[tid] = v.cwiseAbs().maxCoeff();

where data1, data, and output are device buffers. The generated
assembly looked reasonable.

So no more crappy home-made fixed-size matrix and vector classes!

This also means that evaluating very large *coefficient-wise
expressions* on CUDA should not be very difficult. We just need a
special DeviceMatrix that stores its data in device memory, and to
specialize its assignment operator so that it launches a CUDA kernel
templated on the destination and source expression types...
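
Here is a rough sketch of what I mean, under heavy assumptions: it is
float-only, flat-indexed, has no error checking, and none of the names
(DeviceMatrix's members, BufferView, SumExpr, assign_kernel, the buf_*
helpers) come from Eigen or the fork. Under nvcc the assignment
launches a kernel templated on the source expression; with a plain C++
compiler a loop stands in for the launch so the logic can be tried on
the host.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical backend helpers: CUDA when built with nvcc, host memory otherwise.
#if defined(__CUDACC__)
  #include <cuda_runtime.h>
  #define SKETCH_DEVICE_FUNC __host__ __device__
  inline float* buf_alloc(int n) {
    float* p = 0;
    cudaMalloc(reinterpret_cast<void**>(&p), n * sizeof(float));
    return p;
  }
  inline void buf_free(float* p) { cudaFree(p); }
  inline void buf_copy(float* dst, const float* src, int n, bool to_device) {
    cudaMemcpy(dst, src, n * sizeof(float),
               to_device ? cudaMemcpyHostToDevice : cudaMemcpyDeviceToHost);
  }
#else
  #define SKETCH_DEVICE_FUNC
  inline float* buf_alloc(int n) { return new float[n]; }
  inline void buf_free(float* p) { delete[] p; }
  inline void buf_copy(float* dst, const float* src, int n, bool) {
    std::memcpy(dst, src, n * sizeof(float));
  }
#endif

struct ExprTag {};  // marks the types that operator+ below may combine

// Leaf expression: a non-owning view of a buffer, cheap to pass by value.
struct BufferView : ExprTag {
  const float* ptr;
  SKETCH_DEVICE_FUNC explicit BufferView(const float* p) : ptr(p) {}
  SKETCH_DEVICE_FUNC float coeff(int i) const { return ptr[i]; }
};

// Coefficient-wise sum of two expressions, evaluated lazily per index.
template <typename L, typename R>
struct SumExpr : ExprTag {
  L lhs;
  R rhs;
  SKETCH_DEVICE_FUNC SumExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
  SKETCH_DEVICE_FUNC float coeff(int i) const { return lhs.coeff(i) + rhs.coeff(i); }
};

template <typename L, typename R>
typename std::enable_if<std::is_base_of<ExprTag, L>::value &&
                        std::is_base_of<ExprTag, R>::value, SumExpr<L, R> >::type
operator+(const L& l, const R& r) { return SumExpr<L, R>(l, r); }

#if defined(__CUDACC__)
// One thread per coefficient; the whole expression tree is passed by value.
template <typename Src>
__global__ void assign_kernel(float* dst, Src src, int n) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) dst[i] = src.coeff(i);
}
#endif

template <typename Src>
void run_assign(float* dst, const Src& src, int n) {
#if defined(__CUDACC__)
  const int threads = 256;
  assign_kernel<Src><<<(n + threads - 1) / threads, threads>>>(dst, src, n);
#else
  for (int i = 0; i < n; ++i) dst[i] = src.coeff(i);  // host stand-in for the launch
#endif
}

class DeviceMatrix {
 public:
  DeviceMatrix(int rows, int cols)
      : rows_(rows), cols_(cols), data_(buf_alloc(rows * cols)) {}
  ~DeviceMatrix() { buf_free(data_); }
  DeviceMatrix(const DeviceMatrix&) = delete;             // non-copyable in this sketch
  DeviceMatrix& operator=(const DeviceMatrix&) = delete;

  int size() const { return rows_ * cols_; }
  BufferView view() const { return BufferView(data_); }

  void upload(const std::vector<float>& host) {
    assert(static_cast<int>(host.size()) == size());
    buf_copy(data_, host.data(), size(), true);
  }
  std::vector<float> download() const {
    std::vector<float> host(size());
    buf_copy(host.data(), data_, size(), false);
    return host;
  }

  // Assignment from any expression evaluates it coefficient-wise into our buffer.
  template <typename Src>
  typename std::enable_if<std::is_base_of<ExprTag, Src>::value, DeviceMatrix&>::type
  operator=(const Src& src) {
    run_assign(data_, src, size());
    return *this;
  }

 private:
  int rows_, cols_;
  float* data_;
};
```

Usage would look like c = a.view() + b.view(); after uploading a and b.
The explicit view() call is only there to keep the sketch short. A real
version would let the matrix itself act as an expression and check that
the operand sizes match.
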

have fun,

