Re: [eigen] Using Eigen in CUDA kernels



Hi Benjamin,

On Mon, Feb 11, 2013 at 2:10 PM, Benjamin Schindler
<bschindler@xxxxxxxxxxx> wrote:
> Do you plan on merging this to master at some point?

Sure, but let's first get something reasonably complete, with proper
unit tests, before asking for a merge.

> "assembly looks reasonable" - meaning, somewhat optimal, could be
> better? Do you have some specifics?

The problem is that the actual register allocation happens after the
.ptx stage, so on the examples I tested I can only tell that only
meaningful instructions were generated; I cannot tell whether the
register usage is good or not.
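A quick way to look at this is to compile a toy kernel and compare the
PTX with the ptxas statistics and the final SASS, e.g. along these lines
(flags from memory, double-check against your toolkit):

// toy.cu -- minimal kernel just to inspect the generated code
__global__ void scale(float* out, const float* in, float a, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = a * in[i];
}

// nvcc -arch=sm_20 -ptx toy.cu          : emits PTX, i.e. before register allocation
// nvcc -arch=sm_20 -Xptxas -v -c toy.cu : ptxas reports the register count per kernel
// cuobjdump -sass toy.o                 : dumps the final SASS, after register allocation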

gael

>
> Regards
> Benjamin
>
> On 02/07/2013 07:21 PM, Gael Guennebaud wrote:
>> Hi list,
>>
>> good news, NVCC (the CUDA compiler from NVidia) version 5 is now able
>> to parse and compile Eigen :) That means using Eigen in our CUDA
>> kernels is "just" a matter of qualifying all our functions callable
>> from CUDA with __device__ when compiling with nvcc. As a proof of concept
>> I've created a fork there:
>>
>> https://bitbucket.org/ggael/eigen-nvcc
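>>
>> To give an idea of what this qualification looks like, the pattern is
>> roughly the following (the macro name is only illustrative, the fork
>> may spell it differently):
>>
>>   // expands to the CUDA qualifiers only when compiling with nvcc
>>   #ifdef __CUDACC__
>>     #define EIGEN_DEVICE_FUNC __host__ __device__
>>   #else
>>     #define EIGEN_DEVICE_FUNC
>>   #endif
>>
>>   // any function that must be callable from a kernel gets the macro:
>>   template<typename Scalar>
>>   EIGEN_DEVICE_FUNC Scalar square(const Scalar& x) { return x * x; }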
>>
>> This port is far from complete, but I've been able to do stupid stuff like:
>>
>> Matrix3f M(data1+tid*9);
>> Vector3f v = M.inverse() * Map<Vector3f>(data+tid*3);
>> output[tid] = v.cwiseAbs().maxCoeff();
>>
>> where data1, data, and output are device buffers. The generated
>> assembly looked reasonable.
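>>
>> For context, that snippet lives inside a kernel along these lines
>> (simplified, so take it as a sketch rather than the exact test code):
>>
>>   #include <Eigen/Dense>
>>   using namespace Eigen;
>>
>>   __global__ void invert_and_reduce(float* output, float* data1,
>>                                     float* data, int n)
>>   {
>>     int tid = blockIdx.x * blockDim.x + threadIdx.x;
>>     if (tid >= n) return;
>>     Matrix3f M(data1 + tid*9);   // copies a 3x3 block from the device buffer
>>     Vector3f v = M.inverse() * Map<Vector3f>(data + tid*3);
>>     output[tid] = v.cwiseAbs().maxCoeff();
>>   }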
>>
>> So no more crappy home-made fixed-size matrix and vector classes!
>>
>> This also means that evaluating very large *coefficient-wise
>> expressions* on CUDA should not be very difficult. We just need a
>> special DeviceMatrix that stores its data in device memory and
>> specializes its assignment operator to call a CUDA kernel templated on
>> the destination and source expressions...
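>>
>> The kernel behind such an assignment could boil down to something like
>> this (names made up, just to illustrate the idea):
>>
>>   // assign a coefficient-wise expression to a device-side destination
>>   template<typename Dst, typename Src>
>>   __global__ void coeffwise_assign(Dst dst, const Src src)
>>   {
>>     int i = blockIdx.x * blockDim.x + threadIdx.x;
>>     if (i < dst.size())
>>       dst.coeffRef(i) = src.coeff(i);
>>   }
>>
>> with the specialized operator= launching it over dst.size() threads.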
>>
>>
>> have fun,
>> Gael.


