Re: [eigen] Using Eigen in CUDA kernels
- To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [eigen] Using Eigen in CUDA kernels
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Tue, 12 Feb 2013 09:33:03 +0100
On Mon, Feb 11, 2013 at 2:10 PM, Benjamin Schindler wrote:
> Do you plan on merging this to master at some point?
Sure, but let's make sure we have something quite complete with proper
unit tests before asking for a merge.
> "assembly looks reasonable" - meaning, somewhat optimal, could be
> better? Do you have some specifics?
The problem is that the actual register allocation seems to take place
after the .ptx stage, so all I can tell is that on the examples I
tested only meaningful instructions were generated; I cannot tell
whether the register use is good or not.
> On 02/07/2013 07:21 PM, Gael Guennebaud wrote:
>> Hi list,
>> good news, NVCC (the CUDA compiler from NVidia) version 5 is now able
>> to parse and compile Eigen :) That means using Eigen in our cuda
>> kernels is "just" a matter of qualifying all our functions callable
>> from CUDA by __device__ when compiled with nvcc. As a proof of concept
>> I've created a fork there:
>> This port is far from complete, but I've been able to do stupid stuff like:
>> Matrix3f M(data1+tid*9);
>> Vector3f v = M.inverse() * Map<Vector3f>(data+tid*3);
>> output[tid] = v.cwiseAbs().maxCoeff();
>> where data1, data, and output are device buffers. The generated
>> assembly looked reasonable.
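For context, the quoted snippet would sit inside an ordinary CUDA kernel along these lines. This is only a sketch: the kernel name, the launch-bounds check, and the `n` parameter are illustrative and not from the original post.

```cuda
#include <Eigen/Dense>
using namespace Eigen;

// Hypothetical kernel wrapping the snippet above; each thread
// solves one independent 3x3 system taken from device buffers.
__global__ void solve_per_thread(const float* data1, float* data,
                                 float* output, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    Matrix3f M(data1 + tid * 9);      // copies 9 floats into registers/local mem
    Vector3f v = M.inverse() * Map<Vector3f>(data + tid * 3);
    output[tid] = v.cwiseAbs().maxCoeff();
}
```

Every Eigen member function reached here (the constructor from a pointer, inverse(), Map, cwiseAbs(), maxCoeff()) would need the __device__ qualifier when compiled with nvcc, which is exactly what the fork adds.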
>> So no more crappy home-made fixed-size matrix and vector classes!
>> This also means that evaluating very large *coefficient-wise
>> expressions* on CUDA should not be very difficult. We just need a
>> special DeviceMatrix storing its data in device memory, and specialize
>> its assignment operator to call a CUDA kernel templated with the
>> destination and source expression...
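The templated-assignment idea above could be sketched as follows. All names here (evalCoeffwise, the launch configuration, the coeff/coeffRef access pattern) are my own illustration of the design, not code from the fork.

```cuda
// One coefficient per thread: the kernel is templated on the
// destination and source expression types, so the whole expression
// tree is inlined into the kernel body by nvcc.
template <typename Dst, typename Src>
__global__ void evalCoeffwise(Dst dst, const Src src, int size)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size)
        dst.coeffRef(i) = src.coeff(i);  // linear coefficient access
}

// DeviceMatrix::operator= would launch this kernel instead of
// running a CPU evaluation loop, e.g.:
//   evalCoeffwise<<<numBlocks, blockSize>>>(dstView, srcExpr, size);
```

The key point is that only coefficient-wise access is needed on the device, so arbitrarily large expressions work as long as the leaf storage lives in device memory.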
>> have fun,