Hi Eigen Folks,
First, thanks for the great library. We're using it in our machine learning library DyNet with great success.
I had a quick question about something that seems like it should be possible, but I haven't found a reference for it. I currently have code here:
That implements the "Adam" update rule for stochastic gradient descent, found in this paper:
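For reference, the Adam update from that paper, applied independently to each component, is (in the paper's notation):

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2
\hat{m}_t = m_t / (1 - \beta_1^t), \quad \hat{v}_t = v_t / (1 - \beta_2^t)
\theta_t = \theta_{t-1} - \alpha \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)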
Here, all the places with "tvec()" are Eigen one-dimensional Tensors. The thing that bugs me is that I'm calling 4 different operations, which results in 4 different GPU kernel launches, for an update that is inherently componentwise. If possible, I'd like to create a single functor that takes 4 floats and updates them appropriately, then apply it in a single GPU operation.
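To make that concrete, here is a simplified CPU sketch of the shape of the code (the names and the exact split into assignments are illustrative, not the actual DyNet source; the real code evaluates each assignment through .device() on the GPU):

#include <unsupported/Eigen/CXX11/Tensor>
#include <cmath>

// Illustrative Adam step on 1-D Eigen Tensors.
// p = parameters, g = gradient, m / v = first / second moment estimates.
void adam_step(Eigen::Tensor<float, 1>& p, const Eigen::Tensor<float, 1>& g,
               Eigen::Tensor<float, 1>& m, Eigen::Tensor<float, 1>& v,
               float alpha, float beta1, float beta2, float eps, int t) {
  const float s1 = 1.f - std::pow(beta1, t);  // bias corrections, host scalars
  const float s2 = 1.f - std::pow(beta2, t);
  m = m * beta1 + g * (1.f - beta1);                      // kernel launch 1
  v = v * beta2 + g.square() * (1.f - beta2);             // kernel launch 2
  Eigen::Tensor<float, 1> denom = (v / s2).sqrt() + eps;  // kernel launch 3
  p = p - (m / s1) / denom * alpha;                       // kernel launch 4
}

Every one of those tensor assignments becomes its own launch, even though the whole thing is a single componentwise function of (p, g, m, v).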
I know this is possible using binaryExpr() for binary expressions (e.g. the sketch below), but I couldn't find an equivalent for operations with a larger number of arguments. Is there any chance that there is an elegant way to do this within Eigen (i.e. without writing my own kernel)?
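For the two-argument case, this is the kind of fusion binaryExpr() already gives me (a made-up example, here a plain SGD step rather than Adam):

#include <unsupported/Eigen/CXX11/Tensor>

// Fuse p <- p - alpha * g into one componentwise functor,
// i.e. one kernel launch instead of several.
struct SgdOp {
  float alpha;
  EIGEN_DEVICE_FUNC float operator()(float p, float g) const {
    return p - alpha * g;
  }
};

int main() {
  Eigen::Tensor<float, 1> p(100), g(100);
  p.setRandom();
  g.setRandom();
  p = p.binaryExpr(g, SgdOp{0.01f});  // on GPU: assign via .device(gpu_device)
  return 0;
}

What I'm after is effectively a four-argument analogue of this, consuming (p, g, m, v) in one pass.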