Hello,
I've written (yet another!) dual number implementation for automatic differentiation. It is meant to be used as the value type in Eigen matrices, and has templates for vectorization that are (shockingly) similar to (and based on) Eigen's complex-type vectorizations. It is quite fast for first-order forward-mode differentiation, and imho pretty easy to use. There are also SSE/SSE3/AVX vectorizations for std::complex<dual< float | double >> types.
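For anyone unfamiliar with the idea, here is a minimal sketch of first-order forward differentiation with dual numbers. It is not the library's actual code: the names `Dual`, `real`, `eps`, `f` and `df` are hypothetical, and there is no Eigen integration or vectorization here. It only shows the arithmetic rule eps*eps == 0 that carries the derivative along with the value.

```cpp
#include <cmath>

// Hypothetical minimal dual number: real + eps*e, where e*e == 0.
template <typename T>
struct Dual {
    T real;  // function value
    T eps;   // derivative part
};

// Sum rule: (a + b)' = a' + b'
template <typename T>
Dual<T> operator+(Dual<T> x, Dual<T> y) {
    return {x.real + y.real, x.eps + y.eps};
}

// Product rule: (a * b)' = a * b' + a' * b
template <typename T>
Dual<T> operator*(Dual<T> x, Dual<T> y) {
    return {x.real * y.real, x.real * y.eps + x.eps * y.real};
}

// Chain rule for sin: sin(a)' = cos(a) * a'
template <typename T>
Dual<T> sin(Dual<T> x) {
    return {std::sin(x.real), std::cos(x.real) * x.eps};
}

// Example function f(x) = x*x + sin(x), written once for any scalar type.
template <typename T>
T f(T x) {
    using std::sin;  // plain scalars use std::sin, Dual uses the overload above
    return x * x + sin(x);
}

// Derivative of f at x0: seed the eps part with 1 and read it back.
inline double df(double x0) {
    Dual<double> x{x0, 1.0};
    return f(x).eps;
}
```

Seeding the eps part with 1 makes `f(x).eps` equal to 2*x0 + cos(x0), so `df(0.0)` evaluates to 1 up to rounding.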
I hope this could be useful for someone and would be glad for any feedback, improvements, etc.
It would be interesting to compare this approach to others; by hand-wavy arguments, I believe it should ultimately be faster in certain cases.
Cheers,
Michael