Re: [eigen] arm64 support

I'm happy to report that I finally have a working arm64 implementation of PacketMath that passes all of the packetmath unit tests. I've tested it with the LLVM compiler in Xcode 5, running on the Apple A7 processor in an iPhone 5s. This is the only arm64 toolchain I have access to.

The biggest new feature of the NEON instruction set for arm64 is double precision, but there are a few other nice additions:

- A vector DIV instruction (instead of the reciprocal estimate / step sequence)

- A vector SQRT instruction (32-bit NEON only has instructions for estimating the reciprocal square root)

- Min and max vector element instructions (instead of pairwise reduction)
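
To make the three additions above concrete, here is a minimal sketch using the AArch64 NEON intrinsics vdivq_f32, vsqrtq_f32 and vmaxvq_f32. It is not taken from the PacketMath64.h changes; the helper names div4, sqrt4 and max4 are hypothetical, and the scalar fallback is only there so the snippet also builds on machines that are not arm64.

```cpp
#include <algorithm>
#include <cmath>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helpers for illustration only; not part of Eigen.

// Element-wise a / b on four floats (single vector DIV on arm64).
inline void div4(const float* a, const float* b, float* out) {
#if defined(__aarch64__)
  vst1q_f32(out, vdivq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
  for (int i = 0; i < 4; ++i) out[i] = a[i] / b[i];
#endif
}

// Element-wise square root on four floats (single vector SQRT on arm64).
inline void sqrt4(const float* a, float* out) {
#if defined(__aarch64__)
  vst1q_f32(out, vsqrtq_f32(vld1q_f32(a)));
#else
  for (int i = 0; i < 4; ++i) out[i] = std::sqrt(a[i]);
#endif
}

// Largest of four floats (across-vector max on arm64, no pairwise steps).
inline float max4(const float* a) {
#if defined(__aarch64__)
  return vmaxvq_f32(vld1q_f32(a));
#else
  return std::max(std::max(a[0], a[1]), std::max(a[2], a[3]));
#endif
}
```

On 32-bit NEON the same operations would need a reciprocal estimate plus refinement steps for the division and square root, and a sequence of pairwise max operations for the reduction.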

What's the preferred way to share this code? The changes are contained in a new PacketMath64.h header, plus a two-line addition to the Core header to include PacketMath64.h when __aarch64__ is defined.

--Chris

On Fri, Sep 13, 2013 at 12:24 AM, Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

Hi Chris,

you're very welcome to extend NEON support. Indeed, the only place to look at is src/Core/arch/NEON/PacketMath.h. I guess it means changing the number of registers and adding all the wrappers for double. Perhaps it would be better to put them in a new file instead of a big #ifdef / #endif. This file would be conditionally included from Eigen/Core. You might also look at the arch/SSE/PacketMath.h file to have SSE examples of the double wrappers. The expected behaviors are documented in src/Core/GenericPacketMath.h. Don't hesitate to ask us if something is unclear. Compiling and running the packetmath_* unit tests should be enough to verify everything works well.
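
As a rough illustration of what "wrappers for double" could look like, here is a minimal sketch of a two-lane double packet on arm64. It is an assumption-laden outline, not Eigen's actual code: the namespace sketch is hypothetical, the function names only loosely mirror the generic padd / pmul / pload / pstore primitives, and the real wrappers would be template specializations using Eigen's own macros. The struct fallback exists only so the snippet compiles on machines that are not arm64.

```cpp
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

namespace sketch {  // hypothetical namespace, not part of Eigen

#if defined(__aarch64__)
// Two doubles in one 128-bit NEON register.
typedef float64x2_t Packet2d;

inline Packet2d pload(const double* from) { return vld1q_f64(from); }
inline void pstore(double* to, const Packet2d& p) { vst1q_f64(to, p); }
inline Packet2d padd(const Packet2d& a, const Packet2d& b) { return vaddq_f64(a, b); }
inline Packet2d pmul(const Packet2d& a, const Packet2d& b) { return vmulq_f64(a, b); }
#else
// Scalar stand-in with the same interface, for non-arm64 builds.
struct Packet2d { double v[2]; };

inline Packet2d pload(const double* from) {
  Packet2d p = {{from[0], from[1]}};
  return p;
}
inline void pstore(double* to, const Packet2d& p) {
  to[0] = p.v[0];
  to[1] = p.v[1];
}
inline Packet2d padd(const Packet2d& a, const Packet2d& b) {
  Packet2d p = {{a.v[0] + b.v[0], a.v[1] + b.v[1]}};
  return p;
}
inline Packet2d pmul(const Packet2d& a, const Packet2d& b) {
  Packet2d p = {{a.v[0] * b.v[0], a.v[1] * b.v[1]}};
  return p;
}
#endif

}  // namespace sketch
```

Each wrapper is a thin layer over one intrinsic, which is why keeping them in a separate, conditionally included file stays manageable.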

good luck,

gael

On Fri, Sep 13, 2013 at 2:57 AM, Chris Laurel <claurel@xxxxxxxxx> wrote:

Has anyone started work on supporting NEON for arm64? It appears that the number of registers has doubled from armv7 and there is now support for double precision SIMD instructions.

I'm interested in participating in development of this. However, while I have extensive experience using the Eigen library, I've never messed around with the lowest level internals. It looks like the place to start (and perhaps the only file to modify) would be src/Core/arch/NEON/PacketMath.h. Some of the required changes seem self-evident, but if there's some sort of guide on adding a new SIMD instruction set to Eigen, I'd love to know about it.

Best regards,

Chris