[eigen] Trouble using ARM NEON vector instructions on Android



Hi all,

I am cross-compiling lots of Eigen code to ARM on Android, and for some reason I am not getting vector instructions.

Versions:
I've attached a trivial code snippet, which adds 4-element float vectors in 3 ways:
  • vaddq_f32 intrinsic. Compiles to vadd.f32 instruction.
  • Eigen's internal padd() on Packet4f values. Also compiles to vadd.f32.
  • Adding Eigen::Vector4f's. Compiles to four fadds instructions!
I've also attached the assembly output. Full command for generating it yourself (with NDK r8e installed) is at the top of the source file.

I checked that EIGEN_VECTORIZE_NEON is defined, and Eigen::SimdInstructionSetsInUse() returns "ARM NEON".
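
For anyone who wants to repeat that check, here is a minimal sketch of how it could be done. The helper name simd_report() and the SKETCH_HAVE_EIGEN macro are made up for illustration; only EIGEN_VECTORIZE_NEON, Eigen::SimdInstructionSetsInUse() and the compiler's __ARM_NEON / __ARM_NEON__ macros are real. The __has_include guard is an assumption about the compiler: it keeps the sketch buildable on a machine without Eigen, but older toolchains such as GCC 4.6 do not support it, so there you would include <Eigen/Core> directly and define SKETCH_HAVE_EIGEN yourself.

```cpp
#include <string>

// Eigen is optional here so the sketch also builds where it is not installed.
// Assumption: the compiler supports __has_include (newer GCC/Clang).
#if defined(__has_include)
#  if __has_include(<Eigen/Core>)
#    include <Eigen/Core>
#    define SKETCH_HAVE_EIGEN 1  // hypothetical macro, not part of Eigen
#  endif
#endif

// Hypothetical helper: summarizes what the compiler and Eigen report about SIMD.
std::string simd_report()
{
  std::string out;

  // Set by the compiler itself when NEON code generation is enabled (e.g. -mfpu=neon).
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  out += "compiler NEON macro: defined\n";
#else
  out += "compiler NEON macro: not defined\n";
#endif

#if defined(SKETCH_HAVE_EIGEN)
  // Set by Eigen's own headers when it decides to use its NEON packet code.
#  if defined(EIGEN_VECTORIZE_NEON)
  out += "EIGEN_VECTORIZE_NEON: defined\n";
#  else
  out += "EIGEN_VECTORIZE_NEON: not defined\n";
#  endif
  out += "Eigen::SimdInstructionSetsInUse(): ";
  out += Eigen::SimdInstructionSetsInUse();
  out += "\n";
#else
  out += "Eigen headers not found on the include path\n";
#endif

  return out;
}
```

Printing the result of simd_report() from main() in a binary built with the same flags as the real code shows whether the compiler and Eigen agree about NEON. It says nothing about which instructions a given expression actually compiles to, so the assembly output is still the final word.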

Any ideas? I know other people on this list are using Eigen on ARM. It looks like operator+ for Vector4f is not dispatching to the Packet4f specializations properly, but I'm not current enough on that infrastructure to understand why.

Cheers,
Patrick
/*
To compile, set $NDK (path to Android NDK root directory) and $EIGEN3_DIR, then:
$NDK/toolchains/arm-linux-androideabi-4.6/prebuilt/linux-x86_64/bin/arm-linux-androideabi-g++ --sysroot=$NDK/platforms/android-14/arch-arm -march=armv7-a -mfloat-abi=softfp -mfpu=neon -mthumb -O2 -I$EIGEN3_DIR -isystem $NDK/platforms/android-14/arch-arm/usr/include -isystem $NDK/sources/cxx-stl/gnu-libstdc++/4.6/include -isystem $NDK/sources/cxx-stl/gnu-libstdc++/4.6/libs/armeabi-v7a/include -S add_4f32.cpp -o add_4f32.cpp.s
*/

#include <Eigen/Dense>
#include <arm_neon.h>  // float32x4_t, vaddq_f32 (Eigen also includes this when NEON is enabled)

Eigen::Vector4f add_vector(const Eigen::Vector4f& v, const Eigen::Vector4f& w)
{
  EIGEN_ASM_COMMENT("begin vector");
  Eigen::Vector4f u = v + w;
  EIGEN_ASM_COMMENT("end vector");
  return u;
}

Eigen::internal::Packet4f add_packet(Eigen::internal::Packet4f v,
                                     Eigen::internal::Packet4f w)
{
  EIGEN_ASM_COMMENT("begin packet");
  Eigen::internal::Packet4f u = Eigen::internal::padd(v, w);
  EIGEN_ASM_COMMENT("end packet");
  return u;
}

float32x4_t add_intrinsic(float32x4_t v, float32x4_t w)
{
  EIGEN_ASM_COMMENT("begin intrinsic");
  float32x4_t u = vaddq_f32(v, w);
  EIGEN_ASM_COMMENT("end intrinsic");
  return u;
}

Attachment: add_4f32.cpp.s
Description: Binary data


