Re: [eigen] Replacing AVX code with Eigen PacketMath ?



There is not much documentation, but you could try enabling the `EIGEN_INTERNAL_DOCUMENTATION` CMake flag and building the documentation. It is probably easier to read the sources directly, though.

Right now, vectorized comparisons are not well integrated into Eigen's Core module (most packet functions such as `pcmp_eq` exist, though). Also, shuffling works quite differently between architectures, so this may require individual hand-optimization anyway.

Btw: Especially if you have AVX2, I think your code would be more efficient if, instead of shuffling `b`, you broadcast each element of `rhs` to a vector and compare that to `a` -- then `vpor` the intermediate results and do a single `vptest` at the end:
https://godbolt.org/z/VEBKa3
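
For readers who cannot open the link, the idea can be sketched roughly as follows. This is not necessarily the code behind the link; the function names (`NullIntersectBroadcast`, `NullIntersectScalar`, `NullIntersect`) and the macro `SKETCH_HAS_AVX2_PATH` are made up for illustration. The sketch assumes GCC or Clang on x86_64, uses unaligned loads instead of the aligned loads in the quoted code, and falls back to a plain scalar loop when AVX2 is not available at run time.

```cpp
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAS_AVX2_PATH 1  // hypothetical macro, only used in this sketch
#endif

// Portable reference: true if no element of lhs[0..3] equals any element of rhs[0..3].
inline bool NullIntersectScalar(std::uint64_t const* lhs, std::uint64_t const* rhs) noexcept
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (lhs[i] == rhs[j]) return false;
    return true;
}

#ifdef SKETCH_HAS_AVX2_PATH
// Broadcast variant: compare all of lhs against each rhs element, OR the
// comparison masks together, and test the accumulated mask once at the end.
__attribute__((target("avx2")))
inline bool NullIntersectBroadcast(std::uint64_t const* lhs, std::uint64_t const* rhs) noexcept
{
    // Unaligned load, so callers need not guarantee 32-byte alignment.
    __m256i const a = _mm256_loadu_si256(reinterpret_cast<__m256i const*>(lhs));
    __m256i acc = _mm256_setzero_si256();
    for (int i = 0; i < 4; ++i) {
        // Broadcast rhs[i] into all four 64-bit lanes.
        __m256i const bi = _mm256_set1_epi64x(static_cast<long long>(rhs[i]));
        // Lanes where a == rhs[i] become all-ones; accumulate them with OR.
        acc = _mm256_or_si256(acc, _mm256_cmpeq_epi64(a, bi));
    }
    // testz returns 1 when (acc & acc) is all zero, i.e. no lane ever matched.
    return _mm256_testz_si256(acc, acc) != 0;
}
#endif

// Picks the AVX2 path when the CPU supports it, otherwise the scalar loop.
inline bool NullIntersect(std::uint64_t const* lhs, std::uint64_t const* rhs) noexcept
{
#ifdef SKETCH_HAS_AVX2_PATH
    if (__builtin_cpu_supports("avx2"))
        return NullIntersectBroadcast(lhs, rhs);
#endif
    return NullIntersectScalar(lhs, rhs);
}
```

Compared to the shuffle-based version, this trades the three early exits for a single branch at the end; whether that is actually faster depends on how often the inputs intersect, so it is worth benchmarking on real data.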

Cheers,
Christoph


On 17/01/2020 15.35, Bogdan Burlacu wrote:
Hi,

I have a project where I'm already using Eigen for computations. I was
wondering if I can rewrite some hand-written AVX2 code using Eigen types
(I'm looking at Packets in particular):

          constexpr int shift_one { _MM_SHUFFLE(0, 3, 2, 1) };
          constexpr int shift_two { _MM_SHUFFLE(1, 0, 3, 2) };
          constexpr int shift_thr { _MM_SHUFFLE(2, 1, 0, 3) };

          static inline bool _mm256_is_zero(__m256i m) noexcept { return _mm256_testz_si256(m, m); }

          static inline bool NullIntersectProbe(uint64_t const* lhs, uint64_t const* rhs) noexcept
          {
              __m256i a { _mm256_load_si256((__m256i*)lhs) };
              __m256i b { _mm256_load_si256((__m256i*)rhs) };

              __m256i r0 { _mm256_cmpeq_epi64(a, b) };
              if (!_mm256_is_zero(r0))
                  return false;

              __m256i r1 { _mm256_cmpeq_epi64(a, _mm256_permute4x64_epi64(b, shift_one)) };
              if (!_mm256_is_zero(r1))
                  return false;

              __m256i r2 { _mm256_cmpeq_epi64(a, _mm256_permute4x64_epi64(b, shift_two)) };
              if (!_mm256_is_zero(r2))
                  return false;

              __m256i r3 { _mm256_cmpeq_epi64(a, _mm256_permute4x64_epi64(b, shift_thr)) };
              return _mm256_is_zero(r3);
          }

I would like to make the code more generic so that it would work on
other x86_64 CPU flavors.
Is there any documentation for this low-level stuff, or should I just
look at e.g. PacketMath.h?

Best,
Bogdan





--
 Dr.-Ing. Christoph Hertzberg

 Visiting address of the branch office:
 DFKI GmbH
 Robotics Innovation Center
 Robert-Hooke-Straße 5
 28359 Bremen, Germany

 Postal address of the main office, Bremen site:
 DFKI GmbH
 Robotics Innovation Center
 Robert-Hooke-Straße 1
 28359 Bremen, Germany

 Phone:       +49 421 178 45-4021
 Switchboard: +49 421 178 45-0
 Email:       christoph.hertzberg@xxxxxxx

 Further information: http://www.dfki.de/robotik
  -------------------------------------------------------------
  Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
  Trippstadter Straße 122, D-67663 Kaiserslautern, Germany

  Management:
  Prof. Dr. Antonio Krüger (Chairman)
  Dr. Walter Olthoff

  Chairman of the Supervisory Board:
  Dr. Gabriël Clemens
  Kaiserslautern District Court, HRB 2313
  -------------------------------------------------------------



