Re: [eigen] Signed or unsigned indexing
On 2017-01-21 22:56, Gael Guennebaud wrote:
> That's easy to check, I simply measured:
>
>   EIGEN_DONT_INLINE
>   void foo(MatrixXf &A, MatrixXf &B, Index i, Index j, Index n, Index m)
>   {
>     A.block(i, j, n, m) += B.block(i+1, j+1, n, m);
>   }
>
> and got 15% of speed difference between int and ptrdiff_t using either
> gcc6 or clang3.9.

I have just tested this kind of code on a 32x32 matrix, and with Eigen the
version using 64-bit integers is indeed 25% faster than the version using
32-bit integers.

But if I write the same thing in plain C++ (without Eigen), with for loops,
I get exactly the same timings with 32-bit and 64-bit integers using both
the Intel compilers and clang. With gcc, the version with 32-bit integers
is 4% faster (see the code below).

My feeling is that because Eigen is written with intrinsics, many compiler
optimizations are turned off. Therefore, the difference between 32-bit and
64-bit integers is very Eigen-specific. If there were such a difference
between 32-bit and 64-bit indexing in plain C/C++, nobody would index with
ints in C anymore.

> I measured the conversion of 1000 integers from 'int' or 'unsigned int'
> to 'long' and observe 25% of speed difference using either clang or gcc
> (in favour of unsigned int).

My claim was not that there was no difference between converting int and
unsigned to long, but rather that this conversion does not happen as often
as you might think and almost never makes a difference. For instance, a
compiler might change
  void f(double* p, long n) {
    for (int k = 0; k < n; ++k) {
      p[k] += 1.0;
    }
  }
to (look at the new type for k)
  void f(double* p, long n) {
    for (long k = 0; k < n; ++k) {
      p[k] += 1.0;
    }
  }
because overflow of signed integers is undefined behavior. This
transformation would be impossible with unsigned integers because of their
wrapping behavior. So signed integers give more flexibility to the
compiler. Even though the conversion from int to long is more costly than
the conversion from unsigned to long, with signed integers the compiler is
often allowed to remove this conversion entirely, or at least to hoist it
out of the inner loop.
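
To make the wrapping argument concrete, here is a minimal sketch (my own
illustration, the function name g is arbitrary and nothing here comes from
Gael's code): with an unsigned counter the index is required to wrap modulo
2^32, so the compiler cannot simply rewrite k as a 64-bit counter.

  void g(double* p, long n) {
    // If n were larger than UINT_MAX, k would wrap back to 0 before ever
    // reaching n and the loop would keep rewriting p[0..UINT_MAX]. The
    // compiler has to preserve that behavior, so the zero-extension of k
    // stays inside the loop.
    for (unsigned int k = 0; k < n; ++k) {
      p[k] += 1.0;
    }
  }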
Regarding 32-bit vs 64-bit integers, there are some corner cases where one
is clearly better than the other. Consider this code, which tries to find
the index of the smallest element of an array:
  int index_smallest(float* p, int n) {
    float small_value = std::numeric_limits<float>::max();
    int small_k = -1;
    for (int k = 0; k < n; ++k) {
      if (p[k] < small_value) {
        small_value = p[k];
        small_k = k;
      }
    }
    return small_k;
  }
The Intel compiler is smart enough to vectorize this loop (it is a
reduction). But as both small_value and small_k have to be carried in
vector registers, one can see that it is more efficient when their types
have the same size, and this is indeed what happens. So if your array
contains floats, use int for the indices; using 64-bit integers might lead
to a 2x slowdown. But if your array contains doubles, it is better to use
a 64-bit integer :-)
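
For the double case, the matching version would look like this sketch (my
own illustration, not code from the original benchmark); the value and the
index carried through the vectorized reduction are then both 64 bits wide:

  #include <cstddef>
  #include <limits>

  std::ptrdiff_t index_smallest(const double* p, std::ptrdiff_t n) {
    double small_value = std::numeric_limits<double>::max();
    std::ptrdiff_t small_k = -1;
    for (std::ptrdiff_t k = 0; k < n; ++k) {
      // Both the value (double) and the index (ptrdiff_t) are 64-bit,
      // so they can share the same vector lane width in the reduction.
      if (p[k] < small_value) {
        small_value = p[k];
        small_k = k;
      }
    }
    return small_k;
  }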
Francois
PS: Here is the code used for my benchmarks
==== Plain C++
#include <cstddef>

//#define INT std::ptrdiff_t
#define INT int

int main() {
  const int nb_time = 10000000;
  const INT n = 32;

  float* A = new float[n * n];
  float* B = new float[n * n];
  for (INT j = 0; j < n; ++j) {
    for (INT i = 0; i < n; ++i) {
      A[j * n + i] = 0.0f;
      B[j * n + i] = 0.0f;
    }
  }

  for (int k = 0; k < nb_time; ++k) {
    for (INT j = 0; j < n; ++j) {
      for (INT i = 0; i < n; ++i) {
        A[j * n + i] += B[j * n + i];
      }
    }
  }

  return 0;
}
====
==== Eigen
//#define EIGEN_DEFAULT_DENSE_INDEX_TYPE std::ptrdiff_t
#define EIGEN_DEFAULT_DENSE_INDEX_TYPE int

#include <Eigen/Dense>

int main() {
  const int nb_time = 10000000;
  const int n = 32;

  Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic> A(n, n);
  Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic> B(n, n);
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < n; ++i) {
      A(i, j) = 0.0f;
      B(i, j) = 0.0f;
    }
  }

  for (int k = 0; k < nb_time; ++k) {
    A.block(0, 0, n, n) += B.block(0, 0, n, n);
  }

  return 0;
}
====