Re: [eigen] Re: [Blitz-devel] Fork Blitz++?


On Wed, Jan 20, 2016 at 10:32 AM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
I was not surprised that Eigen is faster -- mostly as it uses SIMD.
But I think for a fairer comparison, you should at least initialize the values in the Blitz++ implementation (working on uninitialized values tends to be slow if the data contains many NaNs and Infs), and probably also compare how big the difference would be without SIMD (or with just SSE2/3/4 instead of AVX+FMA).

Initializing the Blitz tensors made no difference here (I had disabled SSE exceptions anyway).
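For reference, a minimal sketch of what "initializing the values" can look like in a benchmark: fill the input buffer with finite values from a fixed seed, then confirm that no NaN or Inf is present before timing anything. The helper names fill_finite and all_finite are hypothetical, and a plain std::vector stands in for the Blitz++ or Eigen tensor storage, which is an assumption and not the code that was benchmarked.

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical helper: fill a buffer with finite values in [-1, 1).
// A fixed seed keeps runs comparable between libraries.
void fill_finite(std::vector<double>& data, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    for (double& x : data) {
        x = dist(gen);
    }
}

// Hypothetical helper: true if the buffer holds no NaN and no Inf.
bool all_finite(const std::vector<double>& data) {
    return std::all_of(data.begin(), data.end(),
                       [](double x) { return std::isfinite(x); });
}
```

Calling fill_finite on each input and checking all_finite once before the timed loop rules out NaN/Inf handling as a factor in the measured time.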

So, to complete the benchmark:

Eigen with no vectorization at all (-DEIGEN_DONT_VECTORIZE -fno-vectorize): 3.5s
Eigen with SSE2 only: 1s
Eigen with AVX: 0.57s
Eigen with FMA: 0.35s
Eigen with FMA+OpenMP: 0.1s (4 threads)

Blitz: 15s
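
When comparing configurations like the ones above, it helps to have the benchmark binary report which instruction sets it was actually compiled for. The sketch below assumes GCC or Clang, which predefine __SSE2__, __AVX__ and __FMA__ when the matching -m flags are active and _OPENMP when -fopenmp is given; EIGEN_DONT_VECTORIZE is only visible if it was passed with -D on the command line. The function name build_config is hypothetical.

```cpp
#include <string>

// Hypothetical helper: list the SIMD/threading features this
// translation unit was compiled with, based on predefined macros.
std::string build_config() {
    std::string s;
#if defined(EIGEN_DONT_VECTORIZE)
    s += "EIGEN_DONT_VECTORIZE ";
#endif
#if defined(__SSE2__)
    s += "SSE2 ";
#endif
#if defined(__AVX__)
    s += "AVX ";
#endif
#if defined(__FMA__)
    s += "FMA ";
#endif
#if defined(_OPENMP)
    s += "OpenMP ";
#endif
    if (s.empty()) {
        s = "scalar";  // none of the macros above were defined
    }
    return s;
}
```

Printing build_config() next to each timing makes it harder to mislabel a run. Note that on x86-64 targets SSE2 is part of the baseline, so it is normally reported even without an explicit -msse2.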

SIMD is not everything; cache friendliness is also very important!
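
To illustrate the cache-friendliness point in general terms, here is a minimal sketch of the same row-major matrix product written with two loop orders. It is not what Eigen or Blitz++ do internally, and the function names are hypothetical. In the i-j-k order the innermost loop walks B with a stride of n elements; in the i-k-j order the innermost loop touches B and C contiguously, which is usually much kinder to the cache for large n.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;  // n x n, row-major: element (i, j) at i * n + j

// i-j-k order: the inner loop reads B column-wise (stride n).
Matrix matmul_ijk(const Matrix& A, const Matrix& B, std::size_t n) {
    Matrix C(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
    return C;
}

// i-k-j order: the inner loop reads B and writes C row-wise (stride 1).
Matrix matmul_ikj(const Matrix& A, const Matrix& B, std::size_t n) {
    Matrix C(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A[i * n + k];  // reused across the whole inner loop
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
    return C;
}
```

Both functions accumulate each output element in the same k order, so they return the same values; only the memory access pattern differs. Timing them on matrices that do not fit in cache is a simple way to see the effect independently of SIMD.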

