Re: [eigen] two technical points: WithAlignedOperatorNew and std::complex casting

2008/12/31 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Perhaps I forgot to mention that in my mind, it is perfectly fine that
> a lib/app compiled with vectorization doesn't have to be compatible with
> a lib compiled without vectorization. It is just like many other GCC
> options which break binary compatibility if both binaries are not
> compiled with the same options.

If it weren't possible to have vectorized and not-vectorized code
compatible with one another, then we could comfort ourselves with this
argument. But it happens to be possible, so we really want it!
There are at least 2 good reasons to try hard:
1) an app could have a vectorized path and a non-vectorized path,
choosing between them at runtime depending on the CPU capabilities
(think 32bit x86). In that case one can imagine a situation where the
core of the app is compiled without vectorization, creates an Eigen
object, and passes it to the vectorized part.
2) Suppose that SomeBinaryLibrary deals with Eigen objects.
SomeProgram uses SomeBinaryLibrary. If we don't guarantee
compatibility between vec and non-vec, then either both or neither of
SomeBinaryLibrary and SomeProgram must enable vectorization. If
SomeBinaryLibrary and SomeProgram are built by disconnected groups of
people then they might pull their hair out for a long time before
finding out why SomeProgram keeps crashing!
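
To illustrate the kind of mismatch behind point 2, here is a minimal
sketch that does not use Eigen at all. The struct names are made up for
the example, and I am assuming that a vectorized build requests 16-byte
alignment for a 4-float vector while a non-vectorized build does not.
The same "holder" type then gets a different layout in each build, so
an object created by one side is misread by the other:

```cpp
#include <cstddef>

// Hypothetical stand-in for a 4-float vector in a non-vectorized build:
// only the natural alignment of float is required.
struct Vec4Plain { float v[4]; };

// Hypothetical stand-in for the same vector in a vectorized build:
// 16-byte alignment is requested so that aligned SSE loads are legal.
struct alignas(16) Vec4Aligned { float v[4]; };

// The same user-level struct, as seen by each of the two builds.
struct HolderPlain   { char tag; Vec4Plain   v; };
struct HolderAligned { char tag; Vec4Aligned v; };

// The alignment requirement differs between the two builds...
static_assert(alignof(Vec4Aligned) == 16, "aligned variant is 16-byte aligned");
static_assert(alignof(Vec4Plain) < alignof(Vec4Aligned), "plain variant is less strict");

// ...so the member offset and the total size of the holder differ too.
static_assert(offsetof(HolderPlain, v) < offsetof(HolderAligned, v),
              "the vector member sits at a different offset");
static_assert(sizeof(HolderPlain) < sizeof(HolderAligned),
              "the two builds disagree on the object size");
```

If alignment is enforced regardless of whether vectorization is
enabled, both builds see the second layout and the problem goes away.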

> In spite of what I said, in the end I'm not against enforcing
> alignment everywhere, but your example motivates me to add a "NoAlign"
> or "Compact" flag for the fourth template parameter of Matrix<> such
> that it is possible to use Eigen to create compact structures at the
> cost of no vectorization. Usage example:
> typedef Matrix<float,4,1,NoAlign> Vector4Nf;
> typedef Matrix<float,4,4,RowMajor|NoAlign> RowMatrix4Nf;
> Of course this complicates the use of Eigen a bit, but this seems to be
> a very important feature to me (one example among many others: to
> create structures compatible with the nvidia CUDA compiler)

Good idea, I see the point. Something to investigate: instead of a new
flag, maybe just forcibly unset the PacketAccessBit at the moment of
computing the matrix flags. That way, from then on we don't need to
mess with one more flag.
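
A rough sketch of what that could look like, not Eigen's actual
internals: the option and bit names, their values, the 16-byte packet
size and the compute_matrix_flags helper are all assumptions made up
for this example. The point is only that the NoAlign option is consumed
once, while the flags are computed, and everything downstream keeps
testing the packet-access bit as before:

```cpp
#include <cstddef>

// Hypothetical user-facing options (fourth template parameter).
constexpr unsigned RowMajorOption = 0x1;
constexpr unsigned NoAlignOption  = 0x2;

// Hypothetical computed flag bits; the values are illustrative.
constexpr unsigned RowMajorBit     = 0x1;
constexpr unsigned PacketAccessBit = 0x8;

// Assumed packet size in bytes (e.g. one SSE register).
constexpr std::size_t PacketBytes = 16;

// Computes the flags of a fixed-size matrix type from its options.
template <typename Scalar, int Rows, int Cols, unsigned Options>
struct compute_matrix_flags {
    // The storage must be a whole number of packets to be vectorizable.
    static constexpr bool fits_packets =
        (Rows * Cols * sizeof(Scalar)) % PacketBytes == 0;

    // NoAlign forcibly clears packet access; no extra flag is created.
    static constexpr bool vectorizable =
        fits_packets && !(Options & NoAlignOption);

    static constexpr unsigned value =
        ((Options & RowMajorOption) ? RowMajorBit : 0u) |
        (vectorizable ? PacketAccessBit : 0u);
};

// A 4-float vector keeps packet access by default...
static_assert(compute_matrix_flags<float, 4, 1, 0>::value & PacketAccessBit,
              "default 4-float vector is vectorizable");
// ...and loses it as soon as NoAlign is requested.
static_assert(!(compute_matrix_flags<float, 4, 1, NoAlignOption>::value
                & PacketAccessBit),
              "NoAlign clears the packet-access bit");
```

Code that already checks the packet-access bit would then take the
non-vectorized path for NoAlign types without any further change.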
