Re: [eigen] two technical points: WithAlignedOperatorNew and std::complex casting

2008/12/31 Gael Guennebaud <gael.guennebaud@xxxxxxxxx>:
> Perhaps I forgot to mention that in my mind, it is perfectly fine that
> a lib/app compiled with vectorization doesn't have to be compatible with
> a lib compiled without vectorization. It is just like many other GCC
> options which break binary compatibility if both binaries are not
> compiled with the same options.

If it weren't possible to make vectorized and non-vectorized code
compatible with one another, then we could comfort ourselves with this
argument. However, it happens to be possible, so we really want
that! There are at least 2 good reasons to try hard:
1) an app could have a vectorized path and a non-vectorized path,
choosing between them at runtime depending on the CPU capabilities
(think 32bit x86). In that case one can imagine a situation where the
core of the app is compiled without vectorization, creates an Eigen
object, and passes it to the vectorized part.
2) Suppose that SomeBinaryLibrary deals with Eigen objects.
SomeProgram uses SomeBinaryLibrary. If we don't guarantee
compatibility between vec and non-vec, then either both or neither of
SomeBinaryLibrary and SomeProgram must enable vectorization. If
SomeBinaryLibrary and SomeProgram are built by disconnected groups of
people, then they might pull their hair out for a long time before finding
out why SomeProgram keeps crashing!
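
To make point 2 concrete, here is a minimal sketch of how the layouts can
diverge. Everything in it is a hypothetical stand-in (the namespace, the
struct names, the Particle example); none of it is Eigen code. It assumes a
4-byte float and a 16-byte alignment requirement for the vectorized case.

```cpp
namespace sketch {

// Hypothetical stand-in for a 4-float vector whose alignment depends on
// whether the translation unit was compiled with vectorization.
template <bool Vectorized> struct ConditionalVec4f;
template <> struct ConditionalVec4f<true>  { alignas(16) float data[4]; };
template <> struct ConditionalVec4f<false> { float data[4]; };

// Hypothetical stand-in where 16-byte alignment is requested unconditionally,
// so the vectorization setting no longer influences the layout.
template <bool Vectorized> struct AlwaysAlignedVec4f {
    alignas(16) float data[4];
};

// A user struct embedding such a vector after a scalar member. Padding is
// inserted before `position` only when the vector type is over-aligned.
template <class Vec> struct Particle {
    float mass;
    Vec position;
};

// True when two types agree on size and alignment, which is necessary (not
// sufficient) for passing objects between separately compiled binaries.
template <class A, class B> constexpr bool same_layout() {
    return sizeof(A) == sizeof(B) && alignof(A) == alignof(B);
}

}  // namespace sketch
```

With the conditional variant, a library built with vectorization and a
program built without it disagree on sizeof(Particle) and on the offset of
position, which is the kind of mismatch that leads to the crashes described
above. With the unconditional variant both sides see the same layout.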

> In spite of what I said, I'm ultimately not against enforcing
> alignment everywhere, but your example motivates me to add a "NoAlign"
> or "Compact" flag for the fourth template parameter of Matrix<>, so
> that it is possible to use Eigen to create compact structures at the
> cost of no vectorization. Usage example:
>
> typedef Matrix<float,4,1,NoAlign> Vector4Nf;
> typedef Matrix<float,4,4,RowMajor|NoAlign> RowMatrix4Nf;
>
> Of course this complicates the use of Eigen a bit, but this seems to be
> a very important feature to me (one example among many others: creating
> structures compatible with the NVIDIA CUDA compiler)

Good idea, I see the point. Something to investigate: instead of a new
flag, maybe just forcibly unset the PacketAccessBit at the moment of
computing the matrix flags. That way, the rest of the code doesn't need
to deal with one more flag.
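
A rough sketch of what that could look like, under stated assumptions: the
bit values, the namespace, and the compute_matrix_flags() helper are all
made up for illustration and are not Eigen's actual constants or internals.
Only the names RowMajor, NoAlign and PacketAccessBit come from this thread.

```cpp
namespace sketch {

// Options a user would pass as the fourth template parameter of Matrix<>.
// Values are hypothetical.
constexpr unsigned RowMajor = 0x1;
constexpr unsigned NoAlign  = 0x2;  // the proposed option

// Flags that the rest of the library inspects. Values are hypothetical.
constexpr unsigned RowMajorBit     = 0x1;
constexpr unsigned PacketAccessBit = 0x8;

// Flags as they would be computed if the NoAlign option did not exist.
constexpr unsigned base_flags(unsigned options, bool vectorizable) {
    return ((options & RowMajor) ? RowMajorBit : 0u)
         | (vectorizable ? PacketAccessBit : 0u);
}

// Final flags: when NoAlign is requested, PacketAccessBit is forcibly
// cleared here, so no additional flag has to be propagated downstream.
constexpr unsigned compute_matrix_flags(unsigned options, bool vectorizable) {
    return base_flags(options, vectorizable)
         & ((options & NoAlign) ? ~PacketAccessBit : ~0u);
}

}  // namespace sketch
```

The point of doing it this way is that code which already checks
PacketAccessBit before taking a vectorized path would automatically fall
back to the scalar path for NoAlign matrices, without knowing that NoAlign
exists.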

Cheers,
Benoit
