On Fri, Dec 17, 2010 at 4:13 PM, Christoph Hertzberg
<chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Ok, that was just an uneducated guess. Next guess: Could it be that
> currently unsigned char is not vectorized at all? I browsed the source a
> bit and just found packet4i, packet4f and packet2d.
> Have you tried if ArrayXXf::cast<double>() etc gets vectorized?

It is true that "unsigned char" is not vectorized at all, so it is expected that we don't see any vectorization there.
However, casting still prevents vectorization even for types that are vectorized. I created a temporary double matrix and then cast from double to float, but the issue remains the same.

> I think you still lose a bit here with the unnecessary assignment.
> Optimal would be something like:
> * Unpack 4 bytes of memory to a register, taking care of unsignedness
>   (I don't know the instruction for that, but I'm sure there is one or
>   a combination of some),
> * Convert it (_mm_cvtepi32_ps), continue working with that, and
> * Store the register to memory.

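
For illustration, the three steps above could be sketched with plain SSE2 intrinsics as follows. This is only a sketch under assumptions, not Eigen code: the helper name load4_u8_as_float is hypothetical, and zero-extending via _mm_unpacklo_epi8 / _mm_unpacklo_epi16 is just one possible combination of instructions for the "unpack" step. A scalar fallback is included so the snippet also builds where SSE2 is not available.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper (not part of Eigen): converts 4 unsigned bytes to 4 floats.
inline void load4_u8_as_float(const std::uint8_t* src, float* dst) {
    std::int32_t bits;
    std::memcpy(&bits, src, sizeof(bits));   // read 4 bytes without alignment/aliasing issues
    const __m128i zero = _mm_setzero_si128();
    __m128i v = _mm_cvtsi32_si128(bits);     // the 4 bytes sit in the low 32 bits
    v = _mm_unpacklo_epi8(v, zero);          // zero-extend u8  -> u16 (handles unsignedness)
    v = _mm_unpacklo_epi16(v, zero);         // zero-extend u16 -> u32
    _mm_storeu_ps(dst, _mm_cvtepi32_ps(v));  // int32 -> float, then store to memory
}
#else
// Scalar fallback with the same result.
inline void load4_u8_as_float(const std::uint8_t* src, float* dst) {
    for (int i = 0; i < 4; ++i) dst[i] = static_cast<float>(src[i]);
}
#endif
```

Since every value is in the range 0..255 after zero-extension, the signed conversion _mm_cvtepi32_ps gives the exact result.
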
I agree, but I am still convinced that this is tricky, or at least tedious, because we would need special functions which fill a target Packet from the source type from which we want to cast. At the very least, it is nothing that will happen before 3.0 :)
What is actually bad about these conversions is that some of them would need to be rewritten. E.g. _mm_cvtpi8_ps is not available with MSVC 2010 in 64-bit builds.
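
To illustrate what such a "fill a target packet from another source type" function might look like, here is a minimal sketch for the double to float case. It assumes plain SSE2 and is not Eigen code; the name cast4_double_to_float is hypothetical. It shows one reason this can get tedious: a 128-bit register holds only 2 doubles but 4 floats, so two source loads are needed to fill one target register.

```cpp
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics (also provides the SSE ones)

// Hypothetical helper (not part of Eigen): converts 4 doubles to 4 floats.
inline void cast4_double_to_float(const double* src, float* dst) {
    // Each conversion yields 2 floats in the lower half; the upper half is zero.
    const __m128 lo = _mm_cvtpd_ps(_mm_loadu_pd(src));
    const __m128 hi = _mm_cvtpd_ps(_mm_loadu_pd(src + 2));
    // Combine the two lower halves into one full register: {lo0, lo1, hi0, hi1}.
    _mm_storeu_ps(dst, _mm_movelh_ps(lo, hi));
}
#else
// Scalar fallback with the same result.
inline void cast4_double_to_float(const double* src, float* dst) {
    for (int i = 0; i < 4; ++i) dst[i] = static_cast<float>(src[i]);
}
#endif
```
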
Thanks for the feedback,
- Hauke