Re: [eigen] No vectorization in presence of .cast<T>() calls
On 18.12.2010 06:23, Benoit Jacob wrote:
> Tough case.
>
> Casting from unsigned char to float is expanding 1 byte to 4 bytes,
> which means going from 16 to 4 scalars per 16-byte packet. This change
> in the number of scalars per packet is what's troublesome for our
> vectorization system.
>
> In general, that's quite hard, but it seems that we can easily
> overcome this in your particular case. Since you're only casting from a
> small type to a bigger type, the expression returned by cast() could
> be vectorizable by implementing packet() by reading LESS THAN a packet
> from the original uchar expression, and expanding it to float.
>
> something like this (pseudo code):
>
> packet4f cast<float>(Index i)
> {
> return packet4f(float(src.coeff(i)), float(src.coeff(i+1)),
> float(src.coeff(i+2)), float(src.coeff(i+3)));
> }
Hm, this would use a slow unvectorized FPU-Cast, I guess ...
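To illustrate the concern about the scalar casts, here is a minimal sketch of how the "read less than a packet" idea could avoid them with plain SSE2, assuming the source is contiguous unsigned char memory rather than an arbitrary expression. The helper name load4_uchar_as_float is hypothetical and not part of Eigen, and whether this actually beats four scalar casts has not been measured here.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an Eigen function): expand 4 consecutive bytes
// into one packet of 4 floats without any scalar int-to-float conversion.
inline __m128 load4_uchar_as_float(const std::uint8_t* src)
{
    std::int32_t bits;
    std::memcpy(&bits, src, sizeof bits);      // read only 4 bytes, not a full packet
    const __m128i zero = _mm_setzero_si128();
    __m128i v = _mm_cvtsi32_si128(bits);       // the 4 bytes sit in the low 32 bits
    v = _mm_unpacklo_epi8(v, zero);            // zero-extend 8-bit -> 16-bit
    v = _mm_unpacklo_epi16(v, zero);           // zero-extend 16-bit -> 32-bit
    return _mm_cvtepi32_ps(v);                 // convert 4 x int32 -> 4 x float
}
```

The memcpy keeps the 4-byte read free of alignment and aliasing problems; a real packet() implementation would have to obtain those bytes from the nested expression instead.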
> This is only going to be beneficial if this is used in a complex
> enough expression to pay for the cost of this packet() method. We must
> make sure not to introduce a performance regression on a simple
> dst=src.cast<float>() example.
Just thinking out loud here ...
First of all, I'd say that casting from a bigger to smaller type should
be possible with the current vectorization system. Something like this:
packet4f some_double_expression::cast<float>(Index i){
return swizzle(
_mm_cvtpd_ps(src.packet(i)),
_mm_cvtpd_ps(src.packet(i+2)),
/* some index set here */);
}
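The pseudocode above could look roughly like this in plain SSE2, assuming the doubles are read from contiguous memory instead of from src.packet(). The helper name load4_double_as_float is hypothetical, and _mm_movelh_ps stands in for the unspecified swizzle.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE ones)

// Hypothetical helper (not an Eigen function): narrow 4 consecutive doubles
// to one packet of 4 floats by converting two double packets and merging them.
inline __m128 load4_double_as_float(const double* src)
{
    const __m128 lo = _mm_cvtpd_ps(_mm_loadu_pd(src));      // (f0, f1, 0, 0)
    const __m128 hi = _mm_cvtpd_ps(_mm_loadu_pd(src + 2));  // (f2, f3, 0, 0)
    return _mm_movelh_ps(lo, hi);                           // (f0, f1, f2, f3)
}
```

Two source packets are consumed per result packet, which matches the packet(i) and packet(i+2) reads in the pseudocode.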
For the other way around I admit that it's very tricky, especially if
the smaller type also comes from a complex expression. Ideally you would
need to read a single packet of the smaller type and return 2, 4 or 8
consecutive packets of the bigger type -- I have no idea how this could
be done easily and efficiently ...
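Outside of an expression-template framework, the "one packet in, several packets out" pattern itself is simple. The following is only a sketch for the unsigned char to float case with SSE2, assuming contiguous input and output buffers; the name cast16_uchar_to_float is hypothetical. It does not address the hard part, namely how a packet() interface that returns exactly one packet per call could expose this.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (not an Eigen function): read one 16-byte packet of
// unsigned chars and write the 16 values as four consecutive float packets.
inline void cast16_uchar_to_float(const std::uint8_t* src, float* dst)
{
    const __m128i zero  = _mm_setzero_si128();
    const __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    const __m128i lo16  = _mm_unpacklo_epi8(bytes, zero);  // bytes 0..7  as 16-bit
    const __m128i hi16  = _mm_unpackhi_epi8(bytes, zero);  // bytes 8..15 as 16-bit
    // Widen each half again to 32-bit, convert to float, and store.
    _mm_storeu_ps(dst +  0, _mm_cvtepi32_ps(_mm_unpacklo_epi16(lo16, zero)));
    _mm_storeu_ps(dst +  4, _mm_cvtepi32_ps(_mm_unpackhi_epi16(lo16, zero)));
    _mm_storeu_ps(dst +  8, _mm_cvtepi32_ps(_mm_unpacklo_epi16(hi16, zero)));
    _mm_storeu_ps(dst + 12, _mm_cvtepi32_ps(_mm_unpackhi_epi16(hi16, zero)));
}
```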
Another point: Did someone already think about how to support AVX in the
future? -- Last time I checked there were some hard-coded 16s in the code ...
Christoph
--
----------------------------------------------
Dipl.-Inf. Christoph Hertzberg
Cartesium 0.051
Universität Bremen
Enrique-Schmidt-Straße 5
28359 Bremen
Tel: (+49) 421-218-64252
----------------------------------------------