[eigen] on the vectorization of comparisons and related stuff

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Hi list,

I have been thinking about the vectorization of comparisons operators
that is actually not as trivial as it looks like. Actually a
coefficient-wise comparison is never used for its own, and we can
distinguish the three following use cases:

1 - after the comparison we perform a "all" (or "any") *reduction* to
know if all (or at least one) comparisons vanish. eg:
      (p.cwise() > infCorner).all() && (p.cwise() < supCorner).all()
    that tests if p is inside the box defined by its inf/sup corners.

2 - the comparison is used in a *selection*:
      R = (M.cwise() < 0).select(A,B);
    that is equivalent to:
     for each coeff i,j do
         R(i,j) = M(i,j) < 0 ? A(i,j) : B(i,j)

3 - the third use case is to combine two comparisons with the "&&" and
"||" operators before doing a reduction or a selection.
    Note that currently we do not support those operators, and they
are really useful only to perform a selection since for a reduction we
can always perform the reduction on each side and combine the result
with a scalar "&&" or "||" as in the previous example.

So what is the issue with the vectorization ? Well the problem is that
the SSE instruction set does not provide any support for conditional
moves (nor conditional evaluation !)... at least for now since SSE4
will provide such a feature (if I remember well). So currently you
have to mimic the ?: operator using bitwise operations and the SSE
comparison instructions have been designed for this purpose. So to be
more precise a SSE comparison between two packets of 4 floats returns
a packet of 4 ints where each coeff is either 0x0 or 0xFFFFFFFF so
that you can directly use the result for a bitwise selection:
  mask = cmplt(A,B);
  res = or(and(mask,C), notand(mask,D));
which mimics:
  res = A<B ? C : D;

This convention is also suitable for boolean operations as well as reductions.

The main consequence is that the comparisons operators should not
return a matrix expression of bool but rather a matrix expression of
int (for float) or int64 (for double). If everything is vectorized why
not, but:
 - how to mix that with the scalar path ?? (in general the boundary of
a matrix is processed using scalar instructions)
 - with such a strategy both the "then" and "else" sides of a
selection have to be evaluated which might be a problem if the
selection is used, for instance, to avoid divisions by zero or to skip
expensive stuff.
 - in a selection, this also implies that the scalar type of the
matrices used for the comparison must have the same size than the
scalar type of the returned matrices...
 - and there are probably several other minor issues...

So all in all, it seems to be quite difficult to come up with
something easy to use, robust and efficient. Perhaps, one option would
be to introduce new comparisons operators which would returns bit
masks rather than booleans, eg.:

 m.cwise().maskLessThan(0).blabla

and only the mask* (and the follow up selections/reductions) would be
vectorized and it is up to the user to choose what is better for his
use case ?

what do you think ? (I'm posting it now but there is no hurry to solve
this issue for the 2.0 release !)

gael.



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/