Re: [eigen] AVX/LRB and SSE with SoA support
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] AVX/LRB and SSE with SoA support
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Thu, 2 Sep 2010 10:56:53 +0200
On Wed, Sep 1, 2010 at 9:23 PM, keller <christoph-keller@xxxxxx> wrote:
> On 08/23/2010 08:53 PM, Gael Guennebaud wrote:
>> Well, Eigen is a general-purpose math library, so such "hacks" do
>> not fit well with our objectives. We have to find a better solution.
> If one wants to use the Vector units of the processor and store masks for
> comparisons, the comparison operators cannot return values of type bool.
The problem is that we cannot know in advance whether a comparison
should return bools or bitmasks. Also the nature of the bitmasks
depends on the underlying vectorization engine.
So the idea is to have powerful high level expressions such that there
is no need to expose special scalar types. For instance, in:
c = (a < b).select(x,y);
the comparison (a < b) returns an expression of bool, not a matrix of bool,
and so vectorization is still possible. Then, when we call select on it, we
can know whether this whole expression can be vectorized. If so, we would
generate something like:
c(i) = ei_pselect(ei_plt(a(i),b(i)),x(i),y(i));
where ei_pselect would be implemented in terms of pand/pandnot/por for
SSE2 and SSE3, and using BLENDVP* with SSE4. Note that the BLENDVP*
instructions read only the first bit and are therefore much more
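To make the two strategies concrete, here is a rough sketch of what such a select could look like for four floats. The name select_lt4 is made up for this illustration and is not the actual ei_pselect. It assumes GCC/Clang-style feature macros (__SSE2__, __SSE4_1__) and falls back to plain integer bit operations elsewhere. For reference, BLENDVPS tests the most significant bit of each 32-bit mask element.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 (also pulls in the SSE float intrinsics)
#endif
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1: _mm_blendv_ps
#endif

// Hypothetical helper (not Eigen API):
// c[i] = (a[i] < b[i]) ? x[i] : y[i] for i in 0..3.
void select_lt4(const float* a, const float* b,
                const float* x, const float* y, float* c)
{
#if defined(__SSE2__)
    __m128 va = _mm_loadu_ps(a), vb = _mm_loadu_ps(b);
    __m128 vx = _mm_loadu_ps(x), vy = _mm_loadu_ps(y);
    // Each lane becomes all ones where a < b, all zeros otherwise.
    __m128 mask = _mm_cmplt_ps(va, vb);
#if defined(__SSE4_1__)
    // BLENDVPS: takes vx where the mask lane's top bit is set, else vy.
    __m128 r = _mm_blendv_ps(vy, vx, mask);
#else
    // SSE2 path: (mask & x) | (~mask & y)
    __m128 r = _mm_or_ps(_mm_and_ps(mask, vx), _mm_andnot_ps(mask, vy));
#endif
    _mm_storeu_ps(c, r);
#else
    // Portable fallback doing the same and/andnot/or on the raw bits.
    for (int i = 0; i < 4; ++i) {
        std::uint32_t m = (a[i] < b[i]) ? 0xFFFFFFFFu : 0u;
        std::uint32_t xi, yi;
        std::memcpy(&xi, &x[i], sizeof xi);
        std::memcpy(&yi, &y[i], sizeof yi);
        std::uint32_t ri = (m & xi) | (~m & yi);
        std::memcpy(&c[i], &ri, sizeof ri);
    }
#endif
}
```

The point is that the mask never leaves the vector registers and never becomes a bool, and the caller only sees the high-level select.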
> The only typesafe solution I see here is to introduce types like IntBool,
> FloatBool, DoubleBool etc. (These could be template types like
> EigenBool<type X>...). The size of the template is like the size of X, so it
> can store the mask. One can add functions like toBool(), select and so on.
> One could even add a Conversion-To-Bool expression template ;-)
> I understand this seems to be a lot of work, but the reward is a significant
> increase in speed. In the future the increase will become even bigger,
> because of wider vector registers.
> I wrote part of the demo program using plugins, but this is not enough, as
> Eigen already has comparison operators, so one would have to derive in this
> case to avoid changing the original code.
> You could at least take the few Intrinsics specialisations and expression
> templates from the patch.
> Thanks for all your time.
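
For readers following the thread, the typed-mask idea quoted above could be sketched roughly as follows. Everything here is hypothetical: MaskBool, MaskBits and maskLess are names invented for this illustration (standing in for the proposed EigenBool<X>), and the scalar bit tricks only mimic what a packet-level version would do. This is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical trait: unsigned integer type with the same size as T.
template <typename T> struct MaskBits;
template <> struct MaskBits<float>        { typedef std::uint32_t type; };
template <> struct MaskBits<double>       { typedef std::uint64_t type; };
template <> struct MaskBits<std::int32_t> { typedef std::uint32_t type; };

// Hypothetical stand-in for the proposed EigenBool<X>: a comparison
// result stored as an all-ones or all-zeros mask as wide as T.
template <typename T>
struct MaskBool {
    typedef typename MaskBits<T>::type Bits;
    Bits bits;

    explicit MaskBool(bool b) : bits(b ? ~Bits(0) : Bits(0)) {}

    // Conversion back to a plain bool.
    bool toBool() const { return bits != 0; }

    // Branch-free select: (mask & x) | (~mask & y) on the raw bits.
    T select(T x, T y) const {
        Bits xb, yb;
        std::memcpy(&xb, &x, sizeof(T));
        std::memcpy(&yb, &y, sizeof(T));
        Bits rb = (bits & xb) | (~bits & yb);
        T r;
        std::memcpy(&r, &rb, sizeof(T));
        return r;
    }
};

// Hypothetical comparison returning a typed mask instead of bool.
template <typename T>
MaskBool<T> maskLess(T a, T b) { return MaskBool<T>(a < b); }
```

With this, maskLess(a, b).select(x, y) plays the role of (a < b) ? x : y. The cost is that the mask type becomes visible in user code, which is the part the expression-based approach tries to avoid.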