|Re: [eigen] Tiny matrix in Eigen2|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Tiny matrix in Eigen2
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Thu, 17 Sep 2009 11:45:45 -0400
2009/9/17 WANG Xuewen <xuewen.wang@xxxxxxxxx>:
> I have no problem to change NMaxTinyMatrixDimension to a pair (for example
> to 6 or 8) and change to use Aligned so can possibly get the vectorization.
> I've done a quick test and it seems it did improve ( I'll do a bit more test
> to be sure).
OK, let's say right away that this still won't give you very good
vectorization. First, out of the 3 vectorization strategies in
Assign.h, only 2 will be applicable here. Second, since the size is
dynamic, they will have to do some runtime logic that pays off for
large enough matrices, but perhaps not at such small sizes.
So you can always try but it's not guaranteed to be faster, and may
even be slower (if the runtime logic is too costly to pay for itself).
In that sense, your DontAlign makes sense.
> Even if the dynamic dimension is not necessarily the same as 8 or 6, for
> example, if it is 3, will I be able to gain from vectorization?
A packet of 'double' contains 2 doubles. If the size is 3, only 1
packet can fit inside, and there remains 1 double, so vectorization
only lets you do 2 operations instead of 3 (for vector ops).
I seriously doubt that that is worth it! I expect vectorization to
make things slower there.
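To make that arithmetic concrete, here is a minimal sketch in plain C++
(no Eigen involved). The names PacketSplit and split_into_packets are
hypothetical helpers invented for this illustration, and the packet size
of 2 is the figure for 'double' given above.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Eigen: describes how `size` coefficients
// split into full SIMD packets plus leftover scalar coefficients.
struct PacketSplit {
    std::size_t packets;  // full packets, one SIMD operation each
    std::size_t scalars;  // leftover coefficients, one scalar operation each

    // Total number of operations for one coefficient-wise vector op.
    std::size_t total_ops() const { return packets + scalars; }
};

inline PacketSplit split_into_packets(std::size_t size, std::size_t packet_size) {
    return PacketSplit{size / packet_size, size % packet_size};
}
```

With size 3 and packet size 2 this gives 1 packet and 1 scalar, i.e. 2
operations instead of 3. The count leaves out the runtime logic needed to
decide where the packets start and end, which is the part that may cost
more than it saves at such sizes.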
Vectorization works well when:
- either the sizes are large (>10)
- or they are known at compile time and multiples of the packet size
>>> 2. Even if the storage is in the stack, but the dimension is not known
>>> at compile time. Will the loop get unrolled for most operations?
>> No, no loop will be unrolled at all, since that requires knowing the
>> exact size at compile time.
> Is it possible to extend Eigen2 to allow unroll the loop? We did it for the
> home made library with meta programming, but I'm not sure if it is possible
> to do it with Eigen2.
That's not possible to do without causing an overhead. If you want to
unroll here, you have to perform the operation on the whole allocated
array, not just the matrix. That requires initializing this array with
0's, otherwise you're slowed down by inf/nan values; and filling the
array with 0's has a cost. In Eigen, we don't initialize the arrays at
all, so creating a matrix on the stack has no cost at all.
Also, more issues appear if one wants to vectorize such operations.
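To illustrate the trade-off, here is a rough sketch in plain C++ of what
"operate on the whole allocated array" would look like. This is not how
Eigen works; kMaxSize, PaddedVec and add_full_capacity are hypothetical
names made up for this example.

```cpp
#include <cstddef>

// Hypothetical compile-time capacity of the tiny vector.
constexpr std::size_t kMaxSize = 8;

struct PaddedVec {
    double data[kMaxSize];
    std::size_t size;  // runtime size, assumed <= kMaxSize

    // The zero-fill below is the overhead discussed above: it is paid on
    // every construction, even when only a few coefficients are used.
    explicit PaddedVec(std::size_t n) : size(n) {
        for (std::size_t i = 0; i < kMaxSize; ++i) data[i] = 0.0;
    }
};

// The trip count is the compile-time constant kMaxSize, so the compiler is
// free to unroll this loop. It also touches the unused padding, which is why
// the padding must hold well-defined values (zeros) rather than garbage.
inline void add_full_capacity(const PaddedVec& a, const PaddedVec& b, PaddedVec& out) {
    for (std::size_t i = 0; i < kMaxSize; ++i) out.data[i] = a.data[i] + b.data[i];
    out.size = a.size;
}
```

For a size-3 vector this does 8 additions plus the zero-fill at
construction, so the unrolling only helps if that extra work is cheaper
than a short runtime loop.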
> I may take a look if you tell me that it is possible
> and give me hint on how. Or will vectorization be more interesting anyway?
I'd rather point you in another direction: are you really sure that
the size is not known at compile time?
Since you're dealing with such small sizes, there aren't many
combinations anyway. Perhaps you could get a list of all the sizes
that are actually used; it may not be that long. In that case it
may be a good idea to use completely fixed-size matrices. That means
that a lot more code will be instantiated, but that may be worth it if
you're after performance! You would then have if()'s in your code to
select between the code paths optimized for separate sizes.
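As a rough sketch of that idea in plain C++ (raw arrays stand in for
fixed-size matrices so that the example compiles without Eigen): add_fixed
and add_dispatch are hypothetical names, and the sizes 2, 3, 4 and 6 are
only an assumed list of "sizes actually used".

```cpp
#include <cstddef>

// Fixed-size kernel: N is known at compile time, so the compiler can unroll
// the loop completely. One copy of this code is instantiated per size used.
template <std::size_t N>
void add_fixed(const double* a, const double* b, double* out) {
    for (std::size_t i = 0; i < N; ++i) out[i] = a[i] + b[i];
}

// Runtime dispatcher: if()'s select the code path optimized for each size.
// The list of sizes here is an assumption made for this illustration.
inline void add_dispatch(std::size_t n, const double* a, const double* b, double* out) {
    if (n == 2) { add_fixed<2>(a, b, out); return; }
    if (n == 3) { add_fixed<3>(a, b, out); return; }
    if (n == 4) { add_fixed<4>(a, b, out); return; }
    if (n == 6) { add_fixed<6>(a, b, out); return; }
    // Fallback for any size outside the list: plain runtime loop.
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
```

The price is the one mentioned above: every listed size instantiates its
own copy of the kernel, so the binary grows with the number of sizes.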
>> M = N;
>> M.diagonal().cwise() += scalar;
>> (don't remember, perhaps this cwise()+= requires #include<Eigen/Array>)
> This doesn't work with 2.0.5 (doesn't compile, I tried it from the
Here it also works with 2.0.5. Are you sure that you #include <Eigen/Array>?
What's your compiler error?