[eigen] Re: about .lazy()
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: [eigen] Re: about .lazy()
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Sat, 15 Aug 2009 11:54:11 +0200
I forgot to say that, to keep compatibility with the current
solution, it is enough to change .lazy() so that it only removes the
"MayAliasBit" flag.
gael.
On Fri, Aug 14, 2009 at 11:36 PM, Gael
Guennebaud<gael.guennebaud@xxxxxxxxx> wrote:
> Hi all,
>
> Some of you have already noticed that the current devel branch might
> look broken because, e.g.:
>
> D = C + (A*B).lazy();
>
> does or does not compile depending on the size of the matrices... I
> know this is not a very nice situation, and my suggestion to solve
> this mess is to remove the .lazy() function. Here are some arguments
> against ".lazy()":
>
> a - it is a generic concept, but it only makes sense for product expressions
>
> b - it is quite difficult to fully understand, and so it's difficult
> to use it well
>
> c - it covers two different features at once:
>
> c1 - it means that the result does not alias with the operands of
> the product, but for that purpose it makes more sense to control that
> via a special operator=, like res.noalias() = ...
>
> c2 - it also means that the product should not be evaluated
> immediately, but evaluated as a standard expression. However, in
> practice it is (almost) never a good idea to do so, and in the rare
> cases where it does help, the speed difference is negligible.
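
To make point c1 concrete, here is a minimal sketch of the aliasing hazard. It deliberately uses plain arrays rather than Eigen, and the names Mat2, product_noalias and product_with_temporary are hypothetical, chosen only for this illustration:

```cpp
#include <array>

// Hypothetical 2x2 row-major matrix type, used only for this illustration.
using Mat2 = std::array<double, 4>;

// Writes a*b directly into res, one coefficient at a time.
// The result is only correct if res is a different object from a and b,
// i.e. if the caller can promise "no alias".
void product_noalias(const Mat2& a, const Mat2& b, Mat2& res) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            res[2 * i + j] = a[2 * i] * b[j] + a[2 * i + 1] * b[2 + j];
}

// Evaluates a*b into a temporary first, then copies it into res.
// This stays correct even when res is the same object as a or b.
void product_with_temporary(const Mat2& a, const Mat2& b, Mat2& res) {
    Mat2 tmp{};
    product_noalias(a, b, tmp);
    res = tmp;
}
```

Calling product_noalias(A, B, A) overwrites coefficients of A that later iterations still need to read, so the result is wrong, whereas product_with_temporary(A, B, A) gives the expected product at the price of one temporary.
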
>
> For large matrices, my last statement is obviously true. So if you
> wonder what happens for small matrices, here is a benchmark for small
> fixed-size and dynamic-size matrices which evaluates D = C + A*B; using
> three different strategies:
>
> "Eval" : D = C + (A*B).eval();
>
> "Lazy" : D = C + (A*B).lazy(); // here lazy means both "eval as an
> expression" and "no-alias"
>
> "Optimal": (D = C) += (A*B).lazy(); // here lazy is only used to mean
> "no-alias"
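
As a rough illustration of what the three strategies do in terms of memory traffic, here is a sketch with plain loops. It does not use Eigen, the Mat type and the function names are hypothetical, and naive loops like these say nothing about the MFlops reported in this thread; they only show the order of operations:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense square matrix: n*n doubles in row-major order.
struct Mat {
    std::size_t n;
    std::vector<double> v;
    explicit Mat(std::size_t size) : n(size), v(size * size, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return v[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return v[i * n + j]; }
};

// "Eval": compute A*B into a temporary, then add C in a second pass.
Mat eval_strategy(const Mat& A, const Mat& B, const Mat& C) {
    Mat tmp(A.n);
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t k = 0; k < A.n; ++k)
            for (std::size_t j = 0; j < A.n; ++j)
                tmp(i, j) += A(i, k) * B(k, j);
    Mat D(A.n);
    for (std::size_t idx = 0; idx < D.v.size(); ++idx)
        D.v[idx] = C.v[idx] + tmp.v[idx];  // simple coefficient-wise addition
    return D;
}

// "Lazy": no temporary; each coefficient of D computes its own dot product
// as part of the expression C(i,j) + (A*B)(i,j).
Mat lazy_strategy(const Mat& A, const Mat& B, const Mat& C) {
    Mat D(A.n);
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t j = 0; j < A.n; ++j) {
            double acc = C(i, j);
            for (std::size_t k = 0; k < A.n; ++k)
                acc += A(i, k) * B(k, j);
            D(i, j) = acc;
        }
    return D;
}

// "Optimal": copy C into D, then accumulate the product directly into D.
// This assumes D does not alias A or B.
Mat optimal_strategy(const Mat& A, const Mat& B, const Mat& C) {
    Mat D = C;
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t k = 0; k < A.n; ++k)
            for (std::size_t j = 0; j < A.n; ++j)
                D(i, j) += A(i, k) * B(k, j);
    return D;
}
```

All three functions return the same matrix; they differ only in whether a temporary is allocated and in how the product and the addition are interleaved.
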
>
> Here are the results with Eigen 2.0 (in MFlops, higher is better;
> fix/dyn = fixed/dynamic size, e/l/o = Eval/Lazy/Optimal):
>
> size   fix+e   fix+l   fix+o   dyn+e   dyn+l   dyn+o
>    2    1134    1501    1415     137     250     131
>    3    2442    1672    1469     283     401     267
>    4    5473    3495    5033     652     945     630
>    5    2359    1763    1567     586     697     580
>    6    1889    1765    1772     836     977     828
>    7    2110    1821    1643     815     881     792
>    8    3143    3286    3140    1247    1366    1213
>    9    2412    1827    1715     874     881     859
>   10    1931    1850    1832    1159    1198    1137
>   11    2451    1859    1792    1040    1035    1030
>   12    2876    3082    2943    1494    1431    1464
>   13    2453    1825    1759    1153    1130    1136
>   14    1903    1789    1813    1388    1380    1398
>   15    2422    1787    1717    1236    1226    1213
>   16    3055    3126    3077    3709    1574    4077
>   17    2319    1735    1710    2316    1258    2408
>
>
> and with the devel branch:
>
> size   fix+e   fix+l   fix+o   dyn+e   dyn+l   dyn+o
>    2    1073       0    1541      71       0     105
>    3    2457       0    1532     179       0     250
>    4    4849       0    4141     452       0     626
>    5    2149       0    1508     580       0     781
>    6    2423       0    1676     796       0    1031
>    7    1860       0    1778     919       0    1169
>    8    2283       0    2407    1708       0    2291
>    9    1938       0    2155    1819       0    2248
>   10    2247       0    2343    1923       0    2254
>   11    1775       0    2084    1888       0    2144
>   12    3341       0    3625    3047       0    3580
>   13    2718       0    3245    2827       0    3305
>   14    3202       0    3306    2885       0    3184
>   15    2672       0    3096    2794       0    2748
>   16    4870       0    5135    4342       0    4890
>   17    3560       0    4264    3914       0    4428
>
> Recall that with the devel branch the "lazy" solution does not
> compile, hence the zeros...
>
> As you can see, overall the devel branch is faster, which is good news
> because we have not put any effort into optimizing small matrices
> further. The second remark is that the "lazy" solution performs
> poorly, even for very small matrices. The reasons are twofold. First,
> evaluating the product into a temporary allows the addition to be
> vectorized. Second, for small fixed-size objects, temporaries are put
> on the stack, and therefore they cost nothing.
>
> So to summarize, I'd be in favor of removing .lazy(), replacing the
> EvalBeforeAssignBit flag with a MightAliasBit flag, and adding a
> no-alias mechanism on the result side.
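
One way such a result-side mechanism could look is sketched below. This is only a toy model under my own assumptions, not the Eigen implementation: Mat2, Product, NoAlias and evaluate_into are hypothetical names, and the deferred product is reduced to a pair of references.

```cpp
#include <array>

struct Mat2;  // hypothetical 2x2 matrix, defined below

// Deferred product: it only remembers its operands.
struct Product {
    const Mat2& lhs;
    const Mat2& rhs;
};

struct NoAlias;  // proxy returned by Mat2::noalias()

struct Mat2 {
    std::array<double, 4> v;  // row-major coefficients

    Mat2() : v{} {}
    explicit Mat2(const std::array<double, 4>& coeffs) : v(coeffs) {}

    // Default assignment: the product may alias *this, so it is
    // evaluated into a temporary first.
    Mat2& operator=(const Product& p);

    // Result-side promise that *this is not an operand of the product.
    NoAlias noalias();
};

// Building a product does no arithmetic yet.
inline Product operator*(const Mat2& a, const Mat2& b) { return {a, b}; }

// Writes p.lhs * p.rhs coefficient by coefficient into dst.
inline void evaluate_into(const Product& p, std::array<double, 4>& dst) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            dst[2 * i + j] = p.lhs.v[2 * i] * p.rhs.v[j]
                           + p.lhs.v[2 * i + 1] * p.rhs.v[2 + j];
}

struct NoAlias {
    Mat2& target;
    // No temporary: the product is written straight into the target.
    Mat2& operator=(const Product& p) {
        evaluate_into(p, target.v);
        return target;
    }
};

inline Mat2& Mat2::operator=(const Product& p) {
    std::array<double, 4> tmp{};
    evaluate_into(p, tmp);
    v = tmp;
    return *this;
}

inline NoAlias Mat2::noalias() { return NoAlias{*this}; }
```

With this sketch, R = A * B; is always safe, R.noalias() = A * B; skips the temporary, and A.noalias() = A * B; breaks the promise and produces a wrong result, which is the trade-off the user takes on explicitly.
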
>
>
> Finally, there is also the question of whether operator+= and -=
> should be "no-alias" by default. This is because I think that in 99%
> of cases, when you write:
>
> m += <product>
>
> it's very unlikely that m is one of the operands of the product. The
> drawback is that this might be confusing for the user (because
> operator= and operator+= would behave differently wrt aliasing).
>
>
> Any opinions or better solutions?
>
> cheers,
> Gael.
>
--
Gaël Guennebaud
Iparla - INRIA Bordeaux
(+33)5 40 00 37 95