Re: [eigen] news on the refactoring of the expression template mechanism


*To*: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
*Subject*: Re: [eigen] news on the refactoring of the expression template mechanism
*From*: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
*Date*: Fri, 21 Feb 2014 23:49:28 +0100

On Fri, Feb 21, 2014 at 5:48 PM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> very nice work!
>
> I did not walk through the source so far, so (some of) my
> questions/suggestions/comments might actually be trivial.
>
>
> About compound assignment operators (p14):
> Will this be extendable/specializable for FMA-Support? This could save
> operations for array expressions such as A+=B*C; (and we already have a
> pmadd packet function).

This could be done by specializing:

  Assignment<Dst,CwiseBinaryOp<scalar_prod_op,B,C>, add_assign_op, Dense2Dense>

as well as:

  evaluator<CwiseBinaryOp<scalar_sum_op,A,CwiseBinaryOp<scalar_prod_op,B,C>>>

for A+B*C.

> Regarding Temporaries (all pseudo-codes should be extendable to
> packet-code):
>
> p12: [ (a+b)*c ]
> Assume a, b are JxN and c is NxK with very large N but small J and K,
> especially min(J,K)==1. I think storing (a+b) to a temporary could/should
> theoretically be avoided (pseudo-code):
>>
>> for(int i=0; i<N; ++i) {
>>   // each element of (a+b) and c is accessed only once
>>   // at the cost of accessing Result multiple times
>>   Result += (a+b).col(i) * c.row(i);
>> }

This is already the case if c is a vector. If a and b are row vectors but c
is a matrix, then a+b is evaluated into a temp. In theory the temporary is
not needed, but we have to find a tradeoff between the genericity of the
product kernels and the number of their instantiations... Product kernels
are heavyweight. We could evaluate (a+b) in small chunks though...

> p20: [ (A+B).normalized() ]
> Theoretically, the norm could be accumulated while the expression is
> evaluated into the temporary, saving one walk through the vector.
> Furthermore, for this example, usually the result vector could be used to
> store the temporary (saving cache-accesses).
> Pseudo-code:
>>
>> norm2 = 0;
>> for(int i=0; i<N; ++i) {
>>   temp = A(i)+B(i);
>>   res(i) = temp;
>>   norm2 += temp*temp;
>> }
>> normInv = 1/sqrt(norm2);
>> for(int i=0; i<N; ++i) {
>>   res(i) *= normInv;
>> }

This is typically something that could be possible with the new generic
"kernels", but I don't know if that's worth the effort.

> Finally, regarding vectorization of (partially) unaligned objects:
> I think it would be nice, if we could somehow determine the coeff-read/write
> cost and the packet-read/write cost of the src and dst and decide at
> compile-time which path is more efficient (c.f. bug 256). I'm aware that's a
> bit wishful thinking, since for many expressions the costs are hard to
> determine and they also strongly depend on the target architecture (if we
> trust Agner's instruction timings [1] there is no difference between aligned
> and unaligned loads/stores anymore on IvyBridge/SandyBridge/Haswell,
> compared to a factor 2 or 4 difference on Wolfdale/Nehalem).

Yes, that's a planned feature, and actually a needed one once we have to
determine which packet size is best when multiple sizes are possible.

> Nothing of the above is a mandatory optimization at the moment, but it would
> be nice if all/most of them will be implementable without major
> refactorings.

Sure!

gael
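
Archive note: the two temporary-avoiding loops discussed in this message can be sketched in plain C++ with `std::vector`, without Eigen. This is only an illustration of the pseudo-code quoted above, not how Eigen's evaluators or product kernels are implemented. The function names `sum_normalized` and `row_sum_times_matrix`, and the row-major layout assumed for `c`, are hypothetical choices made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Eigen API): computes (A+B).normalized().
// The sum is stored directly in the result vector while the squared norm
// is accumulated in the same pass, so no separate temporary is needed.
std::vector<double> sum_normalized(const std::vector<double>& A,
                                   const std::vector<double>& B) {
    assert(A.size() == B.size());
    const std::size_t N = A.size();
    std::vector<double> res(N);
    double norm2 = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        const double temp = A[i] + B[i];
        res[i] = temp;          // result doubles as storage for the temporary
        norm2 += temp * temp;   // norm accumulated during the same walk
    }
    // Assumption: A+B is not the zero vector (otherwise this divides by zero).
    const double normInv = 1.0 / std::sqrt(norm2);
    for (std::size_t i = 0; i < N; ++i) {
        res[i] *= normInv;
    }
    return res;
}

// Hypothetical helper (not an Eigen API): computes (a+b)*c for row vectors
// a, b of length N and an N x K matrix c stored row-major (an assumption).
// Each element of (a+b) is formed once and used for a scaled-row update of
// the result, so (a+b) is never materialized.
std::vector<double> row_sum_times_matrix(const std::vector<double>& a,
                                         const std::vector<double>& b,
                                         const std::vector<double>& c,
                                         std::size_t K) {
    assert(a.size() == b.size());
    assert(c.size() == a.size() * K);
    std::vector<double> result(K, 0.0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double s = a[i] + b[i];        // element i of (a+b), read once
        for (std::size_t k = 0; k < K; ++k) {
            result[k] += s * c[i * K + k];   // result is revisited N times
        }
    }
    return result;
}
```

For example, `sum_normalized({3.0, 0.0}, {0.0, 4.0})` normalizes the vector (3, 4), and `row_sum_times_matrix` with a = (1, 2), b = (1, 0) and c = [[1, 2], [3, 4]] multiplies the row vector (2, 2) by c. The second function corresponds to the J == 1 case from the message, where each element of c is read once and the small result is updated repeatedly.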

