On Tue, Aug 28, 2012 at 5:14 PM, Christoph Hertzberg
<chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 28.08.2012 16:39, Gael Guennebaud wrote:
>>
>> On Tue, Aug 28, 2012 at 4:24 PM, Rhys Ulerich <rhys.ulerich@xxxxxxxxx>
>> wrote:
>>>
>>> [...]
>>
>>
>> Partly. Blaze is also able to exploit AVX instructions (so a
>> theoretical x2 compared to SSE), however Blaze suffers from a HUGE
>> shortcoming: data are assumed to be perfectly aligned, including
>> inside a matrix. For instance, a row-major matrix is padded with empty
>> space such that each row is aligned. The extra space *must* be filled
>> with zeros and nothing else because they are exploited during some
>> computations... As a consequence, Blaze does not have any notion of
>> sub-matrices or the like, cannot map external data, etc. One cannot
>> even write a LU decomposition on top of Blaze. In other words, it is
>> not usable at all. This is also unfair regarding the benchmarks
>> because they are comparing Blaze with perfectly aligned matrices to
>> Eigen or MKL with packed and unaligned matrices.

I also realized that blaze::trans(A) * v is 5 times slower than A * v
=> trans(A) seems to be evaluated into a temporary.

> Actually, it would be nice to optionally support padding in Eigen as well,
> something like padding to the next multiple of the packet size or so.
> This would also have the benefit that e.g. col(i) expressions can actually
> return an aligned map.

Yes, for small objects that can offer a real speedup. We could either
extend Matrix, in which case I'd rather add an option bit flag, or even
add a new class.

>>> 2) Blaze shows faster performance on A*B*v for A and B matrices
>>> because they don't honor order of operations and their expression
>>> templates treat it as A*(B*v). This is moot as I can simply write
>>> A*(B*v) in Eigen.
>>
>>
>> Exactly, though we plan to do the same (and more) once the evaluators
>> are finalized.
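
To put rough numbers on the A*B*v point quoted above, here is a minimal sketch in plain C++. It is deliberately neither Eigen nor Blaze code: Mat and mul are hypothetical helpers made up for this illustration, and the product is the naive triple loop. It only counts scalar multiplications for the two parenthesizations.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major dense matrix, for illustration only.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c, double fill = 0.0)
        : rows(r), cols(c), data(r * c, fill) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Naive product x*y. Adds the number of scalar multiplications to `mults`.
// Assumes x.cols == y.rows (not checked in this sketch).
Mat mul(const Mat& x, const Mat& y, std::size_t& mults) {
    Mat r(x.rows, y.cols);
    for (std::size_t i = 0; i < x.rows; ++i)
        for (std::size_t j = 0; j < y.cols; ++j)
            for (std::size_t k = 0; k < x.cols; ++k) {
                r(i, j) += x(i, k) * y(k, j);
                ++mults;
            }
    return r;
}
```

For n x n matrices A, B and an n x 1 vector v, mul(mul(A, B, c), v, c) performs n^3 + n^2 multiplications while mul(A, mul(B, v, c), c) performs 2*n^2, e.g. 576 versus 128 for n = 8. That gap is what the A*(B*v) rewrite buys.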
>
>
> Hm, I wouldn't think it's a good idea to optimize this automatically without
> warnings/special flags. I can't think of an example right now where it would
> harm (of course, if v is a matrix as well, it's easy to find examples where
> the first is better than the second), but I would like to be able to
> distinguish between (A*B)*v and A*(B*v).

That's the usual remark, and the usual answer is:

  (A*B).parenthesis() * v;

(or .group(), or whatever...)

Also note that any optimized math library like Eigen is already
inserting arbitrary parentheses. For instance a simple v.sum() with
vectorization is not computed as:

  v0 + v1 + v2 + v3 + ...

but as

  (v0 + v2 + v4 + ...) + (v1 + v3 + ...)

or even:

  ((v0+v1) + (v2+v3)) + ((v4+v5) + (v6+v7))

when unrolling occurs. Similar arbitrary parentheses occur inside matrix
products, so implicitly computing A*(B*v) instead of (A*B)*v should only
exceptionally be an issue.

> And a completely different note, has anyone checked how they do their
> "automatic alias detection"? If, as Gael says, they don't support
> sub-matrices, do they just compare the data pointers, or can they actually
> decide it at compile time somehow?
> I wouldn't see how they can do it with methods like (Eigen syntax):
>
> void aliastest(VectorXd& res, const VectorXd &x){
>   res = {some complex expression with x};
> }

No clue.

gael

>
> Christoph
>
> --
> ----------------------------------------------
> Dipl.-Inf. Christoph Hertzberg
> Cartesium 0.049
> Universität Bremen
> Enrique-Schmidt-Straße 5
> 28359 Bremen
>
> Tel: +49 (421) 218-64252
> ----------------------------------------------
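
PS: regarding the v.sum() remark, a small sketch in plain C++ of how the grouping alone can change a floating-point result. This is not Eigen's actual reduction code; sum_sequential and sum_two_lanes are hypothetical names, and the second function merely mimics the (v0 + v2 + ...) + (v1 + v3 + ...) grouping with two scalar accumulators.

```cpp
#include <cstddef>
#include <vector>

// Left-to-right sum: ((v0 + v1) + v2) + v3 + ...
float sum_sequential(const std::vector<float>& v) {
    float s = 0.0f;
    for (float x : v) s += x;
    return s;
}

// Two interleaved accumulators: (v0 + v2 + ...) + (v1 + v3 + ...),
// i.e. the grouping a 2-wide vectorized reduction would produce.
float sum_two_lanes(const std::vector<float>& v) {
    float even = 0.0f, odd = 0.0f;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        even += v[i];
        odd += v[i + 1];
    }
    if (i < v.size()) even += v[i];  // leftover element for odd sizes
    return even + odd;
}
```

With v = {1e8f, 1.0f, -1e8f, 1.0f} and arithmetic carried out in IEEE single precision (no -ffast-math, no extended-precision intermediates), the sequential sum gives 1 because the first 1.0f is absorbed by 1e8f, whereas the two-lane sum gives the exact answer 2. Neither grouping is "the" correct one, which is the point: the result already depends on parentheses the user never wrote.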

