|Re: [eigen] FLENS C++ expression template Library has excellent documentation|
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] FLENS C++ expression template Library has excellent documentation
- From: Christian Mayer <mail@xxxxxxxxxxxxxxxxx>
- Date: Sat, 18 Apr 2009 17:43:54 +0200
Rohit Garg wrote:
>> So, in which area does Intel MKL still have a long-term lead? I would
>> say parallelization. We haven't started that yet and it is probably a
>> very, very tough one. It's what I have in mind when I say that a
>> BLAS/LAPACK wrapper is still welcome.
> Why do you think parallelization is very difficult? Do you mean
> parallelization infrastructure? AFAICS, using OpenMP will be cool. Let
> the compiler handle all the dirty business etc. This is something I want
> to explore (time availability is of course important!), so I would like
> some heads up.
(disclaimer: I haven't done much parallelization myself, but I have
listened with much interest to a few lectures and various discussions on
the subject)
Do NOT use OpenMP in our case!
OpenMP is great for parallelizing a few loops in old code where you can't
spend the time to do it right. If you have the choice, you should always
rethink every algorithm and implement it in a parallel way (with the
threading lib of your choice). You get much more control and you can
think ahead.
In the case of EIGEN, the expression templates give us a very strong
base that can support such an approach. As the compiler knows the
calculations ahead of time, it could parallelize some calculations.
E.g. look at
E = A*B + C*D
You could do it the dumb way: a parallel A*B, then a parallel C*D,
and at the end a parallel E = prod1 + prod2. That's what OpenMP would do.
But wouldn't it be much wiser to run A*B in one thread, C*D in another,
and then E = prod1 + prod2? That's much better for data locality...
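To make the idea concrete, here is a minimal sketch of the
one-thread-per-product scheme. This is plain C++ with std::thread, not
actual Eigen code; the flat-vector matrix type and the function names
are made up for illustration only:

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a dense matrix: a flat, row-major buffer.
using Mat = std::vector<double>;

// Naive n x n matrix product, C = A * B.
Mat multiply(const Mat& A, const Mat& B, int n) {
    Mat C(n * n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < n; ++k)
            for (int j = 0; j < n; ++j)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
    return C;
}

// Evaluate E = A*B + C*D with one thread per product: each thread
// touches only its own pair of operands, which is the data-locality
// advantage over parallelizing each operation separately.
Mat eval_sum_of_products(const Mat& A, const Mat& B,
                         const Mat& C, const Mat& D, int n) {
    Mat prod1, prod2;
    std::thread t1([&] { prod1 = multiply(A, B, n); });
    std::thread t2([&] { prod2 = multiply(C, D, n); });
    t1.join();
    t2.join();
    Mat E(n * n);
    for (int i = 0; i < n * n; ++i)
        E[i] = prod1[i] + prod2[i];
    return E;
}
```

In a real expression-template setting the library would, of course,
inspect the whole expression tree at compile time and assign subtrees
to tasks, rather than hand-writing the threads like this.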
But changing that code to use a very lightweight scheduler and different
tasks with minimal locking and still the best data locality is hard work.
But it'll be worth it - for the big matrix case (I *guess* small, fixed
matrices won't benefit at the EIGEN level of parallelisation. There the
bigger algorithms have to take care of it).