[eigen] Automatic cache and block size determination
- To: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- Subject: [eigen] Automatic cache and block size determination
- From: Gael Guennebaud <gael.guennebaud@xxxxxxxxx>
- Date: Mon, 21 Jun 2010 23:53:09 +0200
Thanks to Thomas's cpuid code, I managed to query the L1 and L2 cache
sizes. The L1 and L2 cache sizes are now determined automatically at
runtime, the first time a matrix product is called. I've also
significantly simplified and optimized the way the blocking
parameters are computed from the L1 and L2 cache sizes.
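
To make the idea of deriving blocking parameters from cache sizes concrete, here is a minimal sketch. It is not Eigen's actual code: the names `Blocking` and `compute_blocking`, the register-block width `nr`, and the "half of L1" and "quarter of L2" factors are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not Eigen's API: pick a block depth (kc) and a block
// height (mc) so that the working set of a blocked product fits the caches.
struct Blocking {
    std::ptrdiff_t kc;  // block size along the shared (inner) dimension
    std::ptrdiff_t mc;  // number of lhs rows per block
};

inline Blocking compute_blocking(std::ptrdiff_t m, std::ptrdiff_t k,
                                 std::ptrdiff_t l1_bytes,
                                 std::ptrdiff_t l2_bytes,
                                 std::ptrdiff_t nr = 4) {  // assumed register-block width
    const std::ptrdiff_t scalar = sizeof(double);

    // Assumption: a kc x nr sliver of the rhs should occupy about half of L1.
    std::ptrdiff_t kc = std::min(k, l1_bytes / (2 * nr * scalar));
    kc = std::max<std::ptrdiff_t>(kc, 1);

    // Assumption: an mc x kc block of the lhs should occupy about a quarter of L2.
    std::ptrdiff_t mc = std::min(m, l2_bytes / (4 * kc * scalar));
    mc = std::max<std::ptrdiff_t>(mc, 1);

    return {kc, mc};
}
```

With these assumed factors, a 32 KB L1 and a 256 KB L2 give kc = 512 and mc = 16 for a 1000 x 1000 double product, while small matrices are left unblocked.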
Of course, on architectures with three levels of cache (e.g.,
Nehalem), it uses the shared L3 and ignores the small L2 cache.
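
The choice between L2 and L3 described above could be sketched as follows. The struct `DetectedCaches` and the function `outer_cache_bytes` are hypothetical names, and the convention that a non-positive size means "not detected" is an assumption borrowed from the -1 value the benchmark reports.

```cpp
#include <cstddef>

// Hypothetical container for detected cache sizes in bytes.
// Assumption: a value <= 0 (e.g. -1) means the level was not detected.
struct DetectedCaches {
    std::ptrdiff_t l1;
    std::ptrdiff_t l2;
    std::ptrdiff_t l3;
};

// Size used for the outer blocking level: the L3 when one is present,
// otherwise the L2 (which may itself be -1 if detection failed).
inline std::ptrdiff_t outer_cache_bytes(const DetectedCaches& c) {
    return c.l3 > 0 ? c.l3 : c.l2;
}
```

On a two-level machine this simply returns the L2 size, so the same blocking computation can be fed from either kind of architecture.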
You can test it by going into eigen/bench and running:
g++ -DNDEBUG -O2 -lrt bench_gemm.cpp -I .. -o gemm && ./gemm
At startup it reports the detected L1 and L2/L3 cache sizes, or -1 if
cpuid is not supported. So far I've tested it on Core 2 processors
only, so it would be great if someone could try it on other
architectures and report failures (-1 and/or wrong cache sizes).
Thanks a lot.