Re: [eigen] Automatic cache and block size determination
On Monday 21 June 2010 23:53:09, Gael Guennebaud wrote:
> You can test it by going into eigen/bench and running:
>
> g++ -DNDEBUG -O2 -lrt bench_gemm.cpp -I .. -o gemm && ./gemm
>
> At the beginning it reports the detected L1 and L2/L3 cache sizes and
> -1 if cpuid is not supported. Currently I've tested with core2
> processors only, so it would be great if someone could try it on other
> architectures and report failures (-1 and/or wrong cache sizes).
Great work :)
I think it works well here. Below is the output of gemm, followed by the /proc/cpuinfo entry for the first CPU (out of 4 identical ones). It all seems consistent.
I guess every product now pays a very small runtime cost for checking whether the cache sizes have already been computed, right?
Also, the computations involving cache sizes used to be done at compile time and no longer are. Is that anything to worry about?
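
To make the question concrete, here is a minimal sketch of the kind of "compute once, check on every call" pattern being asked about. It is only an illustration: CacheSizes, queryCacheSizesOnce and cachedCacheSizes are hypothetical names, the sizes are hard-coded placeholders rather than a real cpuid query, and none of this is taken from Eigen's actual code.

```cpp
#include <cstddef>

// Hypothetical container for the detected sizes, in bytes.
struct CacheSizes {
    std::ptrdiff_t l1;
    std::ptrdiff_t l2;
};

// Counts how often the (pretend) expensive query actually runs.
static int g_query_count = 0;

// Stand-in for the one-time hardware query. The values below are
// placeholders chosen for illustration, not the result of a real detection.
inline CacheSizes queryCacheSizesOnce() {
    ++g_query_count;
    return CacheSizes{64 * 1024, 512 * 1024};
}

// The first call runs the query; every later call only tests the hidden
// "already initialized" guard of the function-local static and returns
// the stored value.
inline const CacheSizes& cachedCacheSizes() {
    static const CacheSizes sizes = queryCacheSizesOnce();
    return sizes;
}
```

With such a pattern, calling cachedCacheSizes() from every product would cost one guard test and a load after the first call, which should be negligible next to a 2048x2048 product. Whether that matters for very small products is a separate question.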
orzel@berlioz hg/eigen/bench% g++ -DNDEBUG -O2 -lrt bench_gemm.cpp -I .. -o gemm && ./gemm
L1 cache size = 65536 KB
L2/L3 cache size = 512 KB
Matrix sizes = 2048x2048 * 2048x2048
blocking size = 8 x 2048
eigen cpu 4.33855s 3.95982 GFLOPS (8.82266s)
eigen real 4.34305s 3.95571 GFLOPS (8.84252s)
orzel@berlioz hg/eigen/bench% cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 5
model name : AMD Athlon(tm) II X4 620 Processor
stepping : 2
cpu MHz : 2600.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
bogomips : 5223.92
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
--
Thomas Capricelli <orzel@xxxxxxxxxxxxxxx>
http://www.freehackers.org/thomas