Re: [hatari-devel] Suspicious instruction & data cache hit/miss accounting
Here's an example disassembly from EmuTOS 0.9.9.1 under Falcon emulation.
Instructions which have either zero instruction cache hits & misses,
or zero data cache hits & misses, are marked with '*':
$00e4cf4e: d1ee fff6 adda.l $fff6(a6),a0 (185, 2640, 185, 190)
$00e4cf52: 2d48 fff6 move.l a0,$fff6(a6) * (185, 2685, 185, 0)
$00e4cf56: 4444 neg.w d4 * (185, 0, 0, 0)
$00e4cf58: 3d44 fff0 move.w d4,$fff0(a6) * (185, 1665, 185, 0)
$00e4cf5c: 3c00 move.w d0,d6 * (185, 1110, 185, 0)
$00e4cf5e: 0246 000f andi.w #$f,d6 * (185, 1128, 188, 0)
$00e4cf62: 3d46 ffe6 move.w d6,$ffe6(a6) * (185, 2036, 185, 0)
$00e4cf66: 5349 subq.w #1,a1 * (185, 0, 0, 0)
$00e4cf68: 3239 0000 22a0 move.w $22a0,d1 (185, 3330, 370, 0)
$00e4cf6e: 0240 fff0 andi.w #$fff0,d0 * (185, 1110, 185, 0)
$00e4cf72: 45f9 00e0 b9be lea $e0b9be,a2 * (185, 1110, 185, 0)
$00e4cf78: 1432 2800 move.b (a2,d2.l),d2 (185, 2590, 185, 0)
$00e4cf7c: 4882 ext.w d2 * (185, 1110, 185, 0)
$00e4cf7e: e468 lsr.w d2,d0 * (185, 0, 0, 0)
$00e4cf80: 0280 0000 ffff andi.l #$ffff,d0 * (185, 2220, 370, 0)
$00e4cf86: d0b9 0000 044e add.l $44e,d0 (185, 4062, 185, 1)
$00e4cf8c: 3409 move.w a1,d2 * (185, 1110, 185, 0)
$00e4cf8e: c4c1 mulu.w d1,d2 * (185, 3885, 0, 0)
$00e4cf90: 2240 movea.l d0,a1 * (185, 1110, 185, 0)
$00e4cf92: d3c2 adda.l d2,a1 * (185, 0, 0, 0)
$00e4cf94: 2d49 fff2 move.l a1,$fff2(a6) (184, 2760, 184, 0)
As you can see, the marked instructions are the majority (as the
profiler's cache hit/miss histogram also indicates).
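For anyone who wants to count the marked lines mechanically, here is a minimal Python sketch. It assumes the line layout shown in the listing above (address, opcode words and mnemonic, optional '*', then four counters in parentheses). The names `LINE_RE`, `parse_line` and `count_starred` are hypothetical and not part of Hatari, and the four counters are kept as opaque integers because their meaning is not spelled out here.

```python
import re

# Assumed layout: "$addr: words mnemonic operands [*] (n, n, n, n)"
LINE_RE = re.compile(
    r"^\$(?P<addr>[0-9a-fA-F]+):\s+(?P<text>.*?)\s*(?P<star>\*)?\s*"
    r"\((?P<c0>\d+), (?P<c1>\d+), (?P<c2>\d+), (?P<c3>\d+)\)\s*$"
)

SAMPLE = [
    "$00e4cf4e: d1ee fff6 adda.l $fff6(a6),a0 (185, 2640, 185, 190)",
    "$00e4cf52: 2d48 fff6 move.l a0,$fff6(a6) * (185, 2685, 185, 0)",
    "$00e4cf56: 4444 neg.w d4 * (185, 0, 0, 0)",
    "$00e4cf78: 1432 2800 move.b (a2,d2.l),d2 (185, 2590, 185, 0)",
]


def parse_line(line):
    """Return (address, text, starred, counters) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    counters = tuple(int(m.group(name)) for name in ("c0", "c1", "c2", "c3"))
    return int(m.group("addr"), 16), m.group("text"), m.group("star") is not None, counters


def count_starred(lines):
    """Return (number of '*' lines, number of parsed lines)."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    return sum(1 for p in parsed if p[2]), len(parsed)
```

Feeding the full listing to `count_starred` gives the share of instructions for which one of the caches recorded neither a hit nor a miss.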
If you want more output, I pushed a commit that shows the info
after you set "DEBUG" to 1 in profilecpu.c, rebuild Hatari,
start Falcon or TT emulation, and enable profiling:
It's common enough that you see it immediately, regardless
of what you run and on what 030 TOS version.
On 02/02/2018 11:41 AM, Nicolas Pomarède wrote:
On 02/02/2018 at 00:18, Eero Tamminen wrote:
Disassembly shows only i-cache misses and d-cache hits, so
from that you don't know whether something is missing.
But what if you add your own printf after the disassembly to print
all hit/miss counters after each instruction?
However, for i-cache, I think it's clear from the CPU core sources
that they're counted only for instructions that trigger either
prefetch or pipeline stall (=branch).
Do you agree with that interpretation? Because then:
* Those hit/miss counts also tell how often those events happen
* It should be fine to translate (on the profiler side) any
instruction that doesn't generate a miss, as being a hit.
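The second point could be sketched on the post-processing side roughly as follows. This is only an illustration in Python, not Hatari code: `inferred_icache_hits` is a hypothetical name, and it assumes the first value in each parenthesized tuple is the execution count and the third is the i-cache miss count. Because a single execution can record more than one miss, the result is clamped at zero and can undercount.

```python
def inferred_icache_hits(executions: int, misses: int) -> int:
    """Treat every execution that did not record a miss as a hit.

    Assumption: at most one miss is attributed per execution. When the
    profiler records more misses than executions, the result is clamped to 0.
    """
    return max(executions - misses, 0)
```

For a line such as (185, 0, 0, 0) this yields 185 inferred hits, while lines with 185 or 370 misses yield 0.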
What I don't understand for the i-cache is how you can get multiple
hits or misses for a single instruction. Instructions are all
word-sized & word-aligned, so they cannot cross a cache line
boundary, so there should be only zero or one hit/miss,
And what about the data cache? I can understand 2 misses if
the data is e.g. a long crossing a cache line boundary, but what
about larger numbers? Or is it about how much data the miss caused
to be fetched into the cache?
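As a rough aid for both questions, the following Python sketch counts how many aligned units an access touches. It is not taken from the CPU core sources. It assumes a 16-byte cache line (the 68030 line size) and assumes that instruction fetches are counted once per newly fetched aligned longword. Both function names are hypothetical.

```python
def units_touched(addr: int, size: int, unit: int) -> int:
    """Number of aligned `unit`-byte blocks covered by bytes addr..addr+size-1."""
    return (addr + size - 1) // unit - addr // unit + 1


def new_longwords(instrs, unit: int = 4):
    """For straight-line code, count aligned longwords each instruction
    needs that no earlier instruction in the list already covered.

    instrs: list of (address, length_in_bytes) in execution order.
    """
    highest = None  # index of the highest longword fetched so far
    counts = []
    for addr, length in instrs:
        first = addr // unit
        last = (addr + length - 1) // unit
        start = first if highest is None else max(first, highest + 1)
        counts.append(max(last - start + 1, 0))
        highest = last if highest is None else max(highest, last)
    return counts


# Addresses and lengths taken from the listing, $e4cf62 .. $e4cf80.
LISTING = [
    (0xE4CF62, 4), (0xE4CF66, 2), (0xE4CF68, 6), (0xE4CF6E, 4), (0xE4CF72, 6),
    (0xE4CF78, 4), (0xE4CF7C, 2), (0xE4CF7E, 2), (0xE4CF80, 6),
]
```

For data, `units_touched(0x100e, 4, 16)` is 2 (a long crossing a line boundary), while an aligned long gives 1. For instructions, the first entry of `new_longwords(LISTING)` depends on what was fetched before $e4cf62, so ignore it. The remaining entries are 0, 2, 1, 1, 1, 1, 0, 2, which matches the third counter divided by the 185 executions for $e4cf66 to $e4cf80 in the listing. That is at least consistent with the counters being updated per prefetched longword rather than per instruction, though it does not prove it.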
A movem could generate several cache misses.
But it's hard to conclude anything without any real opcode example.
Maybe some are perfectly normal, maybe for some I forgot to count some
hit/miss, hard to tell without actual instructions leading to these