- If it's a 68030/40/60, you should use I_Cache_miss, I_Cache_hit,
D_Cache_miss, D_Cache_hit.
Unused values should remain zero, so it shouldn't be a problem to
output both instruction & data cache values, right?
So, a 32-bit read could yield:
- 1 hit
- or 1 miss
- or 1 hit and 1 miss
- or 2 hits
- or 2 misses
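
As a rough illustration of the list above, the expected combinations could be checked with something like the following. This is only a sketch: `long_read_counts_ok()` is a hypothetical helper, not anything in Hatari, and it assumes a single 32-bit read does at most two cache lookups.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of Hatari): checks that the hit/miss
 * pair reported for one 32-bit read is one of the listed combinations. */
static bool long_read_counts_ok(unsigned hits, unsigned misses,
                                bool cache_enabled)
{
    unsigned lookups = hits + misses;

    if (!cache_enabled)
        return lookups == 0;   /* cache disabled: 0 hits, 0 misses */

    /* 1 hit, 1 miss, 1 hit + 1 miss, 2 hits or 2 misses */
    return lookups == 1 || lookups == 2;
}
```

Note that this is per read, while the profiler counts per instruction, so an instruction with several reads can legitimately exceed two lookups.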
I'm seeing more instruction cache misses per instruction than that,
up to 6 misses per instruction, just from booting to the TOS4 desktop
and going through the desktop menus.
WARNING: 6 CPU instruction cache misses > 5 at 0xe00c9a:
$00e1c3b8 : 4e73 rte
0.00% (7, 248, 17, 0)
$00e00c9a : 3f00 move.w d0,-(sp)
0.00% (1, 8, 0, 0)
WARNING: 6 CPU instruction cache misses > 5 at 0xe03288:
$00e1c236 : 4eb9 00e0 946a jsr $e0946a
0.30% (294183, 9415132, 1176836, 0)
$00e03288 : 48e7 f0f0 movem.l d0-d3/a0-a3,-(sp)
0.00% (401, 14708, 324, 0)
Interestingly, the above happens only without MMU. With MMU, the maximum
number of i-cache misses per instruction is 4 for the same use-case.
(Previous Hatari WinUAE core had only up to 3 i-cache misses per
instruction.)
Of course, if the cache is disabled, you get 0 hits and 0 misses.
Also, in order to not slow down the main cpu loop in newcpu.c, it's the
external profiler *that must clear the cache hit/miss counters*. This
way, the counters are cleared only when needed; there is no need to clear
them every time in newcpu.c if the profiler is not used anyway.
This is fine (and I noticed you already made this change).
The same clearing also needs to be done when profiling is
(re-)initialized.
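
To make the clearing scheme concrete, here is a minimal sketch of the profiler side. The field names are the ones mentioned earlier in this thread, but the structs and the `profile_collect()` / `profile_init()` functions are hypothetical and only illustrate the idea: the CPU core just increments, and the profiler reads and clears.

```c
#include <string.h>

/* Counters incremented by the CPU core. Field names follow the thread;
 * the struct layout itself is an assumption for this sketch. */
typedef struct {
    unsigned I_Cache_miss, I_Cache_hit;
    unsigned D_Cache_miss, D_Cache_hit;
} cache_counters_t;

/* Hypothetical profiler-side totals. */
typedef struct {
    unsigned long long i_miss, i_hit, d_miss, d_hit;
} profile_totals_t;

/* Profiler, after each instruction: accumulate the counts, then clear
 * them so that the CPU loop never has to. */
static void profile_collect(profile_totals_t *totals, cache_counters_t *cpu)
{
    totals->i_miss += cpu->I_Cache_miss;
    totals->i_hit  += cpu->I_Cache_hit;
    totals->d_miss += cpu->D_Cache_miss;
    totals->d_hit  += cpu->D_Cache_hit;
    memset(cpu, 0, sizeof(*cpu));
}

/* Profiler (re-)initialization: drop counts that accumulated while
 * profiling was off, so they are not charged to the first instruction. */
static void profile_init(profile_totals_t *totals, cache_counters_t *cpu)
{
    memset(totals, 0, sizeof(*totals));
    memset(cpu, 0, sizeof(*cpu));
}
```

Without the clear in the init path, whatever the core counted while profiling was disabled would show up as a huge hit/miss count on the first profiled instruction.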