Re: [hatari-devel] Suspicious instruction & data cache hit/miss accounting



Hi,

On 02/01/2018 01:27 AM, Nicolas Pomarède wrote:
On 01/02/2018 at 00:19, Eero Tamminen wrote:
On 02/01/2018 12:36 AM, Nicolas Pomarède wrote:
On 31/01/2018 at 23:22, Eero Tamminen wrote:
[...]
>> I.e. how/why do most instructions have *neither* cache hits nor misses?

Based on our WinUAE code, it seems that instruction cache hits & misses
aren't counted for pipelined instructions, only for instructions
requiring cache fill, either because there's a branch or prefetch.

While accounting can be fixed later e.g. to mark anything not causing
misses as having a hit, I wonder whether the miss stats are correct.
Any comments?

(And as you can see, some of the executed instructions can incur
multiple i-cache hits or misses.)


I'm not sure why the same happens for the data cache.  I think hits &
misses are skipped in the CPU emulation core only if the data cache is
disabled.

However, the above case is EmuTOS idling in the TOS desktop.  I don't
think it would be toggling the data cache on & off...

Is it possible that most memory accesses are done through
accessors that don't actually go through the data cache checks?


It could be that I forgot to count hits/misses in some places.

Do you have a way to run the same test with your trace output customised to show disasm + hit/miss counters after each instruction? That way we could see when the counters are not incremented, which would give a hint about where values were not counted.

The CPU can also make a difference, as they use different cache systems. I guess it's a 68030?

After you enable profiling ("profile on"):

* the "profile caches" command then shows the overall statistics
   for all instructions executed since emulation was last continued.

* disassembly shows, in addition to instruction counts & cycles,
   the i-cache miss and d-cache hit counts, accumulated for
   each instruction address.

Cache info is based on the CpuInstruction.(I|D)_Cache_(hit|miss)
counters provided by the CPU core.  Profiling code zeroes
them after each executed instruction.
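
A minimal sketch of that accumulate-and-zero step, to make the
accounting concrete.  Only the I_Cache_hit / I_Cache_miss /
D_Cache_hit / D_Cache_miss field names come from the description
above; the struct layouts, the profile_item_t type and the
profile_update() function are hypothetical and are not the actual
Hatari profiler code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the CPU core's per-instruction counters;
 * only the field names are taken from the text above. */
typedef struct {
    uint32_t I_Cache_hit, I_Cache_miss;
    uint32_t D_Cache_hit, D_Cache_miss;
} cpu_instruction_t;

/* Hypothetical per-address profile slot (what disassembly would show). */
typedef struct {
    uint32_t count;   /* times the instruction at this address ran */
    uint32_t i_miss;  /* accumulated i-cache misses */
    uint32_t d_hit;   /* accumulated d-cache hits */
} profile_item_t;

/* Add the counters of the instruction that just executed to its
 * address slot, then zero them so the next instruction starts clean. */
static void profile_update(profile_item_t *item, cpu_instruction_t *cpu)
{
    assert(item != NULL && cpu != NULL);
    item->count++;
    item->i_miss += cpu->I_Cache_miss;
    item->d_hit  += cpu->D_Cache_hit;
    memset(cpu, 0, sizeof(*cpu));
}
```

With this scheme, an instruction for which the core increments none
of the four counters leaves no trace in the cache statistics at all.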

If you enable these during the test you did with idle TOS, do you get any disasm traces that would give examples of instructions / addresses which were not correctly counted in cache accesses?

Disassembly shows only i-cache misses and d-cache hits, so
from that you don't know whether something is missing.


However, for i-cache, I think it's clear from the CPU core sources
that they're counted only for instructions that trigger either
prefetch or pipeline stall (=branch).

Do you agree on that interpretation?  Because then:
* Those hit/miss counts also tell how often those events happen
* It should be fine to translate (on the profiler side) any
  instruction that doesn't generate a miss, as being a hit.
  Wouldn't it?
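
The profiler-side translation proposed above could look roughly like
this.  It is only a sketch of the idea: cache_counts_t and
effective_i_hits() are hypothetical names, and it assumes that an
instruction with no counted miss was served entirely from the cache:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of i-cache counters for one executed instruction. */
typedef struct {
    uint32_t hit;
    uint32_t miss;
} cache_counts_t;

/* Return the hit count to record: an instruction for which the core
 * counted neither a hit nor a miss is treated as one implied hit. */
static uint32_t effective_i_hits(const cache_counts_t *c)
{
    assert(c != NULL);
    if (c->hit == 0 && c->miss == 0)
        return 1;
    return c->hit;
}
```

Counts that the core did report are passed through unchanged, so the
miss statistics stay exactly as the CPU core produced them.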


What I don't understand for the i-cache is how you can get multiple
hits or misses for a single instruction.  Instructions are all
word sized & word aligned, so they cannot cross a cache line
boundary, so there should be only zero or one hit / miss,
shouldn't there?

And what about the data cache?  I can understand 2 misses if
the data is e.g. a long crossing a cache line boundary, but what
about larger numbers?  Or is it about how much data the miss
caused to be fetched into the cache?
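
To put numbers on the line-crossing case, here is a small sketch that
counts how many cache lines one access touches.  The 16-byte line
size is an assumption (four longwords per line, as on the 68030), and
cache_lines_touched() is a hypothetical helper, not something from
the CPU core:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache line size in bytes (four longwords per line). */
#define CACHE_LINE_SIZE 16u

/* Number of cache lines covered by an access of `size` bytes starting
 * at `addr`.  Address wrap-around at 4 GB is not handled. */
static uint32_t cache_lines_touched(uint32_t addr, uint32_t size)
{
    assert(size > 0);
    uint32_t first = addr / CACHE_LINE_SIZE;
    uint32_t last  = (addr + size - 1) / CACHE_LINE_SIZE;
    return last - first + 1;
}
```

For example, cache_lines_touched(0x100E, 4) gives 2, while any
aligned long gives 1.  Under this model a single long access never
touches more than two lines, so larger per-instruction counts would
need several accesses in one instruction, or a counter that measures
something other than lines.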


	- Eero


