Re: [hatari-devel] Suspicious instruction & data cache hit/miss accounting


Hi,

On 02/06/2018 03:29 PM, Nicolas Pomarède wrote:
[...]
> I wanted to add details to my latest mail, but as you guessed, the
differences you see are indeed mostly due to the prefetch / pipeline inside the 68020/30.

For the details, see "11.2.2 instruction pipe" in the 68030 user's manual.

Basically, the CPU has an internal 32-bit register named the "cache holding register" (CAHR). This register is used to fill the internal stages A, B, C and D of the CPU.

One of the differences from the 68000 is that this register is 32 bits wide, while on the 68000 the prefetch register is 16 bits. So, on the 68000, you have at least one memory access during every instruction to keep this 16-bit prefetch register filled.

On the 68030, you only need to refill once both words of the cache holding register have been pushed to stage A.

So, if we take the example of a flow of instructions where each instruction is 2 bytes (e.g. "adda.l d2,a2", "movea.l (a3),a3"), you can see that if the CAHR was filled just before, you can get 1 word without doing an external memory access, and without even doing an i-cache access.

Imagine a flow of 100 NOPs (1 word each): you will get 1 access to the i-cache every 2 instructions (it could be a hit or a miss). Every other instruction, you get a "free" access to the opcode.
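To make the NOP example concrete, here is a minimal sketch of that counting. It is a toy model, not how Hatari or the real 68030 does it: it assumes the holding register always contains one aligned long word (2 words), that any word outside it costs exactly one i-cache access, and it ignores stages B, C, D entirely. The function name and parameters are made up for illustration.

```python
def count_icache_accesses(instr_sizes_words, start_word=0):
    """Toy model of a 32-bit holding register feeding the instruction stream.

    instr_sizes_words: size of each instruction, in 16-bit words.
    start_word: word index of the first opcode (even = long word aligned).
    Returns (i-cache accesses, instructions needing no access at all).
    """
    accesses = 0            # fetches that had to go to the i-cache (hit or miss)
    free_instructions = 0   # instructions served entirely from the holding register
    held_long = None        # index of the long word currently held (assumption)
    word = start_word
    for size in instr_sizes_words:
        needed_access = False
        for w in range(word, word + size):
            if w // 2 != held_long:     # word is not in the holding register
                held_long = w // 2      # refill with the aligned long word
                accesses += 1
                needed_access = True
        if not needed_access:
            free_instructions += 1
        word += size
    return accesses, free_instructions


print(count_icache_accesses([1] * 100))  # (50, 50)
```

With 100 one-word instructions starting on a long word boundary, the model gives 50 accesses and 50 "free" instructions, which matches the 1-in-2 pattern described above.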

On the contrary, when you have an instruction involving a branch, CAHR must be refilled at the new PC, and you will need to access cache/external memory to do so (so the hit or miss counter will increase).
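The branch case can be sketched the same way. This is again only a toy model under stated assumptions: a hypothetical 2-instruction loop (a NOP at word 0, a branch back at word 1, both in the same aligned long word), and a taken branch that always discards the holding register, as described above. Whether real hardware refills when the target lies in the long word already held is not something this sketch claims to know. The access is attributed to the branch target here, which is also just a modelling choice.

```python
def count_refills(iterations, refill_on_branch=True):
    """Toy model: NOP at word 0, taken branch at word 1, looping back to word 0."""
    refills = 0
    held_long = None                # long word currently in the holding register
    for _ in range(iterations):
        for word in (0, 1):         # the NOP, then the branch opcode
            if word // 2 != held_long:
                held_long = word // 2
                refills += 1
        if refill_on_branch:
            held_long = None        # assumption: taken branch discards the register
    return refills


print(count_refills(10))                           # 10
print(count_refills(10, refill_on_branch=False))   # 1
```

With the refill-on-branch assumption, every loop iteration costs one cache access, so the hit/miss counters keep increasing even though the code never leaves one long word.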

Note that in the end, it doesn't necessarily mean that the code will be faster (it depends on the RAM speed); this just explains the flow of memory accesses. If your RAM is not capable of 32-bit access (as so-called fast RAM is), refilling CAHR will take 2 word accesses instead of 1 long word access.
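The arithmetic behind that remark, as a small sketch. It assumes the worst case where every refill goes to external memory (no i-cache hits) and a long word aligned start; the function names are hypothetical.

```python
def bus_accesses_per_refill(bus_width_bits, register_width_bits=32):
    """Bus accesses needed to fill the holding register once (rounded up)."""
    return (register_width_bits + bus_width_bits - 1) // bus_width_bits


def bus_accesses_for_nops(nop_count, bus_width_bits):
    """External accesses for a run of 1-word NOPs when every refill misses."""
    refills = (nop_count + 1) // 2      # one refill per 2 words, aligned start
    return refills * bus_accesses_per_refill(bus_width_bits)


print(bus_accesses_per_refill(16))      # 2
print(bus_accesses_per_refill(32))      # 1
print(bus_accesses_for_nops(100, 16))   # 100
print(bus_accesses_for_nops(100, 32))   # 50
```

On a 16-bit bus the 100 NOPs still cost 100 word accesses in this model, so fewer refills does not by itself mean less bus traffic.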

In the case of the i-cache counters in the profiler, maybe you can add a 3rd case besides hit and miss, like "prefetch", for when the hit and miss counters were both 0 for the current instruction.
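A sketch of what such a third counter could look like. This is not the Hatari profiler code; the data layout (one `(hits, misses)` pair per executed instruction) and the names are assumptions for illustration only.

```python
def summarize_icache(per_instruction):
    """Sum hits and misses, and count instructions where both were zero."""
    totals = {"hits": 0, "misses": 0, "prefetched": 0}
    for hits, misses in per_instruction:
        totals["hits"] += hits
        totals["misses"] += misses
        if hits == 0 and misses == 0:
            # opcode came from the holding register, no i-cache access recorded
            totals["prefetched"] += 1
    return totals


sample = [(1, 0), (0, 0), (0, 1), (0, 0)]
print(summarize_icache(sample))  # {'hits': 1, 'misses': 1, 'prefetched': 2}
```

Note that the third value counts instructions, while hits and misses count accesses, so the three numbers are not expected to add up to the instruction count.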

I added that statistic, but now I've started to wonder about the terminology:
what would be best understood by the users of the profiler?

Is "prefetch" the correct name for moving data from the i-cache to the CAHR
register, or do people normally interpret "prefetch" to mean reading
instructions from system RAM to the i-cache?

I.e. maybe the zero i-cache hit/miss case should be named as:
  "Cache holding register (CAHR) already refilled from i-cache"

instead of my current "Already prefetched" name?


	- Eero



