[hatari-devel] Hatari profiling question (was: Accelerating blitting on TT by code re-arrangement (on Emutos-devel))

[ Thread Index | Date Index | More lists.tuxfamily.org/hatari-devel Archives ]


Eero Tamminen schrieb:

On 7.2.2021 10.54, Christian Zietz wrote:
That's what I used for my initial analysis in this case, too. Although,
tbh, I'm not fully sure how to interpret the results. I suppose some
instructions are counted with zero cycles because they're fully absorbed
by surrounding instructions? But how can an instruction that is executed
1 million times be responsible for 2 million instruction cache misses?

Can you provide example profiler disassembly?

(Maybe on the Hatari mailing list?)

From my EmuTOS VDI profiling:

$00e21794 : adda.w    d2,a0       2.79% (1178064, 2361790, 1419, 0)
$00e21796 : adda.w    d3,a1       2.79% (1178064, 2353471, 38, 0)
$00e21798 : move.l    d1,d0       2.79% (1178064, 7068967, 1178050, 0)
$00e2179a : move.w    (a0),d0     2.79% (1178064, 10689954, 190, 577010)
$00e2179c : swap      d0          2.79% (1178064, 7069362, 1178150, 0)
$00e2179e : move.l    d0,d1       2.79% (1178064, 76, 0, 0)
$00e217a0 : rol.l     d4,d0       2.79% (1178064, 76, 0, 0)
$00e217a2 : jmp       (a2)        2.79% (1178064, 14139013, 2356248, 0)

Like I said, I'm not sure how to interpret the I-cache misses,
particularly in the last line. Is it because it's a JMP and both the
cache miss while fetching the instruction as well as the cache miss
while fetching the jump target count towards the number? Or is it
because the cache misses for the instruction *preceding* the JMP (ROL.L,
shown with 0 I-cache misses) are counted towards the JMP instruction?

Regards
Christian
--
Christian Zietz  -  CHZ-Soft  -  czietz@xxxxxxx
WWW: https://www.chzsoft.de/
PGP/GnuPG-Key-ID: 0x52CB97F66DA025CA / 0x6DA025CA



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/