Re: [hatari-devel] profiler cycles (was New WinUAE core issue with Bad Mood)



10-20% of the instructions have zero cycles, in places even more.


That's what I detected when I added the new WinUAE core 2-3 years ago.
That's why I patched the code to add a big table of 68030 timings (with and without caches), all recomputed for 16-bit memory addressing (according to Motorola's documentation). But I think the big table is probably not the best way to do it. I'll have to take a closer look at this new version, just after I finish a small new feature for the DSP.

Laurent


On 14/12/2014 23:16, Eero Tamminen wrote:
Hi,

On Friday 12 December 2014, Nicolas Pomarède wrote:
On 12/12/2014 10:22, Douglas Little wrote:
It is possible for an opcode to take 0 real cycles on 68030, depending
on how it is being 'measured'.

Most ops have a fixed execution time but they can overlap to a small
degree with one another (a complete occlusion is rare) and with
effective address calculation and memory writes.

But if you're seeing 0-cycle opcodes in bulk it is much more likely to
be a bug :)
10-20% of the instructions have zero cycles, in places even more.
There can be even 3 consecutive instructions with zero cycles.  [1]

This happens only for Falcon & TT emulation, and only with WinUAE
CPU emulation.


Yes, I agree that with the head/tail cycles in 68020, some overlap can
and will occur, but I doubt it's the case here. That's why I think it
would be better to have the trace of one run through the reported piece
of code than to have only the sum/average.
I committed your patch and enabled debug output for successive
instructions with zero cycles.
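To make the head/tail argument concrete, here is a minimal sketch of the overlap rule as Motorola's timing tables describe it: the overlap between two neighbouring instructions is the smaller of the first one's tail and the second one's head. The `Timing` class, the `effective_cycles` helper and all the numbers are made up for illustration; they are not taken from Hatari or from a real timing table.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    name: str
    head: int    # leading cycles that may overlap with the previous instruction
    tail: int    # trailing cycles that may overlap with the next instruction
    cycles: int  # total cycles when the instruction runs in isolation


def effective_cycles(seq):
    """Return the cycles charged to each instruction after head/tail overlap."""
    result = []
    prev_tail = 0
    for ins in seq:
        # The overlap is limited by both the previous tail and the current head.
        overlap = min(prev_tail, ins.head)
        result.append(ins.cycles - overlap)
        prev_tail = ins.tail
    return result


# Illustrative values only (hypothetical, not real 68030 timings).
seq = [
    Timing("op_a", head=0, tail=2, cycles=6),
    Timing("op_b", head=2, tail=0, cycles=2),
    Timing("op_c", head=2, tail=0, cycles=2),
]
print(effective_cycles(seq))  # [6, 0, 2]
```

In this simplified model a short instruction can be fully hidden behind the tail of the previous one, but the one after it then has no tail left to hide behind, so a single zero is plausible while long runs of zeros are not.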

To see the issue, just boot TOS v3 or TOS v4 using:
	--trace cpu_disasm --parse profile.ini

with "profile.ini" containing the following line:
---
profile on
---



	- Eero

[1]
cpu video_cyc=103372 460@201 : 00E0FBD2 2a6d 0010                MOVEA.L (A5, $0010) == $00004da8,A5
cpu video_cyc=103374 462@201 : 00E0FBD6 4e95                     JSR (A5)
cpu video_cyc=103386 474@201 : 00E12A0E 4bf9 00ff 8a3c           LEA.L $00ff8a3c,A5
cpu video_cyc=103386 474@201 : 00E12A14 720f                     MOVE.L #$0000000f,D1
cpu video_cyc=103386 474@201 : 00E12A16 3a02                     MOVE.W D2,D5
WARNING: Zero cycles for successive opcodes:
$e12a0e : 4bf9 00ff 8a3c                       lea       $ff8a3c,a5
0.00% (1, 0, 0)
$e12a14 : 720f                                 moveq     #$f,d1
0.00% (1, 0, 0)
$e12a16 : 3a02                                 move.w    d2,d5
0.00% (1, 0, 0)

and:

cpu video_cyc=102800 400@200 : 00E094EA 226f 0002                MOVEA.L (A7, $0002) == $00008862,A1
cpu video_cyc=102804 404@200 : 00E094EE 3411                     MOVE.W (A1),D2
cpu video_cyc=102806 406@200 : 00E094F0 c47c 0fff                AND.W #$0fff,D2
cpu video_cyc=102806 406@200 : 00E094F4 5489                     ADDA.L #$00000002,A1
WARNING: Zero cycles for successive opcodes:
$e094f0 : c47c 0fff                            and.w     #$fff,d2
0.00% (1, 0, 0)
$e094f4 : 5489                                 addq.l    #2,a1
0.00% (1, 0, 0)
cpu video_cyc=102806 406@200 : 00E094F6 2f49 0002                MOVE.L A1, (A7, $0002) == $00008862
cpu video_cyc=102818 418@200 : 00E094FA b47c 000f                CMP.W #$000f,D2
cpu video_cyc=102818 418@200 : 00E094FE 6210                     BHI.B #$00000010 == $00e09510 (F)
cpu video_cyc=102818 418@200 : 00E09500 e54a                     LSL.W #$00000002,D2
WARNING: Zero cycles for successive opcodes:
$e094fa : b47c 000f                            cmp.w     #$f,d2
0.00% (1, 0, 0)
$e094fe : 6210                                 bhi.s     $e09510
0.00% (1, 0, 0)
$e09500 : e54a                                 lsl.w     #2,d2
0.00% (1, 0, 0)

and:

cpu video_cyc=103526 102@202 : 00E12AAC cbee ffda                MULS.W (A6, -$0026) == $000087da,D5
cpu video_cyc=103532 108@202 : 00E12AB0 da81                     ADD.L D1,D5
cpu video_cyc=103532 108@202 : 00E12AB2 43f3 5800                LEA.L (A3, D5.L*1, $00) == $003f8340,A1
cpu video_cyc=103532 108@202 : 00E12AB6 7208                     MOVE.L #$00000008,D1
WARNING: Zero cycles for successive opcodes:
$e12ab0 : da81                                 add.l     d1,d5
0.00% (1, 0, 0)
$e12ab2 : 43f3 5800                            lea       (a3,d5.l),a1
0.00% (1, 0, 0)
$e12ab6 : 7208                                 moveq     #8,d1
0.00% (1, 0, 0)

and:

cpu video_cyc=103708 284@202 : 00E12BB8 e24b                     LSR.W #$00000001,D3
cpu video_cyc=103714 290@202 : 00E12BBA dd46                     ADDX.W D6,D6
cpu video_cyc=103714 290@202 : 00E12BBC e24a                     LSR.W #$00000001,D2
cpu video_cyc=103714 290@202 : 00E12BBE dd46                     ADDX.W D6,D6
WARNING: Zero cycles for successive opcodes:
$e12bba : dd46                                 addx.w    d6,d6
0.00% (1, 0, 0)
$e12bbc : e24a                                 lsr.w     #1,d2
0.00% (1, 0, 0)
$e12bbe : dd46                                 addx.w    d6,d6
0.00% (1, 0, 0)

and:

cpu video_cyc=105970 136@215 : 00E12BCC 4ad5                     TAS.B (A5)
cpu video_cyc=105974 140@215 : 00E12BCE 4e71                     NOP
cpu video_cyc=105974 140@215 : 00E12BD0 6bfa                     BMI.B #$fffffffa == $00e12bcc (F)
cpu video_cyc=105978 144@215 : 00E12BD2 2c0c                     MOVE.L A4,D6
cpu video_cyc=105978 144@215 : 00E12BD4 671c                     BEQ.B #$0000001c == $00e12bf2 (T)
cpu video_cyc=105978 144@215 : 00E12BF2 51cd ffb6                DBF .W D5,#$ffb6 == $00e12baa (F)
WARNING: Zero cycles for successive opcodes:
$e12bd2 : 2c0c                                 move.l    a4,d6
0.00% (1, 0, 0)
$e12bd4 : 671c                                 beq.s     $e12bf2
0.00% (1, 0, 0)
$e12bf2 : 51cd ffb6                            dbra      d5,$e12baa
0.00% (2, 24, 0)
cpu video_cyc=105978 144@215 : 00E12BF6 4e7a 5002                MOVEC CACR,D5

and:

cpu video_cyc=106492 146@217 : 00E00DA6 301f                     MOVE.W (A7)+,D0
cpu video_cyc=106498 152@217 : 00E00DA8 b058                     CMP.W (A0)+,D0
cpu video_cyc=106500 154@217 : 00E00DAA 6c16                     BGE.B #$00000016 == $00e00dc2 (F)
cpu video_cyc=106500 154@217 : 00E00DAC 3200                     MOVE.W D0,D1
cpu video_cyc=106500 154@217 : 00E00DAE e549                     LSL.W #$00000002,D1
WARNING: Zero cycles for successive opcodes:
$e00daa : 6c16                                 bge.s     $e00dc2
0.00% (16, 0, 0)
$e00dac : 3200                                 move.w    d0,d1
0.00% (16, 56, 0)
$e00dae : e549                                 lsl.w     #2,d1
0.00% (16, 0, 0)
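
For anyone who wants to check a saved trace offline, the runs flagged above can also be found with a small script. This is only a rough sketch: it assumes the "cpu video_cyc=..." line layout seen in the excerpts in this mail, and that the counter shown is the value at the start of each instruction, so an instruction's cost is the difference to the next trace line. The names `TRACE_RE` and `zero_cycle_runs` are made up for this example and are not part of Hatari.

```python
import re

# Assumed layout: "cpu video_cyc=<count> <x>@<line> : <8 hex digit address> ..."
TRACE_RE = re.compile(r"^cpu video_cyc=(\d+)\s+\d+@\d+\s*:\s*([0-9A-Fa-f]{8})\b")


def zero_cycle_runs(lines, min_run=2):
    """Return lists of addresses of consecutive instructions costing 0 cycles."""
    entries = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m:  # wrapped operand lines and WARNING output are skipped
            entries.append((int(m.group(1)), m.group(2).upper()))

    runs, current = [], []
    # Pair each instruction with its successor; the last one has no known cost.
    for (cyc, addr), (next_cyc, _) in zip(entries, entries[1:]):
        if next_cyc == cyc:
            current.append(addr)
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = []
    if len(current) >= min_run:
        runs.append(current)
    return runs


sample = """\
cpu video_cyc=102800 400@200 : 00E094EA 226f 0002                MOVEA.L
(A7, $0002) == $00008862,A1
cpu video_cyc=102804 404@200 : 00E094EE 3411                     MOVE.W
(A1),D2
cpu video_cyc=102806 406@200 : 00E094F0 c47c 0fff                AND.W
#$0fff,D2
cpu video_cyc=102806 406@200 : 00E094F4 5489                     ADDA.L
#$00000002,A1
cpu video_cyc=102806 406@200 : 00E094F6 2f49 0002                MOVE.L A1,
(A7, $0002) == $00008862
cpu video_cyc=102818 418@200 : 00E094FA b47c 000f                CMP.W
#$000f,D2
""".splitlines()

print(zero_cycle_runs(sample))  # [['00E094F0', '00E094F4']]
```

The script does not handle the counter wrapping around between frames, so a real trace may need the frame boundary treated as the end of a run.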





