Re: [hatari-devel] profiler cycles (was New WinUAE core issue with Bad Mood)



Hi,

On Friday 12 December 2014, Nicolas Pomarède wrote:
> On 12/12/2014 10:22, Douglas Little wrote:
> > It is possible for an opcode to take 0 real cycles on 68030, depending
> > on how it is being 'measured'.
> > 
> > Most ops have a fixed execution time but they can overlap to a small
> > degree with one another (a complete occlusion is rare) and with
> > effective address calculation and memory writes.
> > 
> > But if you're seeing 0-cycle opcodes in bulk it is much more likely to
> > be a bug :)

10-20% of the instructions have zero cycles, and in places even more.
There can even be 3 consecutive instructions with zero cycles.  [1]

This happens only for Falcon & TT emulation, and only with WinUAE
CPU emulation.
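
A rough way to put a number on this from a trace is to count how often
the cycle counter fails to advance between two traced instructions. The
Python sketch below is only an illustration, not Hatari code. It assumes
that the difference between consecutive "video_cyc" values is the cost
charged to the earlier instruction, and the names TRACE_RE, parse_trace
and zero_cycle_fraction are made up for this example. The sample lines
come from the first excerpt under [1].

```python
import re

# Matches the start of a "--trace cpu_disasm" line as seen in this mail:
#   cpu video_cyc=<counter> <x>@<y> : <8-digit hex PC> ...
TRACE_RE = re.compile(r"^cpu video_cyc=(\d+)\s+\d+@\d+\s*:\s*([0-9A-Fa-f]{8})\b")

def parse_trace(lines):
    """Return (pc, video_cyc) pairs; lines that do not match are skipped."""
    entries = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m:
            entries.append((int(m.group(2), 16), int(m.group(1))))
    return entries

def zero_cycle_fraction(entries):
    """Fraction of instructions whose successor shows the same counter.

    Assumption: cost of instruction N = video_cyc[N+1] - video_cyc[N].
    The last entry has no successor and is not counted. A counter that
    wraps (negative difference) is simply treated as non-zero here.
    """
    deltas = [nxt[1] - cur[1] for cur, nxt in zip(entries, entries[1:])]
    if not deltas:
        return 0.0
    return sum(1 for d in deltas if d == 0) / len(deltas)

SAMPLE = [
    "cpu video_cyc=103372 460@201 : 00E0FBD2 2a6d 0010  MOVEA.L (A5, $0010) == $00004da8,A5",
    "cpu video_cyc=103374 462@201 : 00E0FBD6 4e95       JSR (A5)",
    "cpu video_cyc=103386 474@201 : 00E12A0E 4bf9 00ff 8a3c  LEA.L $00ff8a3c,A5",
    "cpu video_cyc=103386 474@201 : 00E12A14 720f       MOVE.L #$0000000f,D1",
    "cpu video_cyc=103386 474@201 : 00E12A16 3a02       MOVE.W D2,D5",
]

if __name__ == "__main__":
    print(f"{zero_cycle_fraction(parse_trace(SAMPLE)):.0%}")
```

On a full boot trace the same function would give the overall share;
the five sample lines are far too few to say anything about the 10-20%
figure itself.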


> yes, I agree that with the head/tail cycles in 68020, some overlap can
> and will occur, but I doubt it's the case here, that's why I think it
> would be better to have the trace of one run through the reported piece
> of code than to have only the sum/average.

I committed your patch and enabled debug output for successive
instructions with zero cycles.
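
To illustrate what such a check looks for, here is a minimal Python
sketch that groups consecutive zero-cycle instructions into runs. It is
not the profiler code; zero_cycle_runs and its min_len parameter are
hypothetical names, and it again assumes that an instruction's cost is
the difference between its "video_cyc" value and that of the next traced
instruction. The input pairs are copied from the second excerpt under
[1].

```python
def zero_cycle_runs(entries, min_len=2):
    """Group successive zero-cycle instructions into runs.

    entries: list of (pc, video_cyc) pairs in execution order.
    Assumption: an instruction took zero cycles when the next traced
    instruction shows the same video_cyc value. The last entry has no
    successor, so it can never be classified.
    """
    runs, current = [], []
    for (pc, cyc), (_, next_cyc) in zip(entries, entries[1:]):
        if next_cyc == cyc:
            current.append(pc)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = []
    if len(current) >= min_len:
        runs.append(current)
    return runs

# (pc, video_cyc) pairs from the trace starting at $E094EA
ENTRIES = [
    (0xE094EA, 102800),
    (0xE094EE, 102804),
    (0xE094F0, 102806),
    (0xE094F4, 102806),
    (0xE094F6, 102806),
    (0xE094FA, 102818),
]

if __name__ == "__main__":
    for run in zero_cycle_runs(ENTRIES):
        print(", ".join(f"${pc:x}" for pc in run))
```

For these six entries the sketch reports $e094f0 and $e094f4, the same
two opcodes listed in the corresponding warning.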

To see the issue, just boot TOS v3 or TOS v4 using:
	--trace cpu_disasm --parse profile.ini

with "profile.ini" containing the following line:
---
profile on
---



	- Eero

[1]
cpu video_cyc=103372 460@201 : 00E0FBD2 2a6d 0010                MOVEA.L (A5, $0010) == $00004da8,A5
cpu video_cyc=103374 462@201 : 00E0FBD6 4e95                     JSR (A5)
cpu video_cyc=103386 474@201 : 00E12A0E 4bf9 00ff 8a3c           LEA.L $00ff8a3c,A5
cpu video_cyc=103386 474@201 : 00E12A14 720f                     MOVE.L #$0000000f,D1
cpu video_cyc=103386 474@201 : 00E12A16 3a02                     MOVE.W D2,D5
WARNING: Zero cycles for successive opcodes:
$e12a0e : 4bf9 00ff 8a3c                       lea       $ff8a3c,a5                 0.00% (1, 0, 0)
$e12a14 : 720f                                 moveq     #$f,d1                     0.00% (1, 0, 0)
$e12a16 : 3a02                                 move.w    d2,d5                      0.00% (1, 0, 0)

and:

cpu video_cyc=102800 400@200 : 00E094EA 226f 0002                MOVEA.L (A7, $0002) == $00008862,A1
cpu video_cyc=102804 404@200 : 00E094EE 3411                     MOVE.W (A1),D2
cpu video_cyc=102806 406@200 : 00E094F0 c47c 0fff                AND.W #$0fff,D2
cpu video_cyc=102806 406@200 : 00E094F4 5489                     ADDA.L #$00000002,A1
WARNING: Zero cycles for successive opcodes:
$e094f0 : c47c 0fff                            and.w     #$fff,d2                   0.00% (1, 0, 0)
$e094f4 : 5489                                 addq.l    #2,a1                      0.00% (1, 0, 0)
cpu video_cyc=102806 406@200 : 00E094F6 2f49 0002                MOVE.L A1,(A7, $0002) == $00008862
cpu video_cyc=102818 418@200 : 00E094FA b47c 000f                CMP.W #$000f,D2
cpu video_cyc=102818 418@200 : 00E094FE 6210                     BHI.B #$00000010 == $00e09510 (F)
cpu video_cyc=102818 418@200 : 00E09500 e54a                     LSL.W #$00000002,D2
WARNING: Zero cycles for successive opcodes:
$e094fa : b47c 000f                            cmp.w     #$f,d2                     0.00% (1, 0, 0)
$e094fe : 6210                                 bhi.s     $e09510                    0.00% (1, 0, 0)
$e09500 : e54a                                 lsl.w     #2,d2                      0.00% (1, 0, 0)

and:

cpu video_cyc=103526 102@202 : 00E12AAC cbee ffda                MULS.W (A6, -$0026) == $000087da,D5
cpu video_cyc=103532 108@202 : 00E12AB0 da81                     ADD.L D1,D5
cpu video_cyc=103532 108@202 : 00E12AB2 43f3 5800                LEA.L (A3, D5.L*1, $00) == $003f8340,A1
cpu video_cyc=103532 108@202 : 00E12AB6 7208                     MOVE.L #$00000008,D1
WARNING: Zero cycles for successive opcodes:
$e12ab0 : da81                                 add.l     d1,d5                      0.00% (1, 0, 0)
$e12ab2 : 43f3 5800                            lea       (a3,d5.l),a1               0.00% (1, 0, 0)
$e12ab6 : 7208                                 moveq     #8,d1                      0.00% (1, 0, 0)

and:

cpu video_cyc=103708 284@202 : 00E12BB8 e24b                     LSR.W #$00000001,D3
cpu video_cyc=103714 290@202 : 00E12BBA dd46                     ADDX.W D6,D6
cpu video_cyc=103714 290@202 : 00E12BBC e24a                     LSR.W #$00000001,D2
cpu video_cyc=103714 290@202 : 00E12BBE dd46                     ADDX.W D6,D6
WARNING: Zero cycles for successive opcodes:
$e12bba : dd46                                 addx.w    d6,d6                      0.00% (1, 0, 0)
$e12bbc : e24a                                 lsr.w     #1,d2                      0.00% (1, 0, 0)
$e12bbe : dd46                                 addx.w    d6,d6                      0.00% (1, 0, 0)

and:

cpu video_cyc=105970 136@215 : 00E12BCC 4ad5                     TAS.B (A5)
cpu video_cyc=105974 140@215 : 00E12BCE 4e71                     NOP 
cpu video_cyc=105974 140@215 : 00E12BD0 6bfa                     BMI.B #$fffffffa == $00e12bcc (F)
cpu video_cyc=105978 144@215 : 00E12BD2 2c0c                     MOVE.L A4,D6
cpu video_cyc=105978 144@215 : 00E12BD4 671c                     BEQ.B #$0000001c == $00e12bf2 (T)
cpu video_cyc=105978 144@215 : 00E12BF2 51cd ffb6                DBF .W D5,#$ffb6 == $00e12baa (F)
WARNING: Zero cycles for successive opcodes:
$e12bd2 : 2c0c                                 move.l    a4,d6                      0.00% (1, 0, 0)
$e12bd4 : 671c                                 beq.s     $e12bf2                    0.00% (1, 0, 0)
$e12bf2 : 51cd ffb6                            dbra      d5,$e12baa                 0.00% (2, 24, 0)
cpu video_cyc=105978 144@215 : 00E12BF6 4e7a 5002                MOVEC CACR,D5

and:

cpu video_cyc=106492 146@217 : 00E00DA6 301f                     MOVE.W (A7)+,D0
cpu video_cyc=106498 152@217 : 00E00DA8 b058                     CMP.W (A0)+,D0
cpu video_cyc=106500 154@217 : 00E00DAA 6c16                     BGE.B #$00000016 == $00e00dc2 (F)
cpu video_cyc=106500 154@217 : 00E00DAC 3200                     MOVE.W D0,D1
cpu video_cyc=106500 154@217 : 00E00DAE e549                     LSL.W #$00000002,D1
WARNING: Zero cycles for successive opcodes:
$e00daa : 6c16                                 bge.s     $e00dc2                    0.00% (16, 0, 0)
$e00dac : 3200                                 move.w    d0,d1                      0.00% (16, 56, 0)
$e00dae : e549                                 lsl.w     #2,d1                      0.00% (16, 0, 0)


