[hatari-devel] Your advice is needed for 68030 - DSP synchro



Hi,

I'm starting to get better synchronisation between the CPU and the DSP in cycle-exact mode.
(I still have to finish switching to the 68030 cycle table soon.)

While testing rot3Dbmp (I'm pretty sure of the timings of this program), I noticed that it was not working.
It needs 4 more CPU cycles (8 DSP cycles) per loop.

The 68030 syncs with the DSP once at the start, then runs the following loop for the whole screen.

cpu video_cyc=   158 158@  0 : 0005CA7A 4e71                     NOP.L
cpu video_cyc=   160 160@  0 : 0005CA7C 4e71                     NOP.L
cpu video_cyc=   162 162@  0 : 0005CA7E 4e71                     NOP.L
cpu video_cyc=   164 164@  0 : 0005CA80 30d1                     MOVE.W (A1),(A0)+
cpu video_cyc=   172 158@  0 : 0005CA7A 4e71                     NOP.L
cpu video_cyc=   174 160@  0 : 0005CA7C 4e71                     NOP.L
cpu video_cyc=   176 162@  0 : 0005CA7E 4e71                     NOP.L
cpu video_cyc=   178 164@  0 : 0005CA80 30d1                     MOVE.W (A1),(A0)+
....


From what I read somewhere, the 68030 accesses the DSP host interface one byte at a time.
A "move.w value,Host_interface" therefore needs 2 accesses to the host interface.

To access the host interface, iomemtabFalcon calls DSP_HandleWriteAccess or DSP_HandleReadAccess in dsp.c. These 2 functions split the 68030 access to the DSP host interface memory into byte accesses (via a "for" loop), but they don't consume any cycles for multi-byte accesses.

I've added the following code to both functions.
(I've included DSP_HandleReadAccess as the example; the change is the same for DSP_HandleWriteAccess.)


@@ -647,10 +648,14 @@
 {
     Uint32 addr;
     Uint8 value;
+    Uint8 cmpt = 0;

     for (addr = IoAccessBaseAddress; addr < IoAccessBaseAddress+nIoMemAccessSize; addr++)
     {
 #if ENABLE_DSP_EMU
         value = dsp_core_read_host(addr-DSP_HW_OFFSET);
+        if (cmpt > 0)
+            M68000_AddCycles(4);
 #else
         /* this value prevents TOS from hanging in the DSP init code */
         value = 0xff;
@@ -658,6 +663,7 @@

         Dprintf(("HWget_b(0x%08x)=0x%02x at 0x%08x\n", addr, value, m68k_getpc()));
         IoMem_WriteByte(addr, value);
+        cmpt ++;
     }
 }


I assume that when an access to the host interface is wider than 1 byte, each byte after the first one adds 4 cycles to the CPU (so 8 cycles to the DSP).

(With this change, rot3Dbmp, bound2 and bound3 run correctly without having to add +2 cycles to each DSP instruction as I did before.)


The new cycles are :

cpu video_cyc=   158 158@  0 : 0005CA7A 4e71                     NOP.L
cpu video_cyc=   160 160@  0 : 0005CA7C 4e71                     NOP.L
cpu video_cyc=   162 162@  0 : 0005CA7E 4e71                     NOP.L
cpu video_cyc=   164 164@  0 : 0005CA80 30d1                     MOVE.W (A1),(A0)+
cpu video_cyc=   174 158@  0 : 0005CA7A 4e71                     NOP.L
cpu video_cyc=   176 160@  0 : 0005CA7C 4e71                     NOP.L
cpu video_cyc=   178 162@  0 : 0005CA7E 4e71                     NOP.L
cpu video_cyc=   180 164@  0 : 0005CA80 30d1                     MOVE.W (A1),(A0)+
....

You can see that the move.w now takes 4 more cycles because of the 2-byte access (in the trace, the next loop iteration starts at video_cyc 174 instead of 172).


Do you think this is the right way to do it?


Regards

Laurent



