Re: [hatari-devel] DSP performance


To: hatari-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [hatari-devel] DSP performance
From: Nicolas Pomarède <npomarede@xxxxxxxxxxxx>
Date: Wed, 01 Jul 2015 11:37:10 +0200

On 01/07/2015 11:17, Adam Klobukowski wrote:

IMO, it can. You know, in advance, how many cycles it will take to
emulate an instruction, so you know when it 'happens'. Knowing the microcode
(or internal CPU layout) would be perfect, but not really necessary; you
just need to 'measure' when the instruction tries to access the bus. It
wouldn't be much slower, because there would be a lot of empty cycles,
and computation of some instructions would 'smear' over many cycles. This
could also be a good base for otherwise tricky emulation of border
removal, palette tricks, sync scrolling and so on.

The problem we actually have with the Falcon in Hatari is that we *don't know in advance* the instructions' cycles (at least not all of them are correct).

If you consider the case of STF + 68000 emulation, where instruction cycles are known, Hatari has been able for many years to run all the tricky shifter effects without a master clock that runs each component (cpu, dma, fdc, ...) for 2 cycles on every bus slot.

So the problem is not choosing whether the cpu is the main clock or whether another clock masters everything; the problem is knowing exactly the number of cycles of each instruction. Knowing the microcode is a plus, because you then know exactly when accesses are made, and this can solve some tricky timing issues (this is also available in 68000 CE mode at the moment, but not for 68020/30).

Whichever way you turn the problem, without a correct cycle count for each instruction you get a speed difference for the cpu, and between cpu and dsp. This is not a problem of choosing a reference clock.
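
To put a rough number on that drift, here is a minimal sketch in C. It assumes the Falcon's nominal clocks (68030 at 16 MHz, DSP56001 at 32 MHz) and that the dsp is advanced in proportion to the cycles the cpu emulation reports. The helper names dsp_cycles_for() and dsp_drift() are hypothetical and are not Hatari functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of dsp cycles that elapse while the cpu
 * spends cpu_cycles, for the given clock frequencies in Hz. */
static int64_t dsp_cycles_for(int64_t cpu_cycles, int64_t cpu_hz, int64_t dsp_hz)
{
    assert(cpu_hz > 0);
    return cpu_cycles * dsp_hz / cpu_hz;
}

/* Hypothetical helper: error, in dsp cycles, accumulated after `count`
 * executions of an instruction that the emulator charges emulated_cycles
 * for while the real hardware needs real_cycles.
 * Positive result: the emulated dsp ran too far ahead of the cpu. */
static int64_t dsp_drift(int64_t emulated_cycles, int64_t real_cycles,
                         int64_t count, int64_t cpu_hz, int64_t dsp_hz)
{
    int64_t cpu_error = (emulated_cycles - real_cycles) * count;
    return dsp_cycles_for(cpu_error, cpu_hz, dsp_hz);
}
```

With these assumptions, an instruction charged 6 cycles instead of 4 and executed 1000 times leaves the dsp 4000 cycles ahead of where it should be, whichever component is taken as the reference clock.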

In the case of the STF + 68000, since we know the microcode, we could run each instruction by splitting it into 2-cycle steps, something like:


 while ( 1 )
  {
    run_cpu ( 2 );
    run_fdc ( 2 );
    run_shifter ( 2 );
    run_ste_dma_sound ( 2 );
    ...
    add_cycle_master_clock ( 2 );
  }

But when you really look at the cases you would have to handle, you see it's inefficient to always split each cpu instruction into 2-cycle steps. For example, splitting a DIVU or a MOVEM that takes 100 cycles would be a nightmare to handle, because you would need to save the internal context of each instruction to stop it and restore it 2 cycles later.

It's more efficient to run each instruction as a whole, but to update all other components each time the microcode of the instruction does a bus access, for example (this is how WinUAE works in CE mode for Amiga: during the emulation of 1 complete instruction, it will update copper, blitter, fdc, ...).
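
A minimal sketch of that idea in C, under stated assumptions: every name here (Machine, MicroStep, sync_components, run_instruction) is hypothetical and is not taken from Hatari or WinUAE, an instruction is modelled as a list of micro-steps, and a bus access is charged 4 cycles as on the 68000. The other components are only brought up to date when the instruction is about to touch the bus, and once more when it ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_ACCESS_CYCLES 4  /* assumed cost of one 68000 bus access */

/* Hypothetical machine state, not a Hatari or WinUAE structure. */
typedef struct {
    uint64_t now;         /* master clock, in cpu cycles */
    uint64_t synced_to;   /* cycle up to which other components were run */
    unsigned sync_count;  /* how many catch-up passes were needed */
} Machine;

/* Run fdc, shifter, dma sound, ... for the cycles elapsed since the
 * last catch-up. The real per-component calls are left out here. */
static void sync_components(Machine *m)
{
    uint64_t elapsed = m->now - m->synced_to;
    if (elapsed == 0)
        return;
    /* run_fdc(elapsed); run_shifter(elapsed); ... would go here */
    m->synced_to = m->now;
    m->sync_count++;
}

/* One micro-step: some internal cycles, optionally ending with a bus access. */
typedef struct {
    unsigned internal_cycles;
    int bus_access;
} MicroStep;

/* Execute one complete instruction without ever suspending it. */
static void run_instruction(Machine *m, const MicroStep *steps, size_t n)
{
    assert(m != NULL);
    for (size_t i = 0; i < n; i++) {
        m->now += steps[i].internal_cycles;
        if (steps[i].bus_access) {
            sync_components(m);        /* others catch up before the bus is used */
            m->now += BUS_ACCESS_CYCLES;
        }
    }
    sync_components(m);                /* account for the last cycles */
}
```

For an instruction made of a bus access, 6 internal cycles and a second bus access, this advances the clock by 14 cycles with only two catch-up passes, instead of seven iterations of a 2-cycle loop, and no instruction context ever has to be saved and restored.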


Nicolas

