Re: [hatari-devel] DSP performance
For reference, here are Centurbo bench results:
Today WinUAE CPU core: FPU 812 MHz, DSP 57 MHz, CPU 27 MHz
Dec WinUAE CPU core: FPU 153 MHz, DSP 57 MHz, CPU 172 MHz
Old WinUAE CPU core: FPU 221 MHz, DSP 32 MHz, CPU 15 MHz
Old UAE CPU core: FPU 403 MHz, DSP 32 MHz, CPU 78 MHz
(moving mouse drops FPU and CPU numbers slightly)
I.e. compared to the old WinUAE core with Laurent's timings,
the current CPU runs 80%, DSP 78% and FPU >3x faster.
DSP timing being derived from the CPU would explain why it's
faster by the same amount as the CPU, but why is the FPU now so much faster?
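
The percentages above can be re-derived from the bench figures. This is only a small sketch of the arithmetic; the dictionary and function names are made up for illustration, and the numbers are the "Old WinUAE" and "Today WinUAE" rows quoted above.

```python
# Centurbo bench figures from the rows above, in MHz.
old_winuae = {"FPU": 221, "DSP": 32, "CPU": 15}
today_winuae = {"FPU": 812, "DSP": 57, "CPU": 27}


def speedup(new, old):
    """Return how many times faster `new` is than `old`."""
    return new / old


# Ratio of the current core to the old WinUAE core, per unit.
ratios = {unit: speedup(today_winuae[unit], old_winuae[unit])
          for unit in old_winuae}

for unit, ratio in ratios.items():
    print(f"{unit}: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% faster)")
```

The CPU and DSP ratios come out almost identical (about 1.8x), which is what one would expect if the DSP timing is derived from the CPU, while the FPU ratio is well above 3x.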
On Monday 29 June 2015, Laurent Sallafranque wrote:
> I believe the current way is the right one, as long as we manage to set
> the correct timings for each instruction.
> The static table was of course a best/worst-case approach, not an "exact
> timings" solution.
> As far as I know, we'll never reach the exact cycles for some
> instructions like mul or div, but if we can approach the correct timings
> for the current ones in all cpu modes (mmu, cycle exact, prefetch, ...)
> it would be a big step.
> Let's forget the table approach (I've kept a Hatari 1.7 on my hard drive
> to compare with future versions, so no need to keep it in the current
> On 29/06/2015 00:03, Nicolas Pomarède wrote:
> > On 28/06/2015 23:47, Laurent Sallafranque wrote:
> >> Hi all,
> >> Until now, I've always thought that the first fight for Falcon
> >> emulation was the accuracy of the CPU cycles, as the CPU is THE clock
> >> for the whole system.
> >> When I did the static cycles table in the previous versions of Hatari
> >> (until 1.8), I recomputed the whole table for 16-bit memory access
> >> and for .w or .l instructions (cycles are different due to the cycle
> >> access).
> >> Maybe the current 68030 cycles are for a 32-bit computer (as the Amiga
> >> 68020 is) and the cycles are not recomputed for a 16-bit bus. As the
> >> CPU core comes from WinUAE, it may be something like that.
> >> Maybe there's something else to search ;)
> >> I know my static table was not perfect at all, but it seemed to give
> >> timing accuracy not so far from a real Falcon.
> >> I spent more than 1 month recomputing the figures according to Mikro's
> >> documentation about 68030 cycles in the Falcon.
> >> I don't know where the cycles are computed in the new engine (I should
> >> take the time to have a closer look at this).
> > Hi
> > in WinUAE CPU (as in old UAE CPU), cycles are computed not with a
> > table but with some basic sets of "rules" that combine the time needed
> > to prefetch, to access memory, to do bit operations, arithmetic and so
> > on, taking into account the operand size.
> > On the average, it's possible the table gave better results, or better
> > results for the instructions that are most commonly used on Falcon
> > when cycle accuracy is needed.
> > But it didn't take the instruction/data caches into account using the
> > real logic of a 68030 (it used worst/best case values).
> > And in the end, it was too difficult to merge the new WinUAE CPU with this
> > table; there are too many differences between the two approaches.
> > Also, I have the feeling the table was based on the 68020, not the 68030? For
> > example, NOP took 2 cycles in cache and 4 cycles with no cache, but
> > 68030 doc says it's always 2 cycles (same for EXG dx,dy, timings are
> > different between cache and no cache, but it should not be the case).
> > All in all, I have no "fit all" solution. Keeping an old version of
> > the WinUAE CPU core was not good, as many 68000 CE behaviours are handled
> > much more cleanly/accurately in the latest WinUAE; and this also fixed
> > several games/demos for Falcon that didn't work before.
> > By comparing real Falcon cycle results with the latest Hatari, we will be
> > able to spot some differences and this could give hints on how some
> > instructions should work internally to reach the correct number of
> > cycles, but until then, we will have a mix of improvements/regressions.
> > Nicolas
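
The two approaches discussed in this thread, a static per-instruction table versus cycles composed from a few basic rules, can be contrasted with a rough sketch. This is not Hatari or WinUAE code: every name below is hypothetical, and all cycle costs are placeholder values chosen only for illustration. The only figures taken from the thread are the old table's NOP timings (2 cycles with cache, 4 without).

```python
from dataclasses import dataclass

# Static-table approach: one (cache hit, cache miss) pair per instruction.
# The NOP pair is the one quoted in the thread for the old table.
STATIC_TABLE = {"NOP": (2, 4)}


def table_cycles(mnemonic, cache_hit):
    """Look up a fixed best/worst-case value; no real cache logic involved."""
    best, worst = STATIC_TABLE[mnemonic]
    return best if cache_hit else worst


@dataclass(frozen=True)
class Costs:
    """Hypothetical per-component costs, NOT real 68030 timings."""
    prefetch: int = 2
    bus_transfer: int = 4
    alu: int = 2


def rule_cycles(mem_accesses, operand_bits, bus_bits, costs=Costs()):
    """Compose a cycle count from prefetch, memory access and ALU costs.

    An operand wider than the bus needs several bus transfers per access,
    e.g. a .l (32-bit) access needs two transfers on a 16-bit bus.
    """
    transfers_per_access = max(1, operand_bits // bus_bits)
    memory = mem_accesses * transfers_per_access * costs.bus_transfer
    return costs.prefetch + memory + costs.alu


if __name__ == "__main__":
    print("NOP from table, cache hit: ", table_cycles("NOP", True))
    print("NOP from table, cache miss:", table_cycles("NOP", False))
    print(".l access on 32-bit bus:   ", rule_cycles(1, 32, 32))
    print(".l access on 16-bit bus:   ", rule_cycles(1, 32, 16))
```

With these placeholder costs, the same .l access is more expensive on a 16-bit bus than on a 32-bit one, which is the kind of difference Laurent suspects is not being accounted for when the rules assume a 32-bit machine.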