Re: [hatari-devel] DSP performance
Hi,
For reference, here are the Centurbo benchmark results:
Today WinUAE CPU core: FPU 812 MHz, DSP 57 MHz, CPU 27 MHz
Dec WinUAE CPU core: FPU 153 MHz, DSP 57 MHz, CPU 172 MHz
Old WinUAE CPU core: FPU 221 MHz, DSP 32 MHz, CPU 15 MHz
Old UAE CPU core: FPU 403 MHz, DSP 32 MHz, CPU 78 MHz
(moving the mouse drops the FPU and CPU numbers slightly)
I.e. compared to the old WinUAE core with Laurent's timings,
the current CPU runs 80% faster, the DSP 78% faster and the
FPU more than 3x faster.
DSP timing being derived from the CPU would explain why it's
faster by about the same amount as the CPU, but why is the FPU
now so much faster?
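
The ratios above can be rechecked with a few lines of Python. This is only a sketch that redoes the arithmetic on the "Today" and "Old WinUAE" rows; the names `current`, `old_winuae` and `speedup` are made up for the illustration and are not part of Hatari or the Centurbo benchmark.

```python
# Centurbo results quoted above, in MHz ("Today" vs. "Old WinUAE" rows).
current = {"FPU": 812, "DSP": 57, "CPU": 27}
old_winuae = {"FPU": 221, "DSP": 32, "CPU": 15}


def speedup(new, old):
    """Return new/old as a ratio (1.0 means no change)."""
    return new / old


# Ratio of the current result to the old WinUAE result, per unit.
ratios = {unit: speedup(current[unit], old_winuae[unit]) for unit in current}

for unit, ratio in ratios.items():
    print(f"{unit}: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% faster)")
```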
- Eero
On Monday 29 June 2015, Laurent Sallafranque wrote:
> I believe the current way is the right one, as long as we manage to set
> the correct timings for each instruction.
> The static table was of course a best/worst-case approach, not an "exact
> timings" solution.
>
> As far as I know, we'll never reach the exact cycles for some
> instructions like mul or div, but if we can approach the correct timings
> for the current ones in all cpu modes (mmu, cycle exact, prefetch, ...)
> it would be a big step.
>
> Let's forget the table approach (I've kept a Hatari 1.7 on my hard drive
> to compare with future versions, so no need to keep it in the current
> versions).
>
> Laurent
>
> On 29/06/2015 00:03, Nicolas Pomarède wrote:
> > On 28/06/2015 23:47, Laurent Sallafranque wrote:
> >> Hi all,
> >>
> >> Until now, I've always thought that the first fight for Falcon
> >> emulation was the accuracy of the CPU cycles, as the cpu is THE clock
> >> for all the system.
> >>
> >> When I did the static cycles table in the previous versions of Hatari
> >> (up to 1.8), I recomputed the whole table for 16-bit memory access
> >> and for .w or .l instructions (cycles are different due to the cycle
> >> access).
> >>
> >> Maybe the current 68030 cycles are for a 32-bit computer (as the Amiga
> >> 68020 is) and the cycles are not recomputed for a 16-bit bus. As the
> >> CPU core comes from WinUAE, it may be something like that.
> >> Maybe there's something else to look for ;)
> >>
> >> I know my static table was not perfect at all, but it seemed to give
> >> timings that were not too far from a real Falcon.
> >> I spent more than 1 month recomputing the figures according to Mikro's
> >> documentation about 68030 cycles in the Falcon.
> >>
> >> I don't know where the cycles are computed in the new engine (I should
> >> take the time to have a closer look at this).
> >
> > Hi
> >
> > in WinUAE CPU (as in old UAE CPU), cycles are computed not with a
> > table but with some basic sets of "rules" that combine the time needed
> > to prefetch, to access memory, to do bit operations, arithmetic and so
> > on, taking into account the operand size.
> >
> > On average, it's possible the table gave better results, or better
> > results for the instructions that are most commonly used on Falcon
> > when cycle accuracy is needed.
> >
> > But it didn't take the instruction/data caches into account using the
> > real logic of a 68030 (it used worst/best-case values).
> >
> > And in the end, it was too difficult to merge the new WinUAE CPU with
> > this table; there are too many differences between the two approaches.
> >
> > Also, I have the feeling the table was based on the 68020, not the
> > 68030? For example, NOP took 2 cycles in cache and 4 cycles with no
> > cache, but the 68030 doc says it's always 2 cycles (same for EXG dx,dy:
> > timings differ between cache and no cache, but that should not be the
> > case).
> >
> > All in all, I have no "fit all" solution. Keeping an old version of
> > the WinUAE CPU core was not good, as many 68000 CE behaviours are
> > handled much more cleanly/accurately in the latest WinUAE; and this
> > also fixed several games/demos for Falcon that didn't work before.
> >
> > By comparing real Falcon cycle results with the latest Hatari, we will
> > be able to spot some differences and this could give hints on how some
> > instructions should work internally to reach the correct number of
> > cycles, but until then, we will have a mix of improvements/regressions.
> >
> > Nicolas
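
The rule-based approach described in the quoted message (cycles built up from prefetch, memory access and operation cost, taking operand size into account) can be sketched roughly as follows. This is only an illustration of the idea, not WinUAE's or Hatari's code: the function names are hypothetical and every cycle value is a placeholder, not a real 68030 figure. It also shows why the same rules give different totals on a 16-bit bus than on a 32-bit one.

```python
# Hypothetical cost model. All numbers are placeholders for illustration,
# NOT real 68030 timings and NOT taken from WinUAE or Hatari.
PREFETCH_CYCLES = 2                              # fetching the opcode
ALU_CYCLES = {"move": 0, "add": 2, "bitop": 4}   # cost of the operation itself
BUS_ACCESS_CYCLES = 4                            # one access on the data bus
BUS_WIDTH_BYTES = 2                              # 16-bit bus, as on the Falcon


def bus_accesses(operand_bytes, bus_width_bytes=BUS_WIDTH_BYTES):
    """Number of bus accesses needed to transfer one operand (ceiling division)."""
    return -(-operand_bytes // bus_width_bytes)


def estimate_cycles(op, operand_bytes, memory_operands,
                    bus_width_bytes=BUS_WIDTH_BYTES):
    """Combine prefetch, operation and memory-access costs into one estimate."""
    memory_cost = (memory_operands
                   * bus_accesses(operand_bytes, bus_width_bytes)
                   * BUS_ACCESS_CYCLES)
    return PREFETCH_CYCLES + ALU_CYCLES[op] + memory_cost


# A .l add with one memory operand: two accesses on a 16-bit bus,
# a single access on a 32-bit bus.
print(estimate_cycles("add", 4, 1))                     # 16-bit bus
print(estimate_cycles("add", 4, 1, bus_width_bytes=4))  # 32-bit bus
```

A static table would instead store one precomputed number per instruction and size, which is simpler to tune against a real machine but cannot follow the cache state the way a rule-based model can.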