On 02/14/2018 12:56 AM, Thorsten Otto wrote:
On Tuesday, 13 February 2018 23:46:27 CET Nicolas Pomarède wrote:
So, during this same second, your host PC will need to emulate twice as
many instructions, which takes more work.
I still don't understand why Hatari is so slow in that emulation. For
comparison, Aranym uses a similar CPU core for the emulation (yes, I
know it's a much older version, maybe more comparable to Hatari's old
UAE CPU core).
The closest comparison point would be Hatari's old UAE CPU core,
before WinUAE CPU core support was added to Hatari.
In comparison to Hatari, it does not do any cycle counting, so
basically it is running in fast-forward mode all the time. In this
mode, on an i7 CPU, without JIT, it is able to run about 10 times as
fast as a 32 MHz Falcon. So essentially, only 10% of the CPU power is
needed to achieve the speed of 32 MHz. I cannot believe that Hatari
needs 90% of the host's CPU power to count cycles.
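The arithmetic behind that estimate can be written out as a small
sketch. The function name is made up for illustration, and it assumes
emulation speed scales linearly with host CPU time, which is a
simplification.

```python
def host_share_for_realtime(speedup_over_target: float) -> float:
    """Fraction of one host core needed to reach real-time speed, given
    how many times faster than the target the emulator runs unthrottled.
    Hypothetical helper; assumes speed scales linearly with CPU time."""
    if speedup_over_target <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup_over_target


# Aranym figure quoted above: about 10x a 32 MHz Falcon, without JIT.
share = host_share_for_realtime(10.0)
print(f"{share:.0%} of the host core needed, {1 - share:.0%} left over")
```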
Well, based on the Callgrind callgraph I mailed, 30% of executed
instructions go just to emulating cache prefill (which happens
approximately every other emulated 16 MHz 030 instruction).
And it's not just "counting" cycles. It's a design change.
Everything needs to be split into smaller parts so that different
parts of the system can be synched more frequently, for accuracy.
Clock counting & interrupt handling need to be changed for that etc.
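As a rough illustration of why finer-grained synching costs more, here
is a toy model. It is not Hatari or Aranym code; the Device class, the
batch size and the fixed 4-cycle instructions are all assumptions made
up for the example. Both loops deliver the same number of cycles to
the device, but the fine-grained one calls into it far more often.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical peripheral that must be advanced along with the CPU."""
    cycles_seen: int = 0
    sync_calls: int = 0

    def sync(self, cycles: int) -> None:
        self.cycles_seen += cycles
        self.sync_calls += 1


def run_coarse(instr_cycles, device, batch=512):
    """Advance the device once per batch of instructions (fast, inexact)."""
    pending = 0
    for count, cycles in enumerate(instr_cycles, 1):
        pending += cycles
        if count % batch == 0:
            device.sync(pending)
            pending = 0
    if pending:
        device.sync(pending)  # flush the last partial batch


def run_fine(instr_cycles, device):
    """Advance the device after every instruction (slower, more exact)."""
    for cycles in instr_cycles:
        device.sync(cycles)


program = [4] * 10_000  # assumed: 10000 instructions of 4 cycles each
coarse, fine = Device(), Device()
run_coarse(program, coarse)
run_fine(program, fine)
print(coarse.sync_calls, fine.sync_calls)
```

In a real emulator each sync would also check for pending interrupts
and update several devices, so the per-call cost adds up quickly.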
(In more lightweight ST/STE emulation, emulating all the interrupts
takes a noticeable amount of CPU, *even* with patched Timer-D option,
if one is e.g. playing music inside the emulation.)
I think Aranym's DSP emulation runs in a separate thread, but it's
lacking so many features & so much cycle accuracy that it's not
compatible with any Falcon DSP SW that I know of (although some
non-DSP Falcon SW does work with Aranym).
Aranym emulates the 040 instead of the 030. The 040 is in some
respects a simpler CPU than the 030.
Then there are all the HW quirks that need to be emulated.
Just take a look at the video.c...
- Eero