Re: [hatari-devel] DSP performance
On 26/06/2015 15:03, Douglas Little wrote:
Hi,
Maybe the timings were right before in CE mode by luck, when data caching
was not enabled, and now that it's enabled they're not good anymore?
If you compile your own Hatari, could you try to force a "return
false" in the function cancache030(), around line 7607 of cpu/newcpu.c?
Do you then get better values for DSP speed?
Ok I can give it a try. I was only able to build SDL1 versions in the
past under Cygwin so hopefully that is still possible?
Yes, it should; I tested it a few days ago.
However I suspect d-cache changes will have no meaningful impact, based
on what I can see so far.
- the code which waits on DSP in the first test case (the game) is a
host-port status spinloop. The cost for these spin instructions was
never accurate vs real HW, and the new timing hasn't changed much from
what I can see. Not by 70% for sure. It's within 10% of previous versions.
Note that the MFP works the same way as the DSP: if the 68030 CPU cycles
are not correct, then the duration of an MFP timer (if you convert it into
a number of milliseconds) will not be correct either.
So you can't have a reference delay by any means in the emulated machine
if some cycle counts differ too much from real HW.
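To make the point about reference delays concrete, here is a minimal sketch in Python. It is not Hatari code: the function names are made up for illustration, the 16 MHz CPU clock and the per-iteration cycle costs are assumed values, and only the 2.4576 MHz MFP clock is a known hardware figure. It shows that for a fixed MFP timer period, the amount of CPU work that fits inside the period depends directly on the cycle cost charged to each instruction.

```python
# Illustrative sketch only; not taken from Hatari.
MFP_CLOCK_HZ = 2_457_600      # MFP 68901 input clock on Atari machines
CPU_CLOCK_HZ = 16_000_000     # assumption: nominal Falcon 68030 clock


def mfp_timer_period_ms(prescaler, data):
    """Period of an MFP timer in delay mode, in milliseconds."""
    return 1000.0 * prescaler * data / MFP_CLOCK_HZ


def cpu_iterations_in_period(period_ms, cpu_hz, cycles_per_iteration):
    """Loop iterations the CPU completes during one timer period."""
    cpu_cycles = period_ms / 1000.0 * cpu_hz
    return int(cpu_cycles // cycles_per_iteration)


if __name__ == "__main__":
    period = mfp_timer_period_ms(200, 192)
    # Hypothetical per-iteration costs: 10 cycles on real HW, 12 when emulated.
    real = cpu_iterations_in_period(period, CPU_CLOCK_HZ, 10)
    emulated = cpu_iterations_in_period(period, CPU_CLOCK_HZ, 12)
    print(period, real, emulated)
```

With these assumed numbers the timer period is identical in both cases, but the loop runs fewer iterations when each one is charged more cycles, so a program that calibrates itself against the timer sees a different machine.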
- The code which waits on the DSP in the second test case (DSPBENCH) is
based on MFP events, i.e. the waiting time is dictated by something
other than the CPU. If the CPU cycle costs have increased, it will just
execute fewer CPU cycles during the test. I *think* this is why DSPBENCH
reported correct results previously (IIRC to within a decimal point)
even if the CPU timings were never perfect.
- The performance gain measured on the DSP side should vary a lot
depending on the CPU side instructions which are running concurrently. I
don't see that happening - it's pretty much fixed (maybe some variation,
I'm not sure - but it seems to remain close to 70% when calculating back).
- The host port status/data registers (which are accessed in the spinloop,
while timing the DSP) are not data-cacheable. They are volatile-mapped
HW memory. If they were cacheable, the software would lose coherency with
the HW and quickly crash. I can't be sure that introducing the d-cache
support is unrelated, but in real terms disabling the cache should have
no effect on that test.
So taking these into account, I believe the change has something to do
with DSP clocks relative to the MFP or the master timer - and not in
relation to the CPU at all. There are too many clues beginning to point
there I think. The MFP-based timing seems the most concrete of those.
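The reasoning in the points above can be sketched as a toy model in Python. Everything here is hypothetical: the function names, the 16 MHz and 32 MHz clocks, the per-spin cycle costs and the 1.7x DSP scale factor are assumptions chosen for illustration, not measurements from Hatari or from the tests discussed. The sketch only shows the structural difference: a spinloop count contains a CPU cost term, while a DSP count over an MFP-timed window does not.

```python
# Toy model only; all constants are assumptions for illustration.
CPU_HZ = 16_000_000   # assumed nominal 68030 clock
DSP_HZ = 32_000_000   # assumed nominal DSP clock


def spinloop_iterations(dsp_work_cycles, dsp_hz, cpu_hz, cpu_cycles_per_spin):
    """Host-port spinloop: the count depends on the CPU cost of one spin."""
    return dsp_work_cycles * cpu_hz // (dsp_hz * cpu_cycles_per_spin)


def dsp_cycles_in_mfp_window(window_us, dsp_hz):
    """MFP-timed window: the DSP cycle count has no CPU term at all."""
    return window_us * dsp_hz // 1_000_000


if __name__ == "__main__":
    # Changing the per-spin CPU cost changes the spinloop count...
    print(spinloop_iterations(320_000, DSP_HZ, CPU_HZ, 10))
    print(spinloop_iterations(320_000, DSP_HZ, CPU_HZ, 12))
    # ...but an MFP-window result only moves if the DSP clock moves
    # relative to the MFP (hypothetical 1.7x scale shown here).
    print(dsp_cycles_in_mfp_window(15_625, DSP_HZ))
    print(dsp_cycles_in_mfp_window(15_625, DSP_HZ * 17 // 10))
```

In this model a roughly constant shift in an MFP-windowed result cannot come from CPU instruction costs, which is why a DSP-to-MFP or DSP-to-master-clock ratio looks like the more likely place to search.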
There have been no changes to the MFP code lately. Many STF demos rely on
precise MFP timings to remove the top border, and they still work. If
something was broken with the MFP for the 68030, it would affect 68000
mode too.
No idea so far :( Let's see what you get when compiling the suggested
change.