Re: [hatari-devel] New option --disable-video and --benchmark



On 26/02/2017 at 13:37, Eero Tamminen wrote:
Hi,

On 02/26/2017 12:59 AM, Nicolas Pomarède wrote:
On 25/02/2017 at 23:34, Eero Tamminen wrote:
not necessarily, some music disks will just start without any user
interaction, so video is not needed if you know you just need to press
the up/down arrow later to change song, for example.

Ok, but one needs to know the software pretty well
beforehand.  Sounds a bit of a corner case. I'm still
in favor of dropping the <bool> and not having
it in config...

I mean, if it doesn't give clear performance advantage over
existing options, video output being disabled is just a huge
disadvantage.  That makes it *really* marginal use-case.

To validate the --disable-video performance advantage, have
you compared performance e.g. with following options:
    --borders off --frameskips 4 -z 1 --spec512 0
?


Hi

see the values I posted earlier: disabling the SDL calls *does* impact performance. I go from 200 VBL/s to 275 VBL/s, which is a clear performance advantage. Also, I don't want to reduce accuracy by removing border emulation or disabling spec512 mode; that's not my goal.

I want to compare different rendering methods for CPU/video when all parameters are set to the max (CE mode for CPU, full shifter emulation and so on). So disabling borders or spec512, or using zoom x1 instead of x2, is not an option.



<aside>
When you analyze CPU usage, there are a few things that you need
to take into account:
....
</aside>

I know all this, and I know Linux well enough to repeat successive runs of Hatari in similar conditions from an OS point of view, so I can tell that the numbers I get in the end are comparable.


I proceeded to measure Hatari CPU usage for GEM desktop idling,
with borders disabled (and fast-forward obviously disabled), for
following setups:
* SDL_VIDEODRIVER=dummy
* --disable-video on
* --frameskips 0
* --frameskips 4

And while there was ~0.5% variation from run-to-run, there
was no real difference in Hatari CPU usage between above
options, with SDL2 build of Hatari.

As to VBL counts with:
  --benchmark --run-vbls 2400 --machine st --tos tos104.img --borders off --fast-forward on

I do get measurable differences:
* 474 VBL/s --disable-video
* 460 VBL/s --disable-video --frameskips 0
* 465 VBL/s SDL_VIDEODRIVER=dummy --frameskips 0
* 426 VBL/s --frameskips 0 --borders on
* 444 VBL/s --frameskips 0
* 468 VBL/s --frameskips 4
* 474 VBL/s --frameskips 4 -z 1
* 477 VBL/s SDL_VIDEODRIVER=dummy --frameskips 4 -z 1

-> --disable-video doesn't provide any performance
   advantage over previously available options and if
   user has disabled the default auto frame-skipping,
   it's actually slower than enabling frame-skipping.
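
To make the spread in the list above easier to read, the figures can be expressed relative to one of the runs. The following is only a sketch: the numbers are copied from the list, while the choice of `--frameskips 0` as the baseline and the helper name `relative_to` are assumptions made for illustration.

```python
# VBL/s figures copied from the list above.
results = {
    "--disable-video": 474,
    "--disable-video --frameskips 0": 460,
    "SDL_VIDEODRIVER=dummy --frameskips 0": 465,
    "--frameskips 0 --borders on": 426,
    "--frameskips 0": 444,
    "--frameskips 4": 468,
    "--frameskips 4 -z 1": 474,
    "SDL_VIDEODRIVER=dummy --frameskips 4 -z 1": 477,
}

# Assumption: use the plain "--frameskips 0" run as the reference point.
BASELINE = "--frameskips 0"


def relative_to(baseline, table):
    """Return each entry's difference from the baseline, in percent."""
    base = table[baseline]
    return {name: (value - base) / base * 100.0 for name, value in table.items()}


deltas = relative_to(BASELINE, results)

# Print the fastest configuration first.
for name, pct in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{results[name]:4d} VBL/s  {pct:+5.1f}%  {name}")
```

With this baseline, every configuration in the list lands within roughly 8% of the reference on this setup.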


Benchmarking on an idle GEM desktop doesn't look like the most accurate way to measure all aspects of emulation. I run my tests on demos where the screen changes completely on each frame, so all lines are converted/rendered on each frame.


With SDL1 Hatari build (and EmuTOS), the CPU usage results were
quite unexpected:
* 8.5% SDL_VIDEODRIVER=dummy --frameskips 0
* 7.5% --disable-video on --frameskips 0
* 7.0% --frameskips 0
* 6.8% --frameskips 4

-> Disabling video completely gives worse results, which may be
some kind of kernel power management anomaly, related to whether
things are interactive (poll something), or not.

As to system CPU usage, all the differences were within variation
(which was several percent), so it was impossible to draw any
conclusions on that.

SDL1 is not my target for benchmarking. If performance is worse or counter-intuitive when disabling SDL calls under SDL1, then I don't know why; I'm using SDL2 anyway.


Conclusions:

* At least on my setup, --disable-video brings no performance
  advantages, just huge usability disadvantage. Better option is
  to use options already mentioned in Hatari manual performance
  section and use EmuTOS

* Surprisingly, video updates don't seem to have much impact on
  CPU usage, unlike ~10 years ago on Nokia's mobile ARM devices,
  which were the reason why I originally added frameskip support.
  I guess things are better optimized on SDL & HW side nowadays,
  and Hatari's CPU emulation takes larger proportion of CPU than
  decade ago

My metric is not CPU usage at normal execution speed, because it's very difficult to tell the difference between 10% and 15% CPU usage. That's why I run the benchmark at full CPU speed (no VBL wait on the host side): in the end you get a number of VBL/s that is more accurate for comparing things (i.e. make the host CPU run as fast as possible for emulation, then compare the time it took to generate the same number of frames).
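
As a rough illustration of this metric, a fixed number of VBLs can be timed from the outside and turned into a VBL/s figure. This is a minimal sketch and not Hatari's own measurement: it assumes a `hatari` binary on the PATH that exits once `--run-vbls` is reached, the helper names are hypothetical, and wall-clock timing from outside the process also includes startup time.

```python
import subprocess
import time


def build_cmd(vbls, extra_opts=()):
    """Command line for a fixed-length, full-speed run (options taken from this thread)."""
    return ["hatari", "--benchmark", "--run-vbls", str(vbls), *extra_opts]


def vbl_per_sec(vbls, seconds):
    """Emulated frames (VBLs) per second of host wall-clock time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return vbls / seconds


def timed_run(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Example (not executed here, since it needs Hatari and a TOS image):
#   elapsed = timed_run(build_cmd(2400, ["--disable-video", "on"]))
#   print(f"{vbl_per_sec(2400, elapsed):.0f} VBL/s")
```

Because every run emulates the same number of frames, comparing VBL/s is equivalent to comparing total run times, which is easier to distinguish than a few percent of CPU usage.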



disabling video can make hatari similar to running sc68 (which emulates
only audio, not video), so you can just start hatari with an sndh player
and a sound file, no need for video or interaction with Hatari

According to its www-site:
    http://sc68.atari.org/project.html

It's a plugin to GUI music players, so usability & usage wise
it's IMHO not comparable.

sc68 is not a plugin; it's a full CPU/YM2149/MFP emulator that plays music files and is usable from the CLI. Some plugins exist for WinAMP or some Linux players, but they use the core sc68 library. sc68 can be used without any GUI.

Once again, if you run Hatari without any UI, then you lose all keyboard input. With the UI and --disable-video, you could run an sndh player and change subsong with the corresponding key. Or you can run the B.I.G. demo and just press up/down/enter to change music.


I'd like to have all keyboard shortcuts while Hatari is running; having
just the window created at start but not refreshed later gives a quick
solution to this: shortcuts still work, but SDL calls like Copy/Render
don't interfere with the benchmarking.

Having benchmarks that require manual intervention (besides
interrupting them if there's some problem) really sounds like
something that should be fixed before doing benchmarking.

I didn't say it required it, just that I want to be able to press F12 during the benchmark if needed or press pause for example.

It can replace the redundant --disable-video. :-)

No, it can't. The dummy driver clearly doesn't work in my case. Maybe we have different HW setups, or maybe it depends on the renderer used by SDL2 (GL, SW, ...), but in my case the gain from the dummy driver is marginal (5-10%), while disabling SDL calls gives 35%.

On another, more powerful PC I have, running the same demo with --benchmark and switching --disable-video from 0 to 1 goes from 950 VBL/s to 1600 VBL/s, so nearly a 70% increase from disabling the SDL calls for Copy/Render.
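
The relative gains can be recomputed from the raw VBL/s figures quoted in this mail. The snippet below is only a sketch of that arithmetic; the labels for the two machines and the helper name `gain_percent` are made up for illustration.

```python
def gain_percent(before, after):
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


# (VBL/s without --disable-video, VBL/s with --disable-video), as quoted in this mail.
runs = {
    "first PC": (200, 275),
    "more powerful PC": (950, 1600),
}

for name, (before, after) in runs.items():
    print(f"{name}: {before} -> {after} VBL/s ({gain_percent(before, after):+.1f}%)")
# first PC: 200 -> 275 VBL/s (+37.5%)
# more powerful PC: 950 -> 1600 VBL/s (+68.4%)
```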

To summarize:
* --disable-video solves my problem of disabling as many SDL/OS calls as possible, which driver=dummy doesn't do (at least on the several PCs I have access to).
* --benchmark does the same as fast-forward (i.e. it doesn't wait for VBL on the host side) *for now*, but since I might add different behaviour later, I'd rather have a separate option from the beginning (and this way, if I break --benchmark, at least it won't break fast-forward).

Nicolas






