Re: [hatari-devel] optimizing for speed (was: Beats of Rage, new Falcon game)

Hi,

(Sorry, this is a bit off-topic, except that debugging these things
is nowadays quite nice in Hatari thanks to the next/step commands &
symbols, and they could be good test cases for the Hatari profiler.)

On Sunday 16 December 2012, Eero Tamminen wrote:
> On Sunday 16 December 2012, George Nakos wrote:
> > And even so, we're using the packer with the
> > fastest observed unpacking time, Ray's lz77. We tried packing our
> > first game with Pack-Ice, which resulted in the main game requiring
> > about 15 seconds to unpack and of course raised some criticism.
> 
> I wrote a small wrapper around minilzo:
> http://www.oberhumer.com/opensource/lzo/download/minilzo-2.06.tar.gz

Btw. if one doesn't mind using the full liblzo, the examples
included with liblzo are much better / more comprehensive than
the ones with minilzo.


> And built it with GCC 2.5.8 from sparemint using "-O2
> -fomit-frame-pointer" (-O3 didn't give better speed).
> 
> For "maxima.bin" from Laurent's game, it compressed the 1076 kB
> file to 600 kB when using a 64 kB buffer with the lower LZO
> compression level that minilzo supports.
> 
> Under Hatari TT emulation, uncompressing that from one file to another
> file (over Hatari GEMDOS HD emulation) took 2.6s.  Hatari disk overhead
> seems to be really small compared to compression effort with 64kB buffer,
> so I do conversions to disk, to be able to verify them.
> 
> Using the "safe" LZO uncompress function, which does extra buffer
> overflow checks, the time grew to 3.3s, a 26% increase, whereas
> running the uncompressing under ST emulation multiplied the time
> by 4 compared to TT emulation.
> 
> 
> Compiling minilzop with VBCC using -O3 gives a 2.34s decompression
> time (0.46MB/sec), but compression time with it is 17s instead of
> the 9s I got with the gcc v2.8.5 built binary.  It would be
> interesting if somebody would build the code with the latest GCC
> from Vincent, using -O3, and see whether it improves.
> 
> 
> Ray's lz77.c [1] compresses that file better, to 532kB, but compression
> is *really* slow and decompression takes 11.1s, which is >4x slower than
> the LZO C-version.
> 
> [1] Code from here, from which I have removed screen output and to
> 	which I have added timing code:
> 		http://files.dhs.nu/files_coding/lz77.zip
> 
> The VBCC -O3 compiled version is much slower at decompression, 25s,
> but I think that is mostly because lz77.c uses putc() & getc() and
> the VBCC stdlib is probably doing less (or no) buffering on FILE*
> operations than what MiNTlib used by GCC does.
> 
> Is there some better version of lz77 C-code which doesn't do separate
> file IO on each byte, but would be a bit more sensible and buffer data
> for file operations?

I changed it to buffer both input and output files completely in memory
on decompression and got the time down to 7.4s.  By leaving out buffer
overrun checks, it went down to 6.5s.  VBCC -O3 gave 7.8s for the latter,
so VBCC file operations were indeed worse than MiNTlib's.
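
A minimal sketch of what such whole-file buffering can look like, using
only standard C.  This is not the actual modified lz77.c; the names
load_file(), save_file(), mem_reader and mem_getc() are hypothetical
and only illustrate replacing per-byte getc()/putc() calls with one
fread()/fwrite() plus plain memory accesses:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: load a whole file into a malloc'ed buffer.
 * Returns NULL on failure; on success *size holds the byte count. */
static unsigned char *load_file(const char *path, size_t *size)
{
    FILE *fp;
    long len;
    unsigned char *buf;

    assert(path != NULL && size != NULL);
    fp = fopen(path, "rb");
    if (!fp)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) != 0 || (len = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);
    buf = malloc(len > 0 ? (size_t)len : 1);
    if (buf && fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    fclose(fp);
    if (buf)
        *size = (size_t)len;
    return buf;
}

/* Hypothetical helper: write a whole buffer with a single fwrite().
 * Returns 0 on success, -1 on failure. */
static int save_file(const char *path, const unsigned char *buf, size_t size)
{
    FILE *fp;
    size_t written;

    fp = fopen(path, "wb");
    if (!fp)
        return -1;
    written = fwrite(buf, 1, size, fp);
    if (fclose(fp) != 0 || written != size)
        return -1;
    return 0;
}

/* Memory-backed stand-in for getc(): no library call per byte. */
typedef struct {
    const unsigned char *pos;
    const unsigned char *end;
} mem_reader;

static int mem_getc(mem_reader *r)
{
    return (r->pos < r->end) ? *r->pos++ : EOF;
}
```

With helpers like these, the decompression loop reads through
mem_getc() and stores into an output array, and the result is written
out with one save_file() call at the end.  The cost is that both files
must fit in memory at the same time.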

LZO C-based decompression is still more than twice as fast as this
LZ77 C-source.
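
The timing code itself is not shown in this mail.  As a rough sketch
only, measurements like the ones above could be taken with the standard
clock() function; time_seconds(), throughput_kb_per_s() and demo_work()
are hypothetical names.  Note that clock() reports processor time, which
can differ from wall-clock time on a multitasking system:

```c
#include <time.h>

/* Any function that does the work to be measured. */
typedef void (*work_fn)(void *arg);

/* Run fn(arg) once and return the processor time it took, in seconds. */
static double time_seconds(work_fn fn, void *arg)
{
    clock_t start = clock();

    fn(arg);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Throughput in kB/s; returns 0 when the time is too small to measure. */
static double throughput_kb_per_s(double kbytes, double seconds)
{
    return (seconds > 0.0) ? kbytes / seconds : 0.0;
}

/* Stand-in workload for illustration: sums 0..999 into *arg. */
static void demo_work(void *arg)
{
    unsigned long *acc = arg;
    unsigned long i;

    for (i = 0; i < 1000; i++)
        *acc += i;
}
```

For example, throughput_kb_per_s(1076.0, 2.34) gives roughly 460 kB/s,
which matches the 0.46MB/sec figure quoted for the VBCC build when
1 MB is counted as 1000 kB.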


> What kind of speeds does the lz77 uncompression assembly version get?

That's quite a bit faster, 1.2s.  That's about twice as fast as the
C-based LZO uncompression (compared against the unsafe version, as
I don't think the asm version does any checks).
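
To illustrate what the "safe" vs. "unsafe" difference amounts to, here
is a toy decoder with switchable overrun checks.  The two-byte token
format is invented for this sketch; it is neither Ray's lz77 format
nor LZO, and toy_decode() is a hypothetical name:

```c
#include <stddef.h>

/* Toy format (an assumption for illustration only):
 *   0x00, b    -> emit literal byte b
 *   len, dist  -> copy len (1..255) bytes starting dist bytes back
 * Returns the decoded size.  With checked != 0, returns -1 on truncated
 * input, a bad back-reference, or output overflow.  With checked == 0,
 * the input is trusted completely and bad data corrupts memory. */
static long toy_decode(const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap, int checked)
{
    size_t ip = 0, op = 0;

    while (ip < in_len) {
        unsigned len, dist;

        if (checked && in_len - ip < 2)
            return -1;                      /* truncated token */
        len = in[ip];
        dist = in[ip + 1];
        ip += 2;

        if (len == 0) {
            if (checked && op >= out_cap)
                return -1;                  /* output buffer full */
            out[op++] = (unsigned char)dist;
        } else {
            if (checked &&
                (dist == 0 || dist > op || len > out_cap - op))
                return -1;                  /* bad reference or overflow */
            while (len--) {                 /* byte-wise: overlap is OK */
                out[op] = out[op - dist];
                op++;
            }
        }
    }
    return (long)op;
}
```

The checks are a handful of extra compares per token, which is the
kind of overhead behind the safe/unsafe time differences reported
above.  Leaving them out is only reasonable when the packed data is
known to be intact, e.g. data shipped with the program itself.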

Note that the Lz77 ASM code is understood only by Vasm (from VBCC)
& Devpac, as it uses some Devpac macros (which are supported by Vasm).
It's not compatible with Gas (from GCC) or the AHCC assembler.


> (The reason why I started testing C-versions was that I don't
> have an m68k assembly optimized version of LZO uncompression.)
> 
> PS. If somebody's interested in my C-source modifications, I can
> mail them.


	- Eero


