Re: [hatari-devel] Structures in Hatari according to pahole



On Sunday 16 June 2013, Nicolas Pomarède wrote:
> I'm quite skeptical this would really yield a perceptible gain in
> performance. Today's CPUs have L2 and L3 cache too, so even if there's a
> cache miss at L1, it will certainly be in L2 (if we consider dsp and cpu
> are a big part of the emulation, their data structure is certainly in L2
> or L3). CPUs will also do instruction reordering, so maybe the data is
> not in L1, but if instr are reordered, the cost of accessing L2 or L3
> won't matter that much.
> Also, assuming important members are defined in the structure as 32
> consecutive bytes, how do we know the data will really be on a 32 byte
> boundary ? We could have 12 bytes before and 20 bytes after, there's
> nothing in standard C that guarantees the data ends up at a
> specific location once the program starts (or you have to malloc memory
> yourself if you want that).

Good point about it being statically allocated data.  For that, C
guarantees struct alignment only to the strictest alignment of its
members (e.g. 8 on ARM, if I remember right).
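
To illustrate the alignment point, here is a minimal C11 sketch.  The
struct and function names are made up for the example (nothing here is
from the Hatari sources), and it assumes a compiler that supports
32-byte alignment for static objects, as GCC and Clang normally do:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hot emulation structure: 32 bytes of members. */
struct hot_regs {
    uint32_t pc;
    uint32_t sr;
    uint32_t d[6];
};

/* Default placement: only alignof(struct hot_regs) is guaranteed. */
static struct hot_regs plain_regs;

/* C11: explicitly request a 32-byte boundary for this object. */
static alignas(32) struct hot_regs aligned_regs;

/* Alignment the compiler guarantees for the type without any request. */
size_t hot_regs_natural_alignment(void)
{
    return alignof(struct hot_regs);
}

/* Where the default-placed object starts within a 32-byte block (0..31). */
unsigned plain_regs_offset_in_32_byte_line(void)
{
    return (unsigned)((uintptr_t)&plain_regs % 32);
}

/* Non-zero if the explicitly aligned object starts on a 32-byte boundary. */
int aligned_regs_on_32_byte_boundary(void)
{
    return ((uintptr_t)&aligned_regs % 32) == 0;
}
```

Without the alignas() request, the offset of plain_regs within a 32-byte
block is up to the linker, which is the "12 bytes before and 20 bytes
after" case mentioned above.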

> I think this is a job for the compiler ; there could be an option in the
> compiler that allows it to rearrange a structure's members in the best
> way for the CPU architecture,

The C standard disallows the compiler from reordering structure members.

This is for legacy reasons (starting from RPC code from the 80's): there's
a huge amount of code that assumes members in a structure are in the given
order, so that one can e.g. memcpy/memset just part of a structure.
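
A small sketch of the kind of code that relies on that ordering.  The
struct and function names are hypothetical, just to show the pattern:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state; the declaration order is part of the contract. */
struct chip_state {
    int  config;       /* must survive a soft reset */
    int  counter;      /* everything from here to the end is cleared */
    char buffer[16];
    int  status;
};

/* Clear only the tail of the structure, starting at 'counter'.
 * This works because C keeps members in declaration order, so every
 * member declared after 'counter' lies at a higher offset. */
void chip_soft_reset(struct chip_state *s)
{
    size_t start = offsetof(struct chip_state, counter);

    memset((char *)s + start, 0, sizeof(*s) - start);
}
```

If the compiler were free to move 'config' after 'counter', the memset
above would silently wipe it.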

> but handling this ourselves by moving structure's members on
> the assumption of a possible cache design doesn't seem useful to me

The rule is really simple, keep members at their natural
alignment from the structure start.  One might also consider
changing type sizes if they were unnecessarily large, but
that might also affect performance.
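
As a rough sketch of what that rule buys (made-up structs, not ones from
Hatari), compare the same four members in two orders.  The offsets in
the comments assume a typical ABI where uint32_t is 4-byte aligned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in "random" order: the compiler has to insert padding. */
struct scattered {
    char     a;   /* offset 0, then 3 bytes of padding */
    uint32_t b;   /* offset 4 */
    char     c;   /* offset 8, then 1 byte of padding */
    uint16_t d;   /* offset 10; total size 12 */
};

/* Same members, largest alignment first: no holes are needed. */
struct reordered {
    uint32_t b;   /* offset 0 */
    uint16_t d;   /* offset 4 */
    char     a;   /* offset 6 */
    char     c;   /* offset 7; total size 8 */
};

/* Sum of the member sizes, identical for both layouts. */
#define MEMBER_BYTES (sizeof(uint32_t) + sizeof(uint16_t) + 2 * sizeof(char))

/* Bytes lost to padding in each layout (what pahole reports as holes). */
size_t padding_in_scattered(void)
{
    return sizeof(struct scattered) - MEMBER_BYTES;
}

size_t padding_in_reordered(void)
{
    return sizeof(struct reordered) - MEMBER_BYTES;
}
```

The exact numbers depend on the ABI, so it is better to check the real
structures with pahole or sizeof/offsetof than to trust the comments.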

> (I'd rather have structure members grouped by their
> logical meaning, to keep the code understandable and maintainable)

That's typically the best, as they're normally accessed together...

Anyway, before doing changes, one should first have some profiling
data which tells that there actually is a trouble spot.  My mail
was more FYI. :-)

> I agree this could help on old cpu (like the 68020/68040) where
> data/instr caches are so small that you need to fill them in the most
> optimal way, but even the cpu of a cellphone now has more cache than the
> ST had of total ram :)
> > Besides the potential DSP core changes, this could have the largest
> > effect if Hatari actually used the HD6301 emulation (that seems to be
> > built-in, although I thought it was disabled as it's unused)...
> hd6301 is disabled and unused (and also not complete in fact :) So it
> couldn't be used anyway to run the IKBD's ROM at this point)

Why is it compiled into the Hatari binary?

There are a lot of symbols from hd6301 code in Hatari:
$ readelf -s src/hatari |grep hd6301 |wc -l

$ readelf -s src/hatari.winuae |grep hd6301 | awk '
	BEGIN{size=0} {size+=$3} END{print size/1024, "kB"}'
44.0518 kB

	- Eero
