Re: [hatari-devel] dsp profiler mods



Hi,

> a: most recent cycle count for this instruction including any EXT: memory
> penalties
> b: most recent EXT: memory penalty for this instruction

I'll keep in profiler disassembly output only things that are about
the whole profiling run.

Yes, in fact this one was only added because the field became available, so it's not particularly important. I realize it is erratic for disassembly-dump purposes; it's just there for tracing in case I want it sometime.


> c: percentage of relative use for whole session (addr.count/all_count)
> d: percentage of relative cost for whole session (addr.cycles/all_cycles)
> e: total use for whole session (addr.count)
> f: average cycle count actually encountered at this address, including
> EXT: memory penalties, for whole session (addr.cycles/count)
> g: average EXT: memory penalty encountered at this address, for whole
> session (addr.extmem_cycles/count)
> h: cycle variance (addr.cycle_diff)
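
For reference, columns c to h can be derived from the per-address counters roughly as below. This is only a sketch of the arithmetic described in the list, not the actual profiler code: the `AddrStats` class and the `columns` helper are hypothetical names, and only the field names (count, cycles, extmem_cycles, cycle_diff) and the totals (all_count, all_cycles) come from the list above.

```python
from dataclasses import dataclass


@dataclass
class AddrStats:
    """Per-address counters; field names follow the list above."""
    count: int          # times the instruction at this address executed
    cycles: int         # total cycles spent here, EXT: penalties included
    extmem_cycles: int  # total EXT: memory penalty cycles
    cycle_diff: int     # cycle variance value, passed through unchanged


def columns(addr, all_count, all_cycles):
    """Return columns c..h for one address (hypothetical helper)."""
    per_exec = addr.count if addr.count else 1  # avoid dividing by zero
    return {
        "c": 100.0 * addr.count / all_count if all_count else 0.0,
        "d": 100.0 * addr.cycles / all_cycles if all_cycles else 0.0,
        "e": addr.count,
        "f": addr.cycles / per_exec,
        "g": addr.extmem_cycles / per_exec,
        "h": addr.cycle_diff,
    }
```

Columns c and d need the session-wide totals, so they can only be filled in once the whole profiling run has finished; e to h depend on the single address alone.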

Is the last one max difference like the Hatari code in repository shows,

Yes, the last one is unchanged; it's exactly as you left it.

And which one would be more useful?

I think it is just fine as it is, because the average EXT penalty column (g) indicates the overall impact, and you can infer most of what is interesting from the two values. Knowing the exact variance might be interesting, but it is probably more detail than required for most purposes. Knowing that an address 'wobbles' by 2 or 4 cycles, and that the wobble costs - for example - 0.35 cycles per execution on average, is already quite useful for guiding changes.
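
To make the 0.35 figure concrete, here is one way such an average could come about. The execution counts are made up purely for illustration, and `average_penalty` is a hypothetical helper, not something in the profiler.

```python
def average_penalty(penalty_counts, executions):
    """Average EXT: penalty per execution at one address.

    penalty_counts maps a penalty size in cycles to the number of
    executions that paid it; executions is the total execution count.
    """
    total = sum(size * hits for size, hits in penalty_counts.items())
    return total / executions


# Hypothetical address: 1000 executions, 175 of which paid a 2-cycle penalty.
avg = average_penalty({2: 175}, executions=1000)
print(f"{avg:.2f}")  # prints 0.35
```

So a 2-cycle wobble with a 0.35 average suggests the penalty is hit on a minority of executions, which is the kind of inference the two columns allow without knowing the exact variance.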

I'm not going to report two percentage values, but after both CPU & DSP
cycles values have been proven to be fairly correct, I can change percentage
for both to be about cycles, not about instruction counts.

That seems reasonable. It is mainly the relative cost (cycles) I look at here.

However, I'm not going to add/use averages as that will ruin post-
processing.  It would be e.g. sum of all EXT: access costs
(all_cycles - ext_cycles).

I think exporting sums is fine, since the averages can be derived from them. I just report them as averages because I'm not post-processing anything before viewing, and it's convenient for me just now.
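
A small sketch of why sums are the safer thing to export: they can be added up over any group of addresses (a function, say) and divided afterwards, whereas per-address averages cannot be recombined correctly without the counts. The helper names and the sample numbers here are hypothetical.

```python
def function_average(addresses):
    """Average EXT: penalty per executed instruction over several addresses.

    addresses is a list of (count, extmem_cycles) sums, one pair per address.
    """
    total_count = sum(count for count, _ in addresses)
    total_ext = sum(ext for _, ext in addresses)
    return total_ext / total_count if total_count else 0.0


def mean_of_averages(addresses):
    """What is left if only per-address averages are available."""
    avgs = [ext / count for count, ext in addresses if count]
    return sum(avgs) / len(avgs) if avgs else 0.0


# Hypothetical function body: one rarely-run address with a 4-cycle penalty,
# one hot address with no penalty at all.
addrs = [(1, 4), (99, 0)]
print(function_average(addrs))   # 0.04
print(mean_of_averages(addrs))   # 2.0
```

The second number overstates the cost by a wide margin because it ignores how often each address ran.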
 
Main reason why I don't use explanations for the values is that the output
is already really wide/verbose, and I expect profiler columns to be fixed
by the time we do next Hatari release, and not change anymore.

Yes understood.

> It required some changes around the dsp core and profiler to record the
> extmem_cycles info, but it's not very intrusive.

Could you post your patch for that?

Yes I'll post it later.
 
Have you looked at the collected function level values at all?

Only for the CPU, not for the DSP. (I will take a look at it soon using the original Hatari code, but I was looking for specific things in the short term, which prompted my changes.)
 
I think they're quite useful in pointing out code addresses that one
should look closer in the profiler disassembly output (and what functions
one should check in the callgraph, if that's a complex one).

One can miss things when looking just at the disassembly output as there
can be quite a bit of it. :-)

Yes, that is true, although the balance of functions/depth vs instruction-level costs is quite different on the two processors. In my case there aren't very many functions to look at (maybe 20) and the callgraph is almost flat (max depth of 2 or 3, most often a depth of 1).
 
Typically DSP code is either an interrupt service, or a command service - both of which wait for an event, do some work then return to idle state. There isn't much high level control flow going on - if there is any control flow it tends to be conditionals inside an algorithm or other processing block. Of course this isn't always the case, but it most often is.

Most DSP-side performance issues come down to either a HOST port bottleneck (or, more interestingly on the DSP side, a lack of one) or unexpected penalties due to where things are located.

D.




