Re: [hatari-devel] Hatari profiler updates and CPU cycle questions

[ Thread Index | Date Index | More lists.tuxfamily.org/hatari-devel Archives ]


Hi,

On torstai 31 tammikuu 2013, Douglas Little wrote:
> The good news is - the DSP profiler output looks much better and symbol
> table conversion and import is working properly. Things do appear to map
> back to the source.

Great!

FYI: I just updated profile.c and post-processor so that latter
automatically identifies DSP profiles, "-d" option isn't anymore
needed or supported.


> The only problem I noticed involved a few cases where the expected cycle
> count (?cyc) did not match (total cycles / count), (evaluated to 4cyc
> instead of 5cyc)....

Isn't it possible that the same instruction gets different cycles on
different invocations?   I think the disassembly output shows the cycles
based on instructions being executed in strictly linear order...?


> possibly an issue assigning cycles to the previous instruction in some
> cases?

I think that should affect only first and last instruction executed in
a profiling session.


> Anyway for the most part it looks right and is
> usable for performance analysis - with just that exception...
> 
> *One 'request' I would have is for more decimal precision on the global
> %age values, since it represents time on individual instructions of which
> there are many - as opposed to whole functions - this means every opcode
> needs much finer granularity for differentiation - 2 digits isn't nearly
> enough. Most opcodes come out at 0.00%, with the highest being 7.xx% or
> so for cases with long blocking waits... probably 4 to 6 digits is
> needed to make this bit useful even on a small DSP program... and t's
> much more severe on large CPU programs.*

To get more digits, you can apply the attached (untested) patch
to Hatari sources.

But I would suggest starting with the post-processor so that you get
function level information, percentages on that should be much
larger.  And then when looking at the instruction level stuff, you
could just look at the counts themselves, not percentages.

If you want meaningful percentages, it might be better to device a test
that runs (mostly) just that function.  One possibility is to set up
breakpoints that start and stop profiling when the function is entered
and exited and each time dump a new profile [1].  I could then do
a separate script that sums together the information in these profiles.

[1] This will need some facility to the debugger which can generate
    a new file name each time it's used, maybe some debugger variable
    that increases its value each time its value is queried.  It could
	be used like this:
		profile addresses 0 file-"increment".txt

Btw. You can already use a variable values in filenames.  For example
following will use PC register value as part of file name:
	> profile addressses 0 pc-"pc".txt
	> q
	$ ls pc*
	pc-$fc1636.txt

There just isn't a suitable incrementor variable, but hardest thing
in that is coming up with a good name for it.  ;-)


> I'll look more closely at the CPU/devpac symbols later today if I can.

You might post-pone it to tomorrow, so that I have time to test
(and potentially improve) user-space program handling first.


	- Eero
diff -r e675970e0a39 src/debug/68kDisass.c
--- a/src/debug/68kDisass.c	Thu Jan 31 15:15:50 2013 +0200
+++ b/src/debug/68kDisass.c	Thu Jan 31 15:47:48 2013 +0200
@@ -2468,7 +2468,7 @@
 			Uint32 count, cycles, misses;
 			if (Profile_CpuAddressData(addr, &percentage, &count, &cycles, &misses))
 			{
-				sprintf(commentBuffer, "%5.2f%% (%u, %u, %u)", percentage, count, cycles, misses);
+				sprintf(commentBuffer, "%8.5f%% (%u, %u, %u)", percentage, count, cycles, misses);
 				Disass68kComposeStr(lineBuffer, commentBuffer, optionPosComment+1, 0);
 			}
 		}
diff -r e675970e0a39 src/falcon/dsp_disasm.c
--- a/src/falcon/dsp_disasm.c	Thu Jan 31 15:15:50 2013 +0200
+++ b/src/falcon/dsp_disasm.c	Thu Jan 31 15:47:48 2013 +0200
@@ -554,7 +554,7 @@
 	}
 	if (offset > 2 && Profile_DspAddressData(prev_inst_pc, &percentage, &count, &cycles)) {
 		offset -= 2;
-		sprintf(str_instr2+offset, "%5.2f%% (%d, %d)\n", percentage, count, cycles);
+		sprintf(str_instr2+offset, "%8.5f%% (%d, %d)\n", percentage, count, cycles);
 	}
 	return str_instr2;
 } 


Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/