Re: [hatari-devel] Hatari profiler updates and DSP cycle questions
Hi,
On Friday 01 February 2013, Douglas Little wrote:
> > I haven't committed the min/max cycles code yet as it makes the output
> > more verbose and I'm wondering how useful it is. I guess for 99% of
> > DSP code, there's no difference between min and max cycles, they
> > always are executed the same way...?
>
> I think the main value is for the programmer to notice that a particular
> block of code has an unexpected (potentially large) penalty due to the
> location of data, more than anything else - and a penalty that perhaps
> doesn't show all the time. i.e. perhaps something that's difficult to
> find any other way except reading all the code and layout.
>
> However this is from a developer/optimisation perspective only :-) i.e.
> this is how I would use it. I don't know if it's of use to anyone
> else....
I changed it from min,max to max-min, i.e. the difference. That way
it's much easier to notice when it happens, and a post-processor can
handle the differences as "cache misses".
In the doomino demo, I got such a difference in only one place out
of 1258 instructions:
---------
....
p:0447 0608a0 (04 cyc) rep #$08
0.38% (960218, 3840872, 0)
p:0448 200032 (02 cyc) asl a
3.04% (7681744, 15363488, 0)
p:0449 0bcc67 (04 cyc) btst #7,a1
0.38% (960218, 3840872, 0)
p:044a 0af0a0 00044f (07 cyc) jcc p:$044f
0.38% (960218, 6721526, 0)
p:044c 45f400 ffff00 (05 cyc) move #$ffff00,x1
0.19% (484216, 2421080, 0)
p:044e 200060 (02 cyc) add x1,a
0.19% (484217, 968439, 3)
p:044f 44ee00 (05 cyc) move x:(r6+n6),x0
0.38% (960219, 4801095, 0)
....
---------
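As a rough illustration of how a post-processor might pick out such
lines, here is a minimal Python sketch. It assumes the two-line record
layout shown in the listing above, and it assumes the three numbers in
parentheses are (times executed, total cycles, max-min difference); that
reading is inferred from the figures, not from Hatari documentation. The
names parse_profile and flagged are hypothetical.

```python
import re

# Excerpt copied from the listing above.
SAMPLE = """\
....
p:0447 0608a0 (04 cyc) rep #$08
0.38% (960218, 3840872, 0)
p:0448 200032 (02 cyc) asl a
3.04% (7681744, 15363488, 0)
p:0449 0bcc67 (04 cyc) btst #7,a1
0.38% (960218, 3840872, 0)
p:044a 0af0a0 00044f (07 cyc) jcc p:$044f
0.38% (960218, 6721526, 0)
p:044c 45f400 ffff00 (05 cyc) move #$ffff00,x1
0.19% (484216, 2421080, 0)
p:044e 200060 (02 cyc) add x1,a
0.19% (484217, 968439, 3)
p:044f 44ee00 (05 cyc) move x:(r6+n6),x0
0.38% (960219, 4801095, 0)
....
"""

# Instruction line: address, opcode words, "(NN cyc)", disassembly.
INSTR = re.compile(r"^p:([0-9a-f]{4}) .*\((\d+) cyc\)\s+(.*)$")
# Statistics line: percentage, then the three numbers
# (assumed: count, total cycles, max-min difference).
STATS = re.compile(r"^[\d.]+% \((\d+), (\d+), (\d+)\)$")


def parse_profile(text):
    """Pair each instruction line with the statistics line after it."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = INSTR.match(line)
        if m:
            current = {"addr": m.group(1),
                       "nominal": int(m.group(2)),
                       "asm": m.group(3)}
            continue
        s = STATS.match(line)
        if s and current is not None:
            current["count"] = int(s.group(1))
            current["cycles"] = int(s.group(2))
            current["diff"] = int(s.group(3))
            records.append(current)
            current = None
    return records


def flagged(records):
    """Return the instructions whose cycle count varied (diff != 0)."""
    return [r for r in records if r["diff"] != 0]


records = parse_profile(SAMPLE)
for r in flagged(records):
    print(r["addr"], r["asm"], "diff:", r["diff"])
```

On this excerpt only the "add x1,a" line at p:044e is reported.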
The maximum difference between the cycle counts that "add x1,a" took
was 3.
Either it took up to 5 cycles instead of 2 on just a couple of calls
out of almost half a million, for 5 extra cycles in total:
968439-484217*2 = 5
Which seems unlikely, or the cycle count alternated e.g. between 1 and 4:
968439-484217*1-1*2-121055*4 = 0
:-)
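The two sums above can be checked with a few lines of Python. This is
only an arithmetic check of the figures for "add x1,a" as they appear in
the listing; the variable names are made up for the example, and it
assumes the "(02 cyc)" value is the nominal cost per call.

```python
# Figures for "add x1,a" taken from the listing.
count, total, nominal = 484217, 968439, 2

# Cycles spent beyond count * nominal (the first sum).
excess = total - count * nominal

# The second sum, with its terms exactly as written.
remainder = total - 484217 * 1 - 1 * 2 - 121055 * 4

print(excess, remainder)  # prints: 5 0
```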
- Eero