Doug,
A small question:
I've always assumed that an instruction accessing P:, X: and Y:
simultaneously should add +2 cycles (0 for P:, +1 for the X: access
and +1 for the Y: access).
I've never found anywhere whether +1 is the correct value (it could
be +2 per access, or +3, ...).
Can you confirm this +1 per extra access to external memory?
Regards
Laurent
On 31/01/2013 21:35, Laurent Sallafranque wrote:
Hi,
I've had a look at the example you sent us.
We must be very careful about what we are talking about here.
In your example:
p:023e 57eb00 [05 cyc] move x:(r3+n3),b 0.02% (11300, 45200)
When you trace your program (i.e. the program is running with the
registers in the correct state), the cycle count displayed for this
instruction is [04 cycles] if the X memory address is < $200, or
[05 cycles] if the X memory address is > $200.
So, while executing your code, the value was always [04 cycles]
(as you said, X is always in internal memory) and Eero's
profiler is correct.
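The rule described above can be sketched as a small function. This is only an illustration, not Hatari's code: the names `INTERNAL_LIMIT` and `move_x_cycles` and the 16-bit address wrap are assumptions, and so is treating an address of exactly $200 as external (the message only says "< $200" and "> $200").

```python
# Illustrative sketch only; names and the handling of address $200 are assumptions.
INTERNAL_LIMIT = 0x200   # X addresses below this are treated as internal memory
BASE_CYCLES = 4          # cycles shown for the instruction when X is internal
EXTERNAL_PENALTY = 1     # extra cycle shown when X is external


def move_x_cycles(r3: int, n3: int) -> int:
    """Cycles for `move x:(r3+n3),b` given the current register values."""
    address = (r3 + n3) & 0xFFFF  # assumed 16-bit address arithmetic
    if address < INTERNAL_LIMIT:
        return BASE_CYCLES
    return BASE_CYCLES + EXTERNAL_PENALTY


print(move_x_cycles(0x0100, 0x0010))  # address $0110: internal access
print(move_x_cycles(0x0400, 0x0010))  # address $0410: external access
```

The point is that the result depends only on the register values at the moment the count is computed, not on the instruction bytes themselves.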
So what? :)
At the end, when the profiler is called for each instruction,
the DSP "executes" each instruction and returns to its initial
state (to be able to count the cycles).
But in this case, R3 or N3 may hold values that give an address
> $200, and the DSP adds 1 to the cycles.
You can't know how many cycles an instruction will consume
unless you actually run your program and TRACE it.
When you DISASM the DSP memory, the cycles are computed from the
current values of the registers, which are fixed at that moment and
probably not what they are while the program is running.
So, when TRACING code, the cycle count shown in front of the
instruction is correct.
When DISASSEMBLING the DSP memory, it can be wrong, but Eero's
sum of cycles seems correct.
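To make the difference between tracing and disassembling concrete, here is a minimal sketch. The register values are hypothetical, the helper `cycles_for` is made up for illustration, and reading the pair (11300, 45200) from the listing as (execution count, summed cycles) is an assumption.

```python
# Illustrative sketch only; register values and helper names are hypothetical.
INTERNAL_LIMIT = 0x200


def cycles_for(address: int) -> int:
    """4 cycles for an internal X access, 5 for an external one (assumed rule)."""
    return 4 if address < INTERNAL_LIMIT else 5


n3 = 0x0004

# While tracing: R3 as it was each time the instruction really executed.
traced_r3_values = [0x0010, 0x0020, 0x0030]
traced_cycles = [cycles_for(r3 + n3) for r3 in traced_r3_values]

# While disassembling afterwards: whatever R3 happens to contain now.
leftover_r3 = 0x0800
disasm_cycles = cycles_for(leftover_r3 + n3)

print(traced_cycles)   # per-execution counts seen while tracing
print(disasm_cycles)   # count shown by a later disassembly

# Figures from the listing, read as (execution count, summed cycles).
executions, total_cycles = 11300, 45200
print(total_cycles / executions)  # average cycles per execution
```

Under that reading, the summed cycles correspond to 4 cycles per execution, which matches the traced value even though the disassembly line shows [05 cyc].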
(You can see this in many other parts of your file:
5 lines after,
3 more lines after,
...)
Just ask again if anything is not clear.
Regards
Laurent
On 31/01/2013 20:06, Douglas Little wrote:
BTW I sent a DSP profile/disassembly listing earlier
for Laurent but I don't know if it made it to the mailing list -
it had a 160kb attachment?
D.
On 31 January 2013 19:04, Douglas Little <doug694@xxxxxxxxxxxxxx> wrote:
With the latest revision of the Hatari code (#d9c87d1f668d)
it looks like the symbol import and disassembly now
match the source code for my case:
Disassembly:
[...]
flush_visplanes:
14a482 : b27c 0011   cmp.w #$11,d1    0.01% (1959, 4294967295, 1959)
14a486 : 6648        bne.s $14a4d0    0.01% (1959, 718, 0)
Sourcecode:
flush_visplanes:
cmp.w #$0011,d1
bne.s .no_full
(...although the CPU cycle counts are still wild, even in a debug build).
D.