Re: [hatari-devel] Hatari profiler updates and CPU cycle questions


Hi Doug,

Could you send a small text file with an example of this?
I can check from the DSP source code if needed.


----- Original Message -----
From: "Douglas Little" <doug694@xxxxxxxxxxxxxx>
To: hatari-devel@xxxxxxxxxxxxxxxxxxx
Sent: Thursday, 31 January 2013 15:16:32
Subject: Re: [hatari-devel] Hatari profiler updates and CPU cycle questions

> The only problem I noticed involved a few cases where the expected cycle 
> count (?cyc) did not match (total cycles / count), (evaluated to 4cyc 
> instead of 5cyc).... 

Isn't it possible that the same instruction gets different cycles on 
different invocations? I think the disassembly output shows the cycles 
based on instructions being executed in strictly linear order...? 

It's possible, although the DSP architecture intentionally limits that sort of behaviour - it is supposed to be a 'constant time' processor in most respects. However, that isn't always true: timing can change if data fetches cross the P:$0100 / XY:$0200 boundaries, since memory below those addresses is internal with 100% parallel bus access, while memory above them is external, where accesses are serialised and compete for the bus. 

OTOH, I would expect that sort of switching behaviour to be rare in my case because I only use internal memory for fast access stuff, and not for advancing buffers which may spill over... 

P: addresses can't lead to variable timing because the code is not copied around/relocated (yet). So for unstable timings in my specific program, the instruction would need to be in P:EXT and its data would need to vary, via an addressing mode, across XY:INT/EXT. I don't think this happens, and it did not in the case I observed. 
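
As a rough illustration of the condition described above, here is a minimal Python sketch. The boundaries ($0100 for P:, $0200 for X:/Y:) are taken as quoted in this message and have not been checked against the DSP manual; the names INTERNAL_LIMIT, is_internal and timing_may_vary are hypothetical and are not part of Hatari.

```python
# Internal/external boundaries as quoted in this message (an assumption,
# not verified against the DSP56001 documentation).
INTERNAL_LIMIT = {"P": 0x0100, "X": 0x0200, "Y": 0x0200}


def is_internal(space, addr):
    """Return True if addr lies below the quoted internal boundary for space."""
    return addr < INTERNAL_LIMIT[space]


def timing_may_vary(instr_addr, data_accesses):
    """Apply the condition from the message to one instruction.

    instr_addr    -- P: address of the instruction
    data_accesses -- iterable of (space, addr) pairs observed for that
                     instruction across invocations, space being "X" or "Y"

    Timing is treated as potentially unstable only when the instruction
    sits in external P: memory AND its data accesses hit both internal
    and external X:/Y: memory.
    """
    if is_internal("P", instr_addr):
        return False
    kinds = {is_internal(space, addr) for space, addr in data_accesses}
    return len(kinds) == 2  # saw both internal and external accesses


# A buffer pointer walking across X:$0200 from code in external P: memory.
print(timing_may_vary(0x0400, [("X", 0x01FF), ("X", 0x0200)]))
```

This only encodes the reasoning in the paragraph above; it says nothing about how many cycles each case actually costs.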

Anyway, I'll be able to check more closely as I get used to profiling with this. If I become sure it is wrong in any way, I'll report the details. For now it's just a suspicion. :-) 
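
For reference, the check behind that suspicion (expected cycles versus total cycles / count) could be sketched in Python as below. The row layout (address, expected cycles, total cycles, count) is an assumption made for illustration and is not the actual Hatari profiler output format; find_cycle_mismatches is a hypothetical helper.

```python
def find_cycle_mismatches(rows):
    """Return (address, expected, average) for rows where they disagree.

    rows -- iterable of (address, expected_cycles, total_cycles, count)
            tuples; this layout is hypothetical, not Hatari's own format.
    """
    mismatches = []
    for address, expected, total, count in rows:
        if count == 0:
            continue  # instruction never executed, nothing to compare
        average = total / count
        if average != expected:
            mismatches.append((address, expected, average))
    return mismatches


# Made-up sample: the first row mirrors the 4cyc-vs-5cyc case.
sample = [
    (0x0040, 5, 400, 100),  # expected 5, measured average 4.0
    (0x0042, 2, 200, 100),  # expected 2, measured average 2.0
]
for address, expected, average in find_cycle_mismatches(sample):
    print(f"P:${address:04x} expected {expected}cyc, average {average:g}cyc")
```

A fractional average would hint that the timing varied between invocations, whereas a whole-number average that differs from the expected value, as in the case observed here, would point more towards the expected figure itself.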

To get more digits, you can apply the attached (untested) patch 
to Hatari sources. 

But I would suggest starting with the post-processor so that you get 
function level information, percentages on that should be much 

Yes, of course. However, I didn't want to depend too much on the post-processing just yet: more layers of conversion mean more potential problems (until it all has time to settle), so having those extra digits is handy for confirmation just now. 

I do expect that later it will become mostly redundant as the post-processor makes the output more manageable... 

