Re: [hatari-devel] TT emulation crashes when there is no ACSI drive



On 8/6/19 9:27 AM, Christian Zietz wrote:
I only now took a closer look where things are failing:
    Bus Error writing at address $23000002, PC=$fa005c addr_e3=fa005c

I just noticed that Uwe's first mail is from May, whereas Thomas'
cartridge changes are from June.  Hatari built from the commit just
before Uwe's mail gives the same code at address $fa005c, though.

It seems like either a6 got overwritten in load_n_reloc, or it
somehow caused the stack area pointed to by a6 to be overwritten.
Or TOS gives "bad" data for Pexec() during AUTO/ in your case,
when TT-RAM is present (which I can't reproduce).

To me, accesses to $23xxxxxx always look suspiciously like stack
corruption from an exception (where $23xx is the status register).
However, I don't see how corrupted stack content would end up in A6.
If you look at the cartridge code in my mail, stack pointer is first
moved to a0, and after saving a6 to stack, a0 is moved into a6, then
the extra subroutines are called, after which comes the failing
        clr.l   2(a6)
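
To make the "$23xx looks like a status register" observation concrete,
here is a minimal Python sketch (not part of Hatari; the function names
are made up for illustration). Since the failing instruction is
clr.l 2(a6) and the faulting address is $23000002, a6 must have been
$23000000, so the sketch decodes the high word of that value using the
standard 680x0 status register layout.

```python
def decode_sr(sr: int) -> dict:
    """Decode a 16-bit 680x0 status register word into its fields."""
    return {
        "trace": (sr >> 14) & 0x3,         # T1/T0 trace bits
        "supervisor": bool(sr & 0x2000),   # S bit
        "master": bool(sr & 0x1000),       # M bit (68020 and later)
        "ipl_mask": (sr >> 8) & 0x7,       # interrupt priority mask
        "ccr": sr & 0x1F,                  # X, N, Z, V, C flags
    }


def high_word(value: int) -> int:
    """Return the upper 16 bits of a 32-bit value."""
    return (value >> 16) & 0xFFFF


fault_addr = 0x23000002      # from the bus error message
a6 = fault_addr - 2          # clr.l 2(a6) faulted, so a6 = addr - 2
sr = high_word(a6)
fields = decode_sr(sr)
print(f"a6=${a6:08x} high word=${sr:04x} {fields}")
```

The high word $2300 decodes to supervisor mode with interrupt mask 3,
which is a plausible SR value and is why such addresses hint at an
exception stack frame having been read as a pointer.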

FYI: full cartridge code used by GEMDOS HD emulation for Pexec()
handling is here:

Generally, a bug you cannot reproduce is a bug you cannot fix. Thus, I
think this won't be solved without Uwe's cooperation. Maybe, in case Uwe
changes his mind, you could instruct him how to capture a CPU
instruction trace up to the point where the exception happens. This
often helps me understand seemingly weird effects.

1. Put a non-stripped version of NF_SCSI.PRG into the AUTO folder

2. Save this to debugger.ini file:
------------ debugger.ini ----------------
# debugger automatically loads symbols for last program opened through
# GEMDOS HD, when it's invoked. To invoke it at suitable point, set
# breakpoint to next OS call after opening, i.e. Fread() used by
# cartridge code to read the program binary to memory
breakpoint GemdosOpcode = 0x3F :once

# some useful trace output for pinpointing the issue
trace os_base,scsidrv,cpu_symbols

# enter debugger on bus error
setopt --debug-except bus

# catch CPU instruction history
history cpu 256

3. Run Hatari with it using Uwe's config:
hatari --parse debugger.ini --machine tt --natfeats on --cpu-exact off --compatible off --addr24 off -s 8 --ttram 16 --tos tos306.img gemdos-dir/

4. After Hatari enters debugger on Pexec Fread(), and reads program
   symbols, do:
	> setopt -D  # enable exception catching
	> c          # continue

5. After bus error causes Hatari to drop into debugger,
   do following debugger commands:
	> history 64  # history of executed instructions
	> registers   # reg values on exception

6. Send all Hatari output of above to the list
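
If the captured output is long, the bus error lines can be picked out
before posting. The following is only a rough Python sketch: the
message format is taken from the single line quoted at the top of this
mail, the "reading" variant is an assumption, and the function name is
hypothetical.

```python
import re

# Pattern modelled on the one line quoted in this thread, e.g.
#   Bus Error writing at address $23000002, PC=$fa005c addr_e3=fa005c
# The "reading" alternative is an assumption, not seen in the thread.
BUS_ERROR_RE = re.compile(
    r"Bus Error (?P<access>reading|writing) at address "
    r"\$(?P<addr>[0-9a-fA-F]+), PC=\$(?P<pc>[0-9a-fA-F]+)"
)


def find_bus_errors(log_text: str) -> list:
    """Return (access, address, pc) tuples for each bus error line."""
    return [
        (m.group("access"), int(m.group("addr"), 16), int(m.group("pc"), 16))
        for m in BUS_ERROR_RE.finditer(log_text)
    ]


sample = "Bus Error writing at address $23000002, PC=$fa005c addr_e3=fa005c"
for access, addr, pc in find_bus_errors(sample):
    print(f"{access} ${addr:08x} at PC ${pc:06x}")
```

The full, unfiltered output is still what should go to the list; this
only helps to locate the interesting spot in it.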

As the original report is several months old, it's preferable to do
this with latest Hatari Git version after testing that the issue
isn't something that's already been fixed.

Or -- if there are other users of the NF SCSI feature -- someone else
could try reproducing the bug. (As you know, I run Windows where this
isn't even available, afaik.)

The bus error happens during program loading, not during its
execution.  So NF SCSI shouldn't actually be needed to reproduce
this, *if* the crash actually happens at NF_SCSI.PRG loading.

To verify this, I want either output from above, or at least from:
	--trace os_base,scsidrv

	- Eero
