Re: [hatari-devel] Linux boot freeze with Hatari git on 040/060 + MMU + CPU prefetch/cache



Hi,

Hm. After removing the few patches that I had applied, the backtrace looks slightly clearer:
-----------------------------
- 1. 0x2a3f7a: atari_stram_init -0x10627e (panic +0x0)
- 2. 0x3a9b78: config_atari +0x42a
- 3. 0x3a8baa: setup_arch +0x1b8
- 4. 0x3a5f3e: start_kernel +0x44
-----------------------------

Profile disassembly indeed shows atari_stram_init() calling panic():
-----------------------------
> d atari_stram_init
atari_stram_init:
$003aa1f8 4e56 0000       link.w  a6,#$0        0.00% (1, 16, 1, 0)
$003aa1fc 41f9 0039 1090  lea.l   $391090.l,a0  0.00% (1, 16, 1, 0)
$003aa202 4a90            tst.l   (a0)          0.00% (1, 16, 0, 0)
$003aa204 57c0            seq.b   d0            0.00% (1, 24, 0, 0)
$003aa206 49c0            extb.l  d0            0.00% (1, 16, 0, 0)
$003aa208 4480            neg.l   d0            0.00% (1, 16, 0, 0)
$003aa20a 23c0 0039 1534  move.l  d0,$391534.l  0.00% (1, 16, 0, 0)
$003aa210 2239 0039 10b8  move.l  $3910b8.l,d1  0.00% (1, 16, 1, 0)
$003aa216 4280            clr.l   d0            0.00% (1, 16, 0, 0)
$003aa218 b081            cmp.l   d1,d0         0.00% (1, 16, 0, 0)
$003aa21a 6d0c            blt.b   $3aa228       0.00% (1, 16, 0, 0)
$003aa21c 4879 0033 eb4e  pea.l   $33eb4e.l     0.00% (1, 16, 1, 0)
$003aa222 4eb9 002a 3f7a  jsr     $2a3f7a.l     0.00% (1, 16, 0, 0)
-----------------------------
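
The tst.l / seq.b / extb.l / neg.l sequence above is a common m68k idiom for turning "is this long zero?" into a 0/1 integer. Below is a rough C sketch of what those four instructions compute. The function name is made up for illustration. That $391090 and $391534 correspond to m68k_memory[0].addr and kernel_in_stram is only my assumption from reading stram.c, as I have not checked it against System.map.

```c
#include <stdint.h>

/*
 * Illustrative only: mirrors the tst.l / seq.b / extb.l / neg.l sequence.
 * Returns 1 when 'value' is zero, otherwise 0.
 */
int32_t is_zero_flag(uint32_t value)
{
    int8_t  byte = (value == 0) ? -1 : 0; /* seq.b: 0xFF if Z flag set, else 0x00 */
    int32_t wide = byte;                  /* extb.l: sign-extend byte to long */
    return -wide;                         /* neg.l: -(-1) = 1, -(0) = 0 */
}
```

With a first memory block at address 0 this gives 1, and with e.g. a TT-RAM block at $01000000 it gives 0.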

This panic seems to happen when none of the memory blocks is mapped at address 0 (a block at address 0 indicates the presence of ST-RAM):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/m68k/atari/stram.c#n68
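
As a minimal sketch of that check (not the kernel's actual code; the struct and function names here are hypothetical stand-ins), the logic amounts to scanning the memory block list for one that starts at address 0, and giving up if there is none:

```c
#include <stddef.h>

/* Hypothetical stand-in for a bootinfo memory block descriptor. */
struct mem_block {
    unsigned long addr; /* physical start address of the block */
    unsigned long size; /* block length in bytes */
};

/*
 * Returns the index of the first block starting at address 0 (ST-RAM),
 * or -1 when no such block exists, which is the case where the kernel
 * would end up calling panic().
 */
int find_stram_block(const struct mem_block *blocks, int count)
{
    for (int i = 0; i < count; i++) {
        if (blocks[i].addr == 0)
            return i;
    }
    return -1;
}
```

So a list with only a TT-RAM block at $01000000, or a list with zero blocks, would both fail the check. Either could result if the kernel reads the bootinfo memory entries incorrectly.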

Memory block addresses are set from bootinfo:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/m68k/kernel/setup_mm.c#n110

The bootinfo is provided either by the TOS boot program or by Hatari's Linux loader (LILO):
https://git.tuxfamily.org/hatari/hatari.git/tree/src/lilo.c#n653


As LILO writes to the emulated memory before emulation starts, there should be no need e.g. to flush anything.

And while I do not get a symbol trace or backtrace when using the TOS boot program, Linux boot also seems to freeze (040) or double bus error (060) with it, so this does not seem to be LILO specific.


=> Seems there's nowadays some problem with MMU mappings when prefetch and/or caches are enabled?


	- Eero


On 16.3.2024 0.57, Eero Tamminen wrote:
Hi,

On 15.3.2024 12.38, Nicolas Pomarède wrote:
On 15/03/2024 at 11:30, Eero Tamminen wrote:
...
[1] If CPU caches & prefetch are enabled, very early on boot Linux double bus errors on 060, and freezes on 040. Same happens also with v2.4.1 though, so it's not a regression.

could be useful to get traces about this, even if it's not a regression.

Using "panic.ini" & "lilo.cfg" from [1]:
-----------------------------
$ hatari --parse panic.ini --log-level debug --fast-forward on \
   --machine falcon --cpulevel 4 --vme none --dsp none --fpu internal \
   --cpuclock 32 --mmu on -s 14 --ttram 64 --addr24 off --cpu-exact on \
   --compatible on --natfeats on -c lilo.cfg --lilo "debug=nfcon"
...
console_flush_all
console_srcu_read_lock
console_srcu_read_unlock
__srcu_read_unlock
__printk_safe_enter
up
__printk_safe_exit
wake_up_klogd
make_task_dead
panic
1. CPU breakpoint condition(s) matched 1 times.
     pc = $2a3e5e
Finalizing costs for 6 non-returned functions:
- 1. 0x2a3e5e: make_task_dead +0x27b49a (panic +0x0)
- 2. 0x00495e: die_if_kernel +0x46
- 3. 0x004aac: trap_c +0x140
- 4. 0x002356: config_atari -0x3a53f8 (trap +0x1a)
- 5. 0x3a6baa: setup_arch +0x1b8
- 6. 0x3a3f3e: start_kernel +0x44
-----------------------------


With 060, the panic happens in the same place, but through several traps, which may explain the eventual double bus error:
-----------------------------
...
console_flush_all
console_srcu_read_lock
console_srcu_read_unlock
__srcu_read_unlock
__printk_safe_enter
up
__printk_safe_exit
wake_up_klogd
make_task_dead
panic
1. CPU breakpoint condition(s) matched 1 times.
     pc = $2a3e5e
Finalizing costs for 33 non-returned functions:
- 1. 0x2a3e5e: make_task_dead +0x27b49a (panic +0x0)
- 2. 0x00495e: die_if_kernel +0x46
- 3. 0x004aac: trap_c +0x140
- 4. 0x002356: sched_clock -0x43c68 (trap +0x1a)
- 5. 0x049c12: vprintk_store +0x50
- 6. 0x04af3e: vprintk_emit +0x76
- 7. 0x04afe0: vprintk_default +0x14
- 8. 0x04b1bc: vprintk +0x40
- 9. 0x2a4714: _printk +0xc
- 10. 0x0049de: trap_c +0x72
- 11. 0x002356: sched_clock -0x43c68 (trap +0x1a)
- 12. 0x049c12: vprintk_store +0x50
- 13. 0x04af3e: vprintk_emit +0x76
- 14. 0x04afe0: vprintk_default +0x14
- 15. 0x04b1bc: vprintk +0x40
- 16. 0x2a4714: _printk +0xc
- 17. 0x0049de: trap_c +0x72
- 18. 0x002356: sched_clock -0x43c68 (trap +0x1a)
- 19. 0x049c12: vprintk_store +0x50
- 20. 0x04af3e: vprintk_emit +0x76
- 21. 0x04afe0: vprintk_default +0x14
- 22. 0x04b1bc: vprintk +0x40
- 23. 0x2a4714: _printk +0xc
- 24. 0x0049de: trap_c +0x72
- 25. 0x002356: sched_clock -0x43c68 (trap +0x1a)
- 26. 0x049c12: vprintk_store +0x50
- 27. 0x04af3e: vprintk_emit +0x76
- 28. 0x04afe0: vprintk_default +0x14
- 29. 0x04b1bc: vprintk +0x40
- 30. 0x2a4714: _printk +0xc
- 31. 0x3a78de: config_atari +0x190
- 32. 0x3a6baa: setup_arch +0x1b8
- 33. 0x3a3f3e: start_kernel +0x44
-----------------------------


does it work if cpu cache is not enabled but prefetch or cycle exact are enabled? i.e., is it the cpu cache that causes the problem?

There's a panic in the same place regardless of whether "--compatible on" or "--cycle-exact on" is used.

How do I enable cycle-exact but not cache emulation? I thought the "--cpu-exact" option controls both.


     - Eero

PS. The kernel sources can be fetched with:
$ git clone --depth 1 --branch v6.8 \
   git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

And the kernel can be built from them easily on Debian:
$ cd linux
$ cp hatari/tools/linux/kernel.config .config
$ sudo apt install bc bison flex gcc-m68k-linux-gnu
$ ARCH=m68k CROSS_COMPILE=m68k-linux-gnu- make -j4 vmlinux

[1] input files for Hatari:

-------- panic.ini ----------
setopt --trace cpu_symbols
history cpu 256
profile on
a panic
-----------------------------

--------- lilo.cfg ----------
[LILO]
Kernel = vmlinux
Symbols = System.map
Args = video=atafb:sthigh console=tty
KernelToFastRam = FALSE
Ramdisk =
-----------------------------




