Re: [chrony-users] Chrony malfunction under ARM/Buildroot
On Thu, 14 Apr 2016, Mauro Condarelli wrote:
Hi,
I'm having major problems using Chrony in an embedded system sporting an
oldish Atmel AT91SAM9G25.
I am using chrony-2.3 compiled with "--without-seccomp --without-tomcrypt
--without-nss" under Linux 4.4.1.
/etc/chrony.conf is:
============================
#initstepslew 30 2.it.pool.ntp.org 0.pool.ntp.org 1.pool.ntp.org
server 0.it.pool.ntp.org
server 1.it.pool.ntp.org
server 2.pool.ntp.org
server 3.pool.ntp.org
logdir /var/log/chrony
log rtc statistics measurements tracking
logchange 1
driftfile /var/lib/chrony/drift
keyfile /etc/chrony.keys
generatecommandkey
makestep 1.0 3
maxupdateskew 100.0
dumponexit
dumpdir /var/lib/chrony
rtconutc
rtcautotrim 1
rtcfile /var/lib/chrony/rtc
bindcmdaddress /var/run/chrony/chronyd.sock
============================
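As a reading aid for the "makestep 1.0 3" line in the config above, here is a minimal Python sketch of the decision that directive describes: step the clock (instead of slewing it) when the offset exceeds the threshold, but only during the first few clock updates after chronyd starts. The function name and the 1-based update_number parameter are assumptions made for illustration; this is a simplified model, not chrony's actual code.

```python
from datetime import datetime


def should_step(offset_seconds, update_number, threshold=1.0, limit=3):
    """Simplified model of `makestep threshold limit`.

    update_number is the 1-based count of clock updates since start
    (a hypothetical parameter for this sketch). A negative limit is
    treated as "no limit on the number of updates".
    """
    within_limit = limit < 0 or update_number <= limit
    return within_limit and abs(offset_seconds) > threshold


# Illustrative offset: clock starts at Jan 11th 2007, real date is Apr 14th 2016.
cold_boot_offset = (datetime(2016, 4, 14) - datetime(2007, 1, 11)).total_seconds()

print(should_step(cold_boot_offset, update_number=1))  # large offset, first update
print(should_step(0.2, update_number=1))               # small offset: slew instead
print(should_step(5.0, update_number=4))               # past the limit: slew instead
```

With this config, a cold boot that starts in 2007 would be corrected by one large forward step on an early update rather than by a gradual slew.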
I have two (possibly related) malfunctions:
1) if and when the RTC is severely off (complete loss of power, including
battery backup) the system comes up with a specific date (Jan 11th, 2007) and
after a while (i.e. after chrony has had time to synchronize over the
network) the system seems to hang with very little actually working (notably
the system still answers "ping"), so that it seems completely dead. It turns
out it is not really dead, but only in "suspended animation", as it always
recovers completely (with correct date & time) after a LOOOOooonggggg time
(>5min).
With no rtc (which is a hardware problem, not a chrony problem) there is no
way the system can know what the time is or how to make sure everything is
always monotonic. You will just have to make sure that the problem never
occurs.
I have never seen chrony "hang" the system. There is something else going on.
Have you looked at all the logs (dmesg, journalctl, /var/log/messages, ...)
to see what is going on? It may be that some program notices that chrony
is changing the time and goes into a snit.
2) if the system comes up well (chrony seems to work as advertised) I
sometimes get very bad system crashes (see below).
Neither of these conditions happens if I kill chrony early (before it has any
chance to synchronize) or if it is inactive (e.g. if the system is
disconnected from the internet).
What should I check?
Can someone help me debug this?
NOTE: the need is to have a system with strictly monotonic time (i.e.: time
can jump ahead, but can never jump back for any reason, even across reboots).
If necessary I can test with specific patches... or change completely.
Thanks in Advance
Mauro
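The "never jump back, even across reboots" requirement in the NOTE above can be pictured with a minimal Python sketch: keep a timestamp file whose modification time records the last known good time, and at boot never accept a clock reading earlier than that. The function names and the idea of a dedicated stamp file are assumptions for illustration, not something proposed in this thread. The sketch only computes a floor value. It does not set the clock, and it cannot recover the true time when the RTC is lost.

```python
import os
import time


def touch_stamp(stamp_path, now=None):
    """Record `now` (default: current time) as the stamp file's mtime."""
    now = time.time() if now is None else now
    with open(stamp_path, "a"):
        pass  # create the file if it does not exist yet
    os.utime(stamp_path, (now, now))


def boot_time_floor(stamp_path, now=None):
    """Return a time that is never earlier than the last recorded stamp."""
    now = time.time() if now is None else now
    try:
        saved = os.path.getmtime(stamp_path)
    except OSError:
        return now  # no stamp available: nothing to compare against
    return max(now, saved)
```

A boot script could call touch_stamp() periodically and at shutdown, and compare against boot_time_floor() before any time-sensitive service starts. After a power loss the result would be stale but not in the past relative to earlier runs.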
============================
[ 232.117187] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s!
[kworker/0:1:298]
OK, that has nothing to do with chrony. What are you running?
[ 232.117187] Modules linked in: cfg80211 usb_f_acm usb_f_serial
libcomposite configfs u_serial atmel_usba_udc
[ 232.117187] CPU: 0 PID: 298 Comm: kworker/0:1 Tainted: G W 4.4.1 #1
[ 232.117187] Hardware name: Atmel AT91SAM9
[ 232.117187] Workqueue: events rtc_timer_do_work
So your rtc seems to be having trouble.
[ 232.117187] task: c7a6f420 ti: c799c000 task.ti: c799c000
[ 232.117187] PC is at timerqueue_del+0x1c/0x84
[ 232.117187] LR is at rtc_timer_do_work+0xac/0x1a8
[ 232.117187] pc : [<c01c8e2c>] lr : [<c02c43fc>] psr: 00000013
[ 232.117187] sp : c799de00 ip : c799de20 fp : c799de1c
[ 232.117187] r10: 00000000 r9 : c7b7ea20 r8 : c7edee00
[ 232.117187] r7 : c7b7e9e0 r6 : c7b7e978 r5 : c7b7e9e0 r4 : c7b7ea20
[ 232.117187] r3 : 00000001 r2 : 04405600 r1 : c7b7ea20 r0 : c7b7e9e0
[ 232.117187] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
none
[ 232.117187] Control: 0005317f Table: 273dc000 DAC: 00000053
[ 232.117187] CPU: 0 PID: 298 Comm: kworker/0:1 Tainted: G W 4.4.1 #1
[ 232.117187] Hardware name: Atmel AT91SAM9
[ 232.117187] Workqueue: events rtc_timer_do_work
[ 232.117187] [<c0010468>] (unwind_backtrace) from [<c000dad0>]
(show_stack+0x20/0x24)
[ 232.117187] [<c000dad0>] (show_stack) from [<c01c10c0>]
(dump_stack+0x20/0x28)
[ 232.117187] [<c01c10c0>] (dump_stack) from [<c000afa4>]
(show_regs+0x1c/0x20)
[ 232.117187] [<c000afa4>] (show_regs) from [<c006a748>]
(watchdog_timer_fn+0x150/0x1a4)
[ 232.117187] [<c006a748>] (watchdog_timer_fn) from [<c00515b8>]
(__hrtimer_run_queues.constprop.5+0x140/0x21c)
[ 232.117187] [<c00515b8>] (__hrtimer_run_queues.constprop.5) from
[<c005181c>] (hrtimer_interrupt+0xa0/0x1c0)
[ 232.117187] [<c005181c>] (hrtimer_interrupt) from [<c02e26b8>]
(ch2_irq+0x30/0x38)
[ 232.117187] [<c02e26b8>] (ch2_irq) from [<c00469e8>]
(handle_irq_event_percpu+0x7c/0x1dc)
[ 232.117187] [<c00469e8>] (handle_irq_event_percpu) from [<c0046b80>]
(handle_irq_event+0x38/0x4c)
[ 232.117187] [<c0046b80>] (handle_irq_event) from [<c0049a5c>]
(handle_fasteoi_irq+0x9c/0x110)
[ 232.117187] [<c0049a5c>] (handle_fasteoi_irq) from [<c0046188>]
(generic_handle_irq+0x28/0x38)
[ 232.117187] [<c0046188>] (generic_handle_irq) from [<c0046440>]
(__handle_domain_irq+0x8c/0xb0)
[ 232.117187] [<c0046440>] (__handle_domain_irq) from [<c000948c>]
(aic_handle+0xb8/0xc0)
[ 232.117187] [<c000948c>] (aic_handle) from [<c000e524>]
(__irq_svc+0x44/0x58)
[ 232.117187] Exception stack(0xc799ddb0 to 0xc799ddf8)
[ 232.117187] dda0: c7b7e9e0 c7b7ea20
04405600 00000001
[ 232.117187] ddc0: c7b7ea20 c7b7e9e0 c7b7e978 c7b7e9e0 c7edee00 c7b7ea20
00000000 c799de1c
[ 232.117187] dde0: c799de20 c799de00 c02c43fc c01c8e2c 00000013 ffffffff
[ 232.117187] [<c000e524>] (__irq_svc) from [<c01c8e2c>]
(timerqueue_del+0x1c/0x84)
[ 232.117187] [<c01c8e2c>] (timerqueue_del) from [<c02c43fc>]
(rtc_timer_do_work+0xac/0x1a8)
[ 232.117187] [<c02c43fc>] (rtc_timer_do_work) from [<c002c914>]
(process_one_work+0x208/0x388)
[ 232.117187] [<c002c914>] (process_one_work) from [<c002d720>]
(worker_thread+0x2d8/0x430)
[ 232.117187] [<c002d720>] (worker_thread) from [<c0031ce4>]
(kthread+0xe0/0xf4)
[ 232.117187] [<c0031ce4>] (kthread) from [<c000a430>]
(ret_from_fork+0x14/0x24)
[ 232.117187] Kernel panic - not syncing: softlockup: hung tasks
[ 232.117187] CPU: 0 PID: 298 Comm: kworker/0:1 Tainted: G W L 4.4.1 #1
[ 232.117187] Hardware name: Atmel AT91SAM9
[ 232.117187] Workqueue: events rtc_timer_do_work
[ 232.117187] [<c0010468>] (unwind_backtrace) from [<c000dad0>]
(show_stack+0x20/0x24)
[ 232.117187] [<c000dad0>] (show_stack) from [<c01c10c0>]
(dump_stack+0x20/0x28)
[ 232.117187] [<c01c10c0>] (dump_stack) from [<c008c2b0>] (panic+0x88/0x1f0)
[ 232.117187] [<c008c2b0>] (panic) from [<c006a774>]
(watchdog_timer_fn+0x17c/0x1a4)
[ 232.117187] [<c006a774>] (watchdog_timer_fn) from [<c00515b8>]
(__hrtimer_run_queues.constprop.5+0x140/0x21c)
[ 232.117187] [<c00515b8>] (__hrtimer_run_queues.constprop.5) from
[<c005181c>] (hrtimer_interrupt+0xa0/0x1c0)
[ 232.117187] [<c005181c>] (hrtimer_interrupt) from [<c02e26b8>]
(ch2_irq+0x30/0x38)
[ 232.117187] [<c02e26b8>] (ch2_irq) from [<c00469e8>]
(handle_irq_event_percpu+0x7c/0x1dc)
[ 232.117187] [<c00469e8>] (handle_irq_event_percpu) from [<c0046b80>]
(handle_irq_event+0x38/0x4c)
[ 232.117187] [<c0046b80>] (handle_irq_event) from [<c0049a5c>]
(handle_fasteoi_irq+0x9c/0x110)
[ 232.117187] [<c0049a5c>] (handle_fasteoi_irq) from [<c0046188>]
(generic_handle_irq+0x28/0x38)
[ 232.117187] [<c0046188>] (generic_handle_irq) from [<c0046440>]
(__handle_domain_irq+0x8c/0xb0)
[ 232.117187] [<c0046440>] (__handle_domain_irq) from [<c000948c>]
(aic_handle+0xb8/0xc0)
[ 232.117187] [<c000948c>] (aic_handle) from [<c000e524>]
(__irq_svc+0x44/0x58)
[ 232.117187] Exception stack(0xc799ddb0 to 0xc799ddf8)
[ 232.117187] dda0: c7b7e9e0 c7b7ea20
04405600 00000001
[ 232.117187] ddc0: c7b7ea20 c7b7e9e0 c7b7e978 c7b7e9e0 c7edee00 c7b7ea20
00000000 c799de1c
[ 232.117187] dde0: c799de20 c799de00 c02c43fc c01c8e2c 00000013 ffffffff
[ 232.117187] [<c000e524>] (__irq_svc) from [<c01c8e2c>]
(timerqueue_del+0x1c/0x84)
[ 232.117187] [<c01c8e2c>] (timerqueue_del) from [<c02c43fc>]
(rtc_timer_do_work+0xac/0x1a8)
[ 232.117187] [<c02c43fc>] (rtc_timer_do_work) from [<c002c914>]
(process_one_work+0x208/0x388)
[ 232.117187] [<c002c914>] (process_one_work) from [<c002d720>]
(worker_thread+0x2d8/0x430)
[ 232.117187] [<c002d720>] (worker_thread) from [<c0031ce4>]
(kthread+0xe0/0xf4)
[ 232.117187] [<c0031ce4>] (kthread) from [<c000a430>]
(ret_from_fork+0x14/0x24)
[ 232.117187] ---[ end Kernel panic - not syncing: softlockup: hung tasks
============================
So it looks like there is something seriously wrong with your rtc. Hardware.
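For anyone reading a trace like the one above, here is a minimal Python sketch that pulls out the few lines the diagnosis rests on: the soft-lockup message, the workqueue function, and the symbol the PC was in. The function name and the regular expressions are assumptions written against the log format shown in this thread. Other kernels may format these lines differently.

```python
import re

LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")
WORKQUEUE_RE = re.compile(r"Workqueue:\s+(\S+)\s+(\S+)")
PC_RE = re.compile(r"PC is at (\w+)\+")


def summarize_lockup(log_text):
    """Extract the key facts from a soft-lockup report, if present."""
    summary = {}
    m = LOCKUP_RE.search(log_text)
    if m:
        summary["cpu"] = int(m.group(1))
        summary["stuck_seconds"] = int(m.group(2))
    m = WORKQUEUE_RE.search(log_text)
    if m:
        summary["workqueue"] = m.group(1)
        summary["work_function"] = m.group(2)
    m = PC_RE.search(log_text)
    if m:
        summary["pc_symbol"] = m.group(1)
    return summary


# Three lines copied from the trace in this message.
sample = """\
[ 232.117187] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s!
[ 232.117187] Workqueue: events rtc_timer_do_work
[ 232.117187] PC is at timerqueue_del+0x1c/0x84
"""
print(summarize_lockup(sample))
```

On the sample lines, the work function comes out as rtc_timer_do_work, which is the detail that points at the rtc rather than at chrony itself.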
--
To unsubscribe email chrony-users-request@xxxxxxxxxxxxxxxxxxxx with
"unsubscribe" in the subject.
For help email chrony-users-request@xxxxxxxxxxxxxxxxxxxx with "help" in the
subject.
Trouble? Email listmaster@xxxxxxxxxxxxxxxxxxxx.