Re: [chrony-users] Chronyd unexpected abort after server was set to "online"


On Mon, Jun 16, 2014 at 11:34:16AM +0200, Arndt Kritzner wrote:
> Hi,
> we have been using chrony on a device with an AVR32 CPU for a while and this seemed to work. But today I checked the function
> and found that chronyd always aborts after the servers come online. Chronyd start looks normal:
> ~ # chronyd -R -d
> main.c:355:(main)[13-13:50:51] chronyd version 1.29 starting
> sys_linux.c:1022:(get_version_specific_details)[13-13:50:51] Linux kernel major=3 minor=4 patch=77
> sys_linux.c:1080:(get_version_specific_details)[13-13:50:51] hz=100 shift_hz=7 freq_scale=1.00000000 nominal_tick=10000
> slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
> But after setting servers to "online" through chronyc chronyd closes:
> ntp_core.c:1575:(NCR_TakeSourceOnline)[13-13:51:42] Source online
> chronyd: sourcestats.c: 345: find_best_sample_index: Assertion `elapsed >= 0.0' failed.
> Aborted
> ~ #

Hm, that's interesting. Can you get a backtrace for this crash or get
chronyd output with this patch, so we can see the value of elapsed and
the number of samples?

--- a/sourcestats.c
+++ b/sourcestats.c
@@ -342,6 +342,7 @@ find_best_sample_index(SST_Stats inst, double *times_back)
     j = get_buf_index(inst, i);
     elapsed = -times_back[i];
+    LOG(LOGS_INFO, LOGF_SourceStats, "n=%d i=%d best=%d elapsed=%e", inst->n_samples, i, best_index, elapsed);
     assert(elapsed >= 0.0);
     root_distance = inst->root_dispersions[j] + elapsed * inst->skew + 0.5 * inst->root_delays[j];

I guess this is also present in the latest 1.30-pre1. It might help us
to see the complete output of "chronyd -d -d" when compiled with

> /etc/chrony.conf:
> server offline
> refclock SHM 0 offset 0.0 delay 0.2 refid GPS
> refclock SHM 1 offset 0.0 delay 0.0 refid PPS
> refclock SOCK /tmp/chrony.ttyS3.sock
> driftfile /etc/chrony.drift
> keyfile /etc/chrony.keys
> commandkey 1
> makestep 1000 10
> initstepslew 30

The config looks ok to me.

> Any explanation for this behaviour? And any clue how to solve it? The internet connection is not permanent and
> switches between LAN and the cellphone network. That's the reason we use "offline"/"online" switching.

I think in older versions this could happen when something other than
chronyd stepped the system clock back and an "out of order" sample was
accumulated in chronyd. A check for that was added in 1.27, so it
shouldn't happen with 1.29. This looks like another bug and it needs
to be fixed.
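
To illustrate the failure mode described above, here is a minimal sketch
in C. It is not chrony's actual code: the helper names and the flat,
oldest-first array of sample times are hypothetical, and it assumes
times_back[i] holds each sample time minus the newest sample time, so
that elapsed = -times_back[i] is non-negative whenever samples are in
order. A sample accumulated after the clock was stepped back breaks that
ordering and makes elapsed negative for some index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guard: accept a new sample only if its timestamp is
 * strictly newer than the most recently stored one. */
int sample_is_in_order(const double *sample_times, size_t n_samples,
                       double new_time)
{
  if (n_samples == 0)
    return 1;
  return new_time > sample_times[n_samples - 1];
}

/* Assumed convention: times_back[i] = sample_times[i] - newest sample
 * time, which is <= 0 for every i when the samples are ordered. */
void compute_times_back(const double *sample_times, size_t n_samples,
                        double *times_back)
{
  size_t i;
  double newest;

  if (n_samples == 0)
    return;

  newest = sample_times[n_samples - 1];
  for (i = 0; i < n_samples; i++)
    times_back[i] = sample_times[i] - newest;
}

/* Count the samples for which the assertion `elapsed >= 0.0' would fail. */
size_t count_negative_elapsed(const double *times_back, size_t n_samples)
{
  size_t i, bad = 0;

  for (i = 0; i < n_samples; i++) {
    double elapsed = -times_back[i];
    if (elapsed < 0.0)
      bad++;
  }
  return bad;
}
```

With sample times 10, 40, 30 (the last one taken after a backward step),
the second sample ends up with a negative elapsed value, which is the
condition the assertion catches. Rejecting the sample up front with a
check like sample_is_in_order() avoids that state.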

Thanks for the report.

Miroslav Lichvar

