Re: [chrony-users] Tracking lost but server selected


To: chrony-users@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [chrony-users] Tracking lost but server selected
From: Ariel Garcia <a.garcia@xxxxxxxxxx>
Date: Mon, 19 Mar 2018 12:59:39 +0100
Organization: Gemfony scientific

Hello Miroslav,

thanks a lot for your enlightening answer :)
In the meantime I've noticed that the state "normalizes" itself after
several hours (I've seen it remain without a reference for anywhere between
1 and 18 hours), but the problem eventually happens again.

The device is connected via a mobile network, with highly variable and poor
RTTs to the NTP servers (I've seen anything between 0.1 and 2+ seconds; the
mean is around 0.8 s). The peer delay (chronyc ntpdata) is usually ~0.8 s as
well.

> Tracking reporting no reference probably means that the selection
> failed at some point (e.g. no majority reached) and the newly selected
> source (if there is one) didn't have an update yet.
> 
> Is there a message in the system log corresponding to the selection of
> 0.0.0.0?

Yes, right, it always prints
    Can't synchronise: no selectable sources
whenever it switches to the "Ref time 1970" state.

> Was there a network change which could cause the delay to be increased?

Well, not on the device side, but as said above the mobile network varies a
lot; delays could be better and are certainly not regular.

> The sources seem to be failing in the test C, which suggests the
> network delay is larger than it used to be before. chronyd is waiting
> for the delay to get back to normal, but if it is a permanent change
> (e.g. network routing has changed which added more delay), it may take
> a while before the sample with the old minimum delay is flushed, or a
> different source with better statistics is selected.

Great! That is very helpful information.

Is there any parameter I could adjust to make it less "sensitive" to the
(bad) network behaviour, like increasing maxdelay? (Currently I have the
default value, 3 s.)
As said above, I've seen RTTs of 2 seconds, but I only look sporadically, so
they may get even worse...

Would having fewer sources (4 intended) help in reaching a majority?

> If you look further back in the measurements log, do you see a point
> when all sources suddenly started to consistently fail the test C?

It is rather the other way around: most of the time I see test C failing:
	111 111 1101
and only every once in a while do you find a server where all bits are set
(logs attached in case they help; I can't really see a pattern).
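
For reference, the flag string quoted above can be decoded with a few lines
of Python. This is only a minimal sketch: the function name failed_tests is
made up for illustration, and the order of the ten flags (NTP tests 1-3 and
5-7, then tests A-D for max delay, delay ratio, delay dev ratio and
synchronisation loop) is assumed from the chrony documentation of the
measurements log, not derived from the attached logs.

```python
# Assumed order of the ten test flags in a measurements log line,
# following the chrony documentation (an assumption, not verified
# against the attached logs).
TEST_NAMES = (
    "1", "2", "3",
    "5", "6", "7",
    "A (max delay)", "B (delay ratio)",
    "C (delay dev ratio)", "D (sync loop)",
)


def failed_tests(flags):
    """Return the names of the tests whose flag is 0 in a flag string
    such as '111 111 1101' (whitespace between groups is ignored)."""
    bits = "".join(flags.split())
    if len(bits) != len(TEST_NAMES) or set(bits) - {"0", "1"}:
        raise ValueError("expected ten 0/1 flags, got %r" % flags)
    return [name for name, bit in zip(TEST_NAMES, bits) if bit == "0"]


if __name__ == "__main__":
    # The pattern seen most of the time in the measurements log.
    print(failed_tests("111 111 1101"))
```

Under that assumed ordering, the only cleared flag in "111 111 1101" is the
third one of the last group, i.e. test C.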

So I guess that when the rate of "all good" points falls even further below
its normal value, I lose the reference?
Do any of the parameters printed in the logs (root delay, dispersion, etc.)
help to predict when the synchronization is going to be lost?

Thanks a lot for your help and for the great tool :)
Ariel

Attachment: logs.tgz
Description: application/compressed-tar

