Re: [chrony-users] Fallback clock on event "Can't synchronise: no reachable sources"


Thanks for the answer!
Based on that, I will collect the tracking statistics, go through the syslog again, and write back to this list once I have them.

On 07.02.2014 15:37, Miroslav Lichvar wrote:
On Fri, Feb 07, 2014 at 02:10:32PM +0100, Ulrich Schwesinger wrote:
I have a system with a write protected system partition and an NTP
server running within a local network. Sporadically, chrony reports
the above mentioned error.
The error persists for roughly 20-30 seconds.
Within that time interval, the system time seems to diverge/drift by
a few tens of milliseconds, which seems unreasonably large to me.
How did you measure the error in that period?
It was reported in /var/log/syslog. The effects of the dropout were indirectly noticeable, because I have some other software running that monitors the timestamps of messages sent via a middleware. Its thresholds are configured to raise errors for timestamp deviations larger than 100 ms.
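
For scale, the reported figures can be turned into an average frequency error. The sketch below is only a back-of-the-envelope calculation: it assumes the drift was roughly 20 to 30 ms (an interpretation of "tens of milliseconds", not a measured value) and that it accumulated linearly over the 20-30 second outage. The function name is made up for illustration.

```python
def implied_frequency_error_ppm(offset_s: float, interval_s: float) -> float:
    """Average frequency error (in ppm) needed to accumulate offset_s within interval_s."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return offset_s / interval_s * 1e6


# Assumed bounds: 20 ms over 30 s (best case) and 30 ms over 20 s (worst case).
low = implied_frequency_error_ppm(0.020, 30.0)
high = implied_frequency_error_ppm(0.030, 20.0)
print(f"implied frequency error: {low:.0f} to {high:.0f} ppm")
```

Under these assumptions the result is in the range of several hundred to more than a thousand ppm, which is far more than a clock that simply keeps running at its last estimated frequency would normally accumulate over such a short interval.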

(a) What exactly is chrony doing upon the event "Can't synchronise:
no reachable sources"? It does not seem to keep the current system
time. Is it perhaps falling back on the hardware clock?
No, nothing with RTC. It should keep the system clock as it was set on
the last update unless chronyd is configured with the fallbackdrift
No, fallbackdrift is not configured. The configuration looks like this (ntp-server-lan is specified in /etc/hosts, two WAN servers exist but when the error happens, no WAN access exists):

server minpoll 8
server minpoll 8
server ntp-server-lan minpoll 5 maxpoll 7 iburst

keyfile /etc/chrony/chrony.keys
commandkey 1
driftfile /rw/.chrony/chrony.drift
log tracking measurements statistics
logdir /rw/.chrony/logs
maxupdateskew 100.0
initstepslew 10 vcharge-ntp-server-lan
makestep 100 10
dumpdir /rw/.chrony
allow 10/8
allow 192.168/16
allow 172.16/12
logchange 0.5

(b) Or might something be wrong with the drift estimate, causing
this large divergence during the short time interval?
Did it switch to another source before the "no reachable sources"
message? If there are multiple sources and the worst source is the
last to become unreachable, the clock will be set by the measurement
from the worst source, possibly introducing a large time or frequency error.
Hmm, unfortunately I don't have the log files here right now. But I don't believe that was the case.
As one can see from the chrony.conf, there are additionally two WAN sources (that are never reachable).
Interesting information though.

(c) Are there some files I could look at to trace back what's
happening within these time intervals?
The tracking log would be very helpful here. You can enable logging
in chrony.conf like this:

logdir /var/log/chrony
log measurements statistics tracking
Great, I will collect these files the next time. They should be created.
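
Once the tracking log exists, the entries around a dropout can be pulled out with a few lines of Python. This is a minimal sketch, not a complete parser: it assumes each data line starts with the date, time, source address, stratum, frequency (ppm), skew (ppm) and offset (seconds), as described in the chrony documentation for the tracking log, and it ignores any further columns. The names `TrackingEntry`, `parse_tracking`, `large_offsets` and `report`, as well as the 10 ms default threshold, are made up for illustration.

```python
from datetime import datetime
from typing import Iterable, Iterator, List, NamedTuple


class TrackingEntry(NamedTuple):
    when: datetime
    source: str
    stratum: int
    freq_ppm: float
    skew_ppm: float
    offset_s: float


def parse_tracking(lines: Iterable[str]) -> Iterator[TrackingEntry]:
    """Yield one entry per data line; header and separator lines are skipped."""
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # separator or blank line
        try:
            entry = TrackingEntry(
                datetime.strptime(f"{fields[0]} {fields[1]}", "%Y-%m-%d %H:%M:%S"),
                fields[2],
                int(fields[3]),
                float(fields[4]),
                float(fields[5]),
                float(fields[6]),
            )
        except ValueError:
            continue  # column header line, not a measurement
        yield entry


def large_offsets(entries: Iterable[TrackingEntry],
                  threshold_s: float = 0.01) -> List[TrackingEntry]:
    """Return the entries whose absolute offset reaches the threshold."""
    return [e for e in entries if abs(e.offset_s) >= threshold_s]


def report(path: str, threshold_s: float = 0.01) -> None:
    """Print every tracking entry in the file whose offset reaches the threshold."""
    with open(path) as fh:
        for e in large_offsets(parse_tracking(fh), threshold_s):
            print(f"{e.when} source={e.source} freq={e.freq_ppm} ppm "
                  f"offset={e.offset_s} s")
```

With the logdir from the configuration above, the call would presumably be `report("/rw/.chrony/logs/tracking.log")`. Comparing the printed timestamps with the syslog message should show whether the source or the frequency estimate changed just before the dropout.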
