[chrony-dev] makestep command sometimes makes chrony stop reading its sources


Hi,
 
First, I have to say I'm quite impressed with the new 1.24-pre1 release of Chrony on Linux! I was using ntpd to synchronise the system clock with an SHM and a PPS reference clock fed by a GPS card. But ntpd converges very slowly, and because the system is restarted often we were looking for a faster-converging NTP server. As Chrony is able to do this in minutes, I'm currently investigating whether it is reliable enough to replace ntpd for production use.
 
So far the results are very good under normal conditions. When a properly synchronised SHM and PPS clock is provided and the RTC was already roughly in sync, the system converges within minutes of a cold boot! Chrony also corrects for temperature changes a lot faster than ntpd does. With ntpd the clock was sometimes off by milliseconds for hours when the temperature increased by 5 degrees, whereas with Chrony it drifted by only 300 ns for half an hour.
 
The tricky part, of course, is handling the situations where the GPS time is not synchronised or where the RTC is way off. So far I've found two issues with Chrony in these edge cases. In this mail I'll describe the first issue. The second one will follow in the next mail, to keep the two discussions properly separated.
 
What I'm testing is how Chrony recovers when the RTC is way off during boot. In the BIOS I shift the RTC by one year. Because Chrony only slews the clock, it would take ages to recover from this situation, so I made a script that checks whether the offset is too large (more than 30 seconds) and, if so, issues a "makestep" command through chronyc to chronyd. This command immediately brings the system clock in sync with the GPS clock. Chrony keeps working fine when I set the RTC one year into the past in the BIOS, but when I advance the RTC by one year, Chrony stops reading its sources after the makestep command and never recovers. I had this running for 15 hours and the sources command told me the LastRx was 15h for all sources. This problem is reproducible.
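
The check described above could look roughly like the following sketch. It is not the script actually in use: it assumes that "chronyc tracking" prints a line of the form "System time : N seconds fast|slow of NTP time", and the helper names (parse_offset_seconds, needs_step, check_and_step) are made up for illustration. Authentication with the command key, which makestep may require depending on the chrony version, is left out.

```python
import re
import subprocess

STEP_THRESHOLD_S = 30.0  # threshold from the mail: step if off by more than 30 s

# Assumed line format of `chronyc tracking`, for example:
# "System time     : 31536000.000000 seconds slow of NTP time"
_SYSTEM_TIME_RE = re.compile(
    r"^System time\s*:\s*([0-9.]+)\s+seconds\s+(fast|slow)\b", re.MULTILINE
)


def parse_offset_seconds(tracking_output):
    """Return the signed offset in seconds (positive = fast), or None if absent."""
    match = _SYSTEM_TIME_RE.search(tracking_output)
    if match is None:
        return None
    magnitude = float(match.group(1))
    return magnitude if match.group(2) == "fast" else -magnitude


def needs_step(offset_seconds, threshold=STEP_THRESHOLD_S):
    """True when the offset is known and strictly larger than the threshold."""
    return offset_seconds is not None and abs(offset_seconds) > threshold


def check_and_step():
    """Query chronyd and issue makestep when the offset is too large.

    Hypothetical glue code: any password/commandkey handling is omitted.
    """
    result = subprocess.run(
        ["chronyc", "tracking"], capture_output=True, text=True, check=True
    )
    if needs_step(parse_offset_seconds(result.stdout)):
        subprocess.run(["chronyc", "makestep"], check=True)
        return True
    return False
```

Keeping the parsing and the threshold decision in separate functions means they can be tested without a running chronyd; only check_and_step touches the daemon.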
 
I think there are some absolute timestamps used in chrony that are not updated when makestep is issued.
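
To illustrate why such a bug would show up only in one direction, here is a toy model. It is purely hypothetical and not taken from the chrony source: ToyScheduler, on_clock_step and wait_after_step are invented names, and the 16 second poll interval is an arbitrary choice. A timer that stores its next deadline as an absolute wall-clock time ends up a year in the future if the clock is stepped backwards, but merely fires at once if the clock is stepped forwards.

```python
YEAR_S = 365 * 24 * 3600  # one year in seconds


class ToyScheduler:
    """Hypothetical poll timer that stores an absolute wall-clock deadline."""

    def __init__(self, now, poll_interval):
        self.deadline = now + poll_interval

    def seconds_until_next_poll(self, now):
        # A deadline already in the past fires immediately.
        return max(0.0, self.deadline - now)

    def on_clock_step(self, step):
        # Possible fix: shift the stored deadline by the same amount
        # the system clock was stepped.
        self.deadline += step


def wait_after_step(boot_error, adjust_deadline, poll_interval=16.0):
    """Seconds until the next poll after stepping away a boot-time clock error.

    boot_error > 0 means the clock started in the future (RTC advanced).
    """
    true_now = 1_000_000.0
    wrong_now = true_now + boot_error
    scheduler = ToyScheduler(wrong_now, poll_interval)
    step = -boot_error  # what a makestep-like correction applies
    if adjust_deadline:
        scheduler.on_clock_step(step)
    return scheduler.seconds_until_next_poll(wrong_now + step)


print(wait_after_step(+YEAR_S, adjust_deadline=False))  # 31536016.0
print(wait_after_step(-YEAR_S, adjust_deadline=False))  # 0.0
print(wait_after_step(+YEAR_S, adjust_deadline=True))   # 16.0
```

In this model, an RTC set one year ahead leaves the next poll about a year away after the step, which would look exactly like sources that are never read again, while an RTC set one year behind recovers on its own.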
 
For now I've worked around this issue by completely restarting chronyd after issuing the makestep command. It is a drastic measure, but at least the system recovers from a wrong RTC.
 
If you need more information or you want me to test patches regarding this issue, don't hesitate to contact me!
 
Kind regards,
 
Tjalling Hattink
 
For reference, my configuration file looks like this:
 
# driftfile
driftfile /mnt/intcfdrive/cache/chrony.drift
 
# dump measurements dir
dumpdir /mnt/intcfdrive/cache/chrony.dump
dumponexit
 
# allow access from any NTP client
allow
 
# set linux hz
linux_hz 100
 
# log data
log tracking
logdir /mnt/intusbdrive/ntp
 
# rtc configuration
rtcfile /mnt/intcfdrive/cache/chrony.rtc
 
# chronyc security
commandkey 1
keyfile /etc/chrony.keys
 
# reference clocks
refclock SHM 0 refid SPMC delay 0.1
refclock PPS /dev/pps1 refid PPSE

