Re: [chrony-dev] Bug -- interaction between ntpdate and chronyd at bootup -- will never sync up
- To: chrony-dev@xxxxxxxxxxxxxxxxxxxx
- Subject: Re: [chrony-dev] Bug -- interaction between ntpdate and chronyd at bootup -- will never sync up
- From: ray vantassle <rayvantassle@xxxxxxxxx>
- Date: Tue, 5 Feb 2013 11:50:20 -0600
It's easy to duplicate.
ntpdate starts up this way:
NTPSERVERS="0.debian.pool.ntp.org 1.debian.pool.ntp.org
2.debian.pool.ntp.org 3.debian.pool.ntp.org"
/usr/sbin/ntpdate $NTPSERVERS
It makes 4 queries to each of 16 servers (the four pool names resolve
to 16 addresses in all), for a total of 64 queries.
That takes about 9.5 seconds.
(A bit of overkill -- but that's the config in the ntpdate package.)
At startup, ntpdate is invoked when DHCP comes up or when the network
starts up, via a script in /etc/network/if-up.d/
/etc/init.d/chrony gets invoked about 2-3 seconds after ntpdate
starts, to bring up chronyd, so chronyd starts while ntpdate is still
running.
When all goes well (no overlap) it takes chronyd about 5 seconds from
the time the servers are marked "online" until it says "Selected
source ..."
Once you figure out what's going on -- which is *not* easy, since
timing bugs are a bitch -- it's not too hard to make a workaround.
One workaround is to get the time set very quickly early in startup.
rdate will do that (under a second). Or ntpdate will if you give it
only a few servers and the -p1 option (about 2 seconds). The timing
window is still there, but the overlap doesn't usually happen.
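A rough sketch of the faster ntpdate invocation, not the packaged
config. The single pool name, the DRY_RUN switch and the run helper
are my own assumptions for illustration; -p sets the number of
samples ntpdate takes from each server.

```shell
# Assumed values: one pool name and one sample per server (-p1).
NTPSERVERS="0.debian.pool.ntp.org"
NTPDATE_OPTS="-p1"

# DRY_RUN=1 (the default here) only prints the command, so the line
# can be checked without touching the clock. Hypothetical helper.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unquoted on purpose so options and servers split into separate words.
run /usr/sbin/ntpdate $NTPDATE_OPTS $NTPSERVERS
```

Set DRY_RUN=0 to actually run it.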
My final workaround was to add a check in the putonline function of
/etc/init.d/chrony, after the "sleep 2":
cnt=0
# Wait up to 50 seconds for any running ntpdate or rdate to exit
while [ "$cnt" -lt 50 ] ; do
    pidof ntpdate rdate >/dev/null || break
    sleep 1
    cnt=$(( cnt + 1 ))
done
This will delay the chronyd startup for up to 50 seconds, until
neither ntpdate nor rdate is running.
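The same wait can be pulled out into a small function, roughly like
this. It is only a sketch: the name wait_while and its arguments are
made up here and are not part of the chrony init script.

```shell
# wait_while MAX CMD [ARGS...]
# Runs CMD once per second while it keeps succeeding, for at most MAX
# tries. Returns 0 as soon as CMD fails (e.g. pidof finds nothing),
# or 1 if CMD was still succeeding after MAX tries.
# Note: a missing CMD also counts as "failed", so the wait ends at once.
wait_while() {
    max=$1
    shift
    cnt=0
    while [ "$cnt" -lt "$max" ]; do
        "$@" >/dev/null 2>&1 || return 0
        sleep 1
        cnt=$(( cnt + 1 ))
    done
    return 1
}

# Intended use in putonline, after the "sleep 2":
#   wait_while 50 pidof ntpdate rdate
```

The return value lets the caller log a warning if ntpdate or rdate was
still running when the wait gave up.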
I went with rdate instead of ntpdate because without an RTC it is
important to set the clock as early as possible after bootup. But
that's not a chrony issue.
In chrony.conf, I put "makestep 1000 10".