Re: [chrony-dev] chronyd not recovering after time stepped.





William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |  www.theory.physics.ubc.ca/

On Thu, 27 Aug 2015, Bryan Christianson wrote:

I had a problem with my principal NTP server the other day (different issue) but the result was a large jump in the offset on my Mac. This offset started to recover and then remained at around 70ms.

I can reproduce the problem by stepping the local clock while chronyd is running, using a small program that uses settimeofday() to advance the local clock.
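A minimal sketch of such a step program follows. It is not the program used above: it swaps in Python's time.clock_settime() for settimeofday(), and the helper names stepped_time and step_clock are made up for illustration. Setting the clock requires root.

```python
import time


def stepped_time(now_s: float, step_ms: float) -> float:
    """Return the wall-clock value (seconds since the epoch) after a step."""
    return now_s + step_ms / 1000.0


def step_clock(step_ms: float) -> float:
    """Step CLOCK_REALTIME by step_ms milliseconds (hypothetical helper).

    Stands in for a settimeofday()-based program. Needs root privileges;
    without them clock_settime() raises PermissionError.
    Returns the clock value that was requested.
    """
    target = stepped_time(time.time(), step_ms)
    time.clock_settime(time.CLOCK_REALTIME, target)
    return target
```

Calling step_clock(1500) as root while chronyd is running would advance the clock by 1500ms, as in the steps below.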

For example:

I allow chronyd to run until fully in sync with NTP with sub 10usec offset

Step the local clock forward by 1500ms

Many people think that a good test of either chrony or ntpd is to step the
clock. This is not a good test.


chronyd reports an unexpected offset in the message log
	chronyd[57399]: System clock wrong by -0.754267 seconds, adjustment started

chronyd starts correcting the offset and overshoots from -1000ms to around +70ms (as reported in tracking.log)

On close inspection, I can see that the offset is dropping, but very, very slowly, from 70ms to 50ms in an hour, i.e. at about 5ppm. Maybe it will recover in a few hours?
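The arithmetic behind that estimate can be checked with a short sketch. It assumes the offset keeps falling linearly at the observed rate, which chronyd does not guarantee, and the function names are made up for illustration.

```python
def slew_rate_ppm(start_offset_s: float, end_offset_s: float, elapsed_s: float) -> float:
    """Average correction rate in parts per million over the interval."""
    return (start_offset_s - end_offset_s) / elapsed_s * 1e6


def time_to_zero_s(offset_s: float, rate_ppm: float) -> float:
    """Seconds needed to remove offset_s at a constant rate (an assumption)."""
    return offset_s / (rate_ppm * 1e-6)


# Figures from the message: 70ms down to 50ms in one hour.
rate = slew_rate_ppm(0.070, 0.050, 3600.0)
remaining = time_to_zero_s(0.050, rate)
print(f"{rate:.2f} ppm, {remaining / 3600:.1f} h to remove the remaining 50ms")
```

Under that assumption the rate comes out near 5.6ppm, and the remaining 50ms would take roughly two and a half hours to remove.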

An hour? chrony hardly knows things are off after an hour. With a remote host,
it is only asking that host for its time a few times in that hour, and has no
idea what happened to make that remote system behave so horribly. The main
defense against a rogue server is not to make chrony jump rapidly to follow it,
but to have a number of servers so chrony can figure out that that one server
was rogue.



Restarting chronyd clears the offset rapidly.

Is this normal behaviour?

Wait and look at the behaviour over hours, not minutes.

The error in which the local clock suddenly jumps long after startup is not an
error type which any system should be set up to handle well.

But if you published the measurements log for that period, along with the
tracking and statistics logs, one might be able to figure out better what is
going on.



I'm really not sure what's going on here and, if it is a problem, where I should be looking to debug it.

Regards
Bryan

--
Bryan Christianson
--
To unsubscribe email chrony-dev-request@xxxxxxxxxxxxxxxxxxxx with "unsubscribe" in the subject.
For help email chrony-dev-request@xxxxxxxxxxxxxxxxxxxx with "help" in the subject.
Trouble?  Email listmaster@xxxxxxxxxxxxxxxxxxxx.



