Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages





William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/

On Thu, 9 Nov 2017, James Feeney wrote:

On 11/09/2017 05:25 AM, Miroslav Lichvar wrote:
So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what we expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.

Well, my intuition and naive assumption would be that any allusion to a system clock offset was describing an "absolute offset", relative to the NTP server clock, and NOT relative to some chronyd internal process, which, in this case, would be the slewing process.  And I also naively would expect that other people would make that same assumption.

That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second, or by that second plus the three
seconds that it is already correcting?
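To make the two candidate numbers concrete, here is a minimal Python sketch. It is only an illustration, not chrony's code: the function name, the parameter names, and the convention that offsets are expressed in seconds relative to the NTP server are all assumptions made for this example.

```python
def candidate_reports(remaining_slew, measured_offset):
    """Return the two values a log message could report.

    remaining_slew  -- correction (s) the running slew still has to apply
                       (hypothetical name, not a chrony variable)
    measured_offset -- offset (s) relative to the server according to the
                       new measurement (hypothetical name)
    """
    # How much the slew that is already running has to be changed.
    change_to_running_slew = measured_offset - remaining_slew
    # How far the clock is from the server in total.
    total_remaining = measured_offset
    return change_to_running_slew, total_remaining


# Three seconds were still being corrected; the new measurement says four.
change, total = candidate_reports(remaining_slew=3.0, measured_offset=4.0)
print(f"adjustment to the running slew: {change:+.1f} s")
print(f"total remaining offset:         {total:+.1f} s")
```

With these inputs the first value is the "one second" and the second is the "one second plus three" in the question above.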


Any log message that was referring to some offset of the system clock relative to some clock *other* than the NTP server clock, in my mind, should require some very very clear and emphasized language.  Even then, how would that be worded?  I am still not clear about this offset that is being reported!  A log message would seem to require some wording as "involved" as the expression of your example: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the  expected 3 seconds."  I don't even know how often "another measurement is made", so I cannot put that offset error in a precise context.

Precisely, so why would you report that?



And, why would anyone care about this "differential slew error"?  Is there some attempt being made to illustrate a possible hardware fault, for instance, where the system clock is drifting faster than the chrony slewing algorithm is able to track?  And then, if that were the case, I would expect a different log message, such as: "Unable to correct system clock - excessive drift."

So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping.  For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
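The proposed wording can be sketched as a small formatter. This is a hypothetical illustration of the suggestion only; it does not reproduce chrony's actual log messages or code, and the sign convention (a positive offset means the system clock is ahead of the server) is an assumption made for the example.

```python
def describe_offset(offset_s):
    """Initial message; positive offset_s is assumed to mean 'leading'."""
    direction = "leading" if offset_s > 0 else "lagging"
    return (f"The system clock is {direction} the NTP server clock "
            f"by {abs(offset_s):.3f} seconds")


def describe_correction(offset_s, stepped):
    """Follow-up message, depending on whether the clock is stepped or slewed."""
    # A clock that is ahead has to be moved backward, and vice versa.
    direction = "backward" if offset_s > 0 else "forward"
    if stepped:
        return (f"Stepping the system clock {direction} "
                f"by {abs(offset_s):.3f} seconds.")
    return f"Slewing the system clock {direction}, correction started."


print(describe_offset(-4.0))
print(describe_correction(-4.0, stepped=False))
```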

The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbose and totally confusing.
One could put that into the documentation.



What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.

My instinctive answer is "always the absolute offset of the system clock relative to the NTP server clock."

Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately." But that is not what chrony does.


But, I don't actually have enough information to understand what is meant by "didn't finish yet".  When, or how often, and in what manner is this "previous slew" measured?  Is there some "current slew" that is distinct from a "previous slew"?  Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock?


Previous meaning "previously begun". Chrony and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate at which the clock is running, until they have corrected the error. It is
more important that the clock deliver a smooth time than that it deliver
the correct time. Consider, for example, backstepping the clock, so that there
are two times at which the system clock shows, say, 1496549 seconds. That kind
of backstepping can cause havoc to the filesystem.
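As a rough sketch of what correcting by rate means: if the clock is run fast or slow by a fixed fraction, the time needed to remove an offset is the offset divided by that fraction. The function below and the 1000 ppm rate are assumptions chosen for round numbers; they are not chrony's algorithm or its configured limits.

```python
def slew_duration(offset_s, slew_rate_ppm):
    """Seconds needed to remove offset_s when the clock is run fast or
    slow by slew_rate_ppm parts per million (hypothetical helper)."""
    if slew_rate_ppm <= 0:
        raise ValueError("slew rate must be positive")
    return abs(offset_s) * 1e6 / slew_rate_ppm


# Removing a 3 second error at an assumed 1000 ppm rate adjustment.
duration = slew_duration(3.0, 1000.0)
print(f"{duration:.0f} s")
```

During that whole interval the clock keeps moving forward, only slightly faster or slower than normal, so no timestamp is ever repeated.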


My intuitive assumption would be that the slewing process is simply initiated when chronyd is started, and that the slewing process changes its rate dynamically, however much is required to "catch up" the system clock to the NTP server clock, and that a single slewing process will continue until synchronization is achieved.  Does "slewing" work like that?  Or some other way?

chrony keeps measuring its deviation from the ntp time, and slews the clock,
at various rates, to correct for the difference. This takes place all the time.


I have no idea how often chronyd actually "tweaks" the system clock, whether chronyd is in a "slewing" state, or even in a "synchronized" state.  Is there some hard-coded maximum allowed offset before the system clock is "tweaked"?  Or is there a periodic "tweak" rate?  And is this "tweak" interval dynamically adjusted?  Is the system clock "frequency" automatically "scaled" by the kernel?  I know that those are kind of basic concepts, but I really don't know what chronyd is doing at that low level.

It tweaks it every time it makes a measurement. And no to all your questions.
You really need to read about the theory of ntpd clock adjustment, which is
the same to lowest order as that of chrony.



I'm appreciating your patience with this explanation...

--
To unsubscribe email chrony-dev-request@xxxxxxxxxxxxxxxxxxxx with "unsubscribe" in the subject.
For help email chrony-dev-request@xxxxxxxxxxxxxxxxxxxx with "help" in the subject.
Trouble?  Email listmaster@xxxxxxxxxxxxxxxxxxxx.



