Re: [chrony-users] Isolated time domains

On Tue, 3 Dec 2013, Tomalak Geret'kal wrote:

On 03/12/2013 01:16, Bill Unruh wrote:
 On Tue, 3 Dec 2013, Tomalak Geret'kal wrote:

>  On 03/12/2013 00:58, Bill Unruh wrote:
> > > In my test setup I have the master server's clock set back about 3 hours
> > > and Chrony appears to be working great and is moving time towards what
> > > is reported by the external NTP servers:
> >
> > Why? Is this a ridiculous test? Why would you expect the master server to
> > be 3 hours out?
>  Bill,
>
> I typically find testing the potentially unexpected to be a fairly
> worthwhile use of my time, certainly in critical systems.

 So, if for some reason your master were suddenly out by 3 hours, you would
 want all of the clients to also be out by 3 hours, and to play catch-up with
 the master (the clients would all lag, depending on their poll intervals and
 times, because the master is racing along with a huge (>>500 PPM) drift rate
 trying to fix its own offset). Your scenario would have the clients far more
 scattered in time than if they were all disciplined to UTC all along.

 What error scenario do you have in mind where the master could suddenly find
 itself out by 3 hours?

You know, it's the unforeseen circumstances I like to prepare for. It's all well and good having some little list of scenarios that you know you can handle, but I prefer to have as robust a solution as I can, so that when the unexpected happens I have the greatest chance of recovery.

YMMV I suppose...

But you also have to have some idea of what you want the system to do in that
case. Do you want the clients to remain as close as possible to the database
time, which I assume is on UTC? Do you want them to try to follow the master
around (with all of the huge jumps and changes in drift rates, and huge time
offsets that would imply)? The latter seems horrible.

The master goes out by 3 hours. Each of the clients would poll the master, and
eventually decide that they were each out by 3 hours. They would start slewing
their clocks to try to catch up. In the meantime the master is slewing its
clock to try to get back to UTC. After the clients had ramped up their slew
rate, and the master had ramped its up in the opposite direction, the two times
would eventually meet. Now the master would be closer to UTC than the clients,
but they are still slewing away from UTC to catch up to where the master was.
But now they decide that the master is elsewhere, and finally stop slewing
like mad away from UTC and start slewing like mad toward UTC. The master by
this time has settled down, but the clients are still way off, and have to try
to get back to UTC. And each client takes its own trajectory, depending on its
poll rate and exactly when it polls.
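
The chase described above can be played out in a toy simulation. This is only
a sketch under loud assumptions: it is not chrony's (or ntpd's) actual control
loop, the slew limit of 0.05 s/s and the poll intervals are made-up
illustrative values, and the name simulate_chase is hypothetical. The master
starts 3 hours behind UTC and slews toward UTC; each client starts on UTC,
measures its offset from the master at every poll, and slews toward the master
at the same limited rate.

```python
def clamp(value, limit):
    """Restrict value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def simulate_chase(initial_master_offset=-10800.0,   # master starts 3 h behind UTC
                   max_slew=0.05,                     # assumed slew limit, seconds per second
                   poll_intervals=(64, 256, 1024),    # assumed client poll intervals, seconds
                   duration=250_000):                 # simulated seconds
    """Toy model: clients follow a master that is itself slewing back to UTC.

    All offsets are in seconds relative to UTC. Returns the worst client
    offset seen, the largest spread between clients, the final master
    offset, and the final client offsets.
    """
    master = initial_master_offset
    clients = [0.0] * len(poll_intervals)   # clients start on UTC
    pending = [0.0] * len(poll_intervals)   # correction each client still wants to apply
    worst = 0.0
    spread = 0.0

    for t in range(duration):
        # The master slews toward UTC (offset 0) at the limited rate.
        master += clamp(-master, max_slew)

        for i, poll in enumerate(poll_intervals):
            if t % poll == 0:
                # At a poll the client learns how far it is from the master.
                pending[i] = master - clients[i]
            # Between polls it keeps slewing toward the last measured target.
            step = clamp(pending[i], max_slew)
            clients[i] += step
            pending[i] -= step

        worst = max(worst, max(abs(c) for c in clients))
        spread = max(spread, max(clients) - min(clients))

    return worst, spread, master, clients


if __name__ == "__main__":
    worst, spread, master, clients = simulate_chase()
    print(f"worst client offset from UTC: {worst:.1f} s")
    print(f"largest spread between clients: {spread:.1f} s")
    print(f"final master offset: {master:.6f} s")
```

With these assumed numbers the clients, which were on UTC to begin with, should
end up roughly an hour and a half away from it before turning around, and the
clients with longer poll intervals trail the master by more than the others.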

This is in contrast to the master deciding it is 3 hours off and starting to
slew its own clock like mad to get back to UTC. In the meantime it delivers
UTC to the clients, who calmly do not even know anything is wrong and all stay
in sync with each other and with the database while the master straightens
itself out. Which is the more reasonable behaviour? Only you can say.
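
To put rough numbers on how long the master needs to straighten itself out by
slewing alone: the time is just the offset divided by the slew rate. The
sketch below assumes a constant slew rate. 500 ppm is the figure mentioned in
this thread; the other two rates are arbitrary illustrative values, not
defaults of any particular daemon, and the function name is hypothetical.

```python
def slew_duration_seconds(offset_seconds, slew_rate_ppm):
    """Seconds needed to remove offset_seconds at a constant slew rate in ppm."""
    return abs(offset_seconds) / (slew_rate_ppm * 1e-6)


if __name__ == "__main__":
    offset = 3 * 3600  # the 3 hour offset discussed in this thread, in seconds
    for ppm in (500, 5_000, 50_000):
        days = slew_duration_seconds(offset, ppm) / 86_400
        print(f"{ppm:>6} ppm: {days:8.2f} days")
```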

It is one of the reasons why ntpd behaves so horribly (to me): it pegs all
slewing to 500 PPM (which would take about 250 days to get rid of a 3 hour
offset if it were just by slewing) and jumps the time if it is off by more than
128 ms. But that gives huge discontinuities (including jumps backward in


William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |
