Re: [chrony-dev] Running chronyd without syncing system clock
On 23/02/2012 08:24, Leo Baltus wrote:
> On 22/02/2012 at 23:07:51 +0000, Ed W wrote:
>>> In our setup we do not like to pin a service to a specific piece of
>>> hardware. If, for some reason, a service should run elsewhere we just
>>> stop it and start it elsewhere. bind() makes it invisible to the outside
>>> and the firewalls do not need to know about it either. This is what
>>> we do for all our services, except ... ntp
>> I do something similar, but it later occurred to me that it serves
>> no useful purpose to put two ntp servers on a machine with a single
>> clock?
> This is exactly why I want to separate the system clock sync from the
> network service, so that each instance serves a specific purpose.
Hmm, one of us has got the wrong idea I think? Miroslav - is it me?
My thought process (please knock it down) is:
- We can't know what the "correct" time is; all we have is a bunch of
measurements from a variety of sources, each assumed to carry some
random error
- Based on some heuristics we pick one of these inaccurate sources to
sync against, being fully aware that we can't measure the source
exactly, only give or take some error term (which we hope will average
out over time)
- Because the source isn't a constant high resolution tick we need some
local high res clock to use for all normal clock requirements. This
clock is also inaccurate, so we have the combined problem of measuring
the inaccuracy of our local clock against the source clock.
- With a local high res clock and an occasional glimpse at an upstream
clock assumed to be accurate, we can combine the two and estimate the
"correct current time" by offsetting the local high res clock using a
bunch of maths (a rough sketch follows this list)
- Note that the local clock doesn't normally match the upstream clock;
we initially skew it close over some time period and then continually
monitor it, computing some error term based on its observed deviation
from the source
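Very roughly, and only to illustrate what I mean by "a bunch of maths",
the estimate looks something like:

    estimated_true_time = local_clock_reading
                          + last_measured_offset
                          + estimated_frequency_error * time_since_last_measurement

where the offset and frequency error terms are what the daemon keeps
refining from each new measurement of the source.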
Now you could:
a) Run one process to sync the local clock, and one or more separate NTP
processes to serve that clock to consumers via NTP. All NTP
processes serve the same single, physical, hardware clock.
b) Run multiple processes in virtualised spaces, with virtualised clocks
that drift from the real hardware clock. Each process computes its own
clock error term and serves that via NTP to consumers.
c) Run a single process to sync the local clock AND serve NTP to
consumers. Allow that NTP process to answer requests on multiple IP
addresses, i.e. masquerade as multiple NTP servers.
The problem with a) is that the separate NTP processes have no knowledge
of the sync state with the upstream source clock. All they can do is
serve the physical hardware clock (which is of course being skewed
periodically by some separate NTP process). I don't know how well that
works in practice, but certainly it seems redundant to have multiple NTP
processes serving *a single clock*; the additional processes have no new
knowledge and hence no obvious advantage over a single NTP process.
The problem with b) is that you are faking several inaccurate clocks from an
upstream "inaccurate" clock... It doesn't seem obvious that a
virtualised clock which is allowed to drift from the real hardware clock
can be anything other than less stable and less accurate than the real
hardware clock? So now we have multiple processes each trying to sync a
clock which is the composite of two sources of jitter/drift
(the real hardware clock + virtualisation inaccuracies). Therefore this
seems less optimal than having the virtual machines use the real
hardware clock (and then only one process is in charge of
conditioning the clock again).
c) seems most optimal. One NTP process per real physical clock. That
single process then has complete knowledge of the modelled inaccuracies
of the hardware clock and the upstream source clocks and can make an
integrated decision on what to supply to downstream clients.
I think you prefer either a) or b), but from what I can see they both
have significant disadvantages in terms of accuracy and seem quite
redundant? Please shoot down the logic of the above! (be gentle...)
>> My solution was to pin NTP instances to hardware and if
>> they go down then they go down (do you care?) - if you do care then
>> why not make the failover system be something which pushes IPs to
>> working instances (so some individual instances might appear to be
>> two servers) rather than instances which know their IPs...?
> In that case I cannot bind them to a specific IP address, which I need
> in order to be transparent to the firewalls inside and outside my network.
If so then that seems to only be a limitation of your current
virtualisation system? I don't see any reason why it can't be done easily?
At the simplest, if your IPs are only used for the NTP service then you
can literally just attach one or more extra IPs to an existing virtual machine
and it will answer on all those IPs. I haven't double checked, but as
far as I know chrony will answer on multiple IPs attached to the machine,
so it should respond on all of them? Plenty of options exist to shuffle
IPs between machines, some with very fast failover.
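A minimal sketch of what I mean (the address, interface and client range
below are just placeholders for your own addressing; as far as I know
chronyd binds the wildcard address by default, so no extra bind
configuration should be needed):

    # attach the "service" IP to whichever machine should answer for it
    ip addr add 192.0.2.53/24 dev eth0

    # chrony.conf on that machine: permit the clients to query it
    allow 192.0.2.0/24

Tools like keepalived (VRRP) can move such an address between machines
automatically if you want the failover to be fast and unattended.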
Another option is some kind of front-end load balancer/NAT. Sounds like
you don't desire that kind of option, but it might be the most
straightforward for those with a firewall in front of the services?
Slightly crude, but you could use iptables DNAT to forward packets aimed
at the downed service's address. I see no reason why this shouldn't work
adequately, but it likely doesn't fit neatly into your virtualisation
system so I suspect it's the least desirable. Basically the idea would be
that if the (one) virtual server is downed on a piece of hardware, then you
bring up some local firewall rule on that physical machine to proxy
incoming connections to some other machine.
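Something along these lines is what I had in mind (completely untested,
and the addresses are placeholders - 192.0.2.2 standing in for the
downed service's IP and 192.0.2.3 for a still-working NTP server):

    # on the physical box that used to host the downed instance:
    # rewrite queries aimed at the dead address to a working server
    iptables -t nat -A PREROUTING -p udp --dport 123 -d 192.0.2.2 \
        -j DNAT --to-destination 192.0.2.3

    # make the forwarded queries appear to come from this box, so the
    # replies route back through it
    iptables -t nat -A POSTROUTING -p udp --dport 123 -d 192.0.2.3 \
        -j MASQUERADE

    # packet forwarding has to be enabled for the box to relay them
    sysctl -w net.ipv4.ip_forward=1

One caveat: because of the masquerading, the proxied server sees all the
queries as coming from the proxy box rather than from the original
clients, so any per-client rate limits or access rules on that server may
need adjusting.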
Consider you have ntp[1-3].example.com and for some reason ntp2 fails.
The options seem to be:
- Only run ntp[1,3] and leave ntp2 not answering. Most consumers will
set multiple ntp sources and so this should be invisible (a minimal
client config is sketched after this list)
- Have ntp1 answer both ntp1 and ntp2 IPs. This helps consumers who
only choose a single source, but will potentially skew consumers who see
ntp[1-3] as their sources since they will appear to see two sources with
strongly correlated performance?
- Leave ntp2's IP dead, but change your DNS to point ntp2 to some other
IP. Multiple issues for high availability, but probably satisfactory
for many situations
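For the first option, the usual client-side chrony.conf with all three
sources would look something like this, so losing ntp2 simply leaves two
reachable sources:

    # client chrony.conf - several sources, any one may disappear
    server ntp1.example.com iburst
    server ntp2.example.com iburst
    server ntp3.example.com iburst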
I guess you need to think about the above first, because presumably the
limitations of your downstream clients define the most appropriate solution?
I'm really struggling to see any benefit in running more than one NTP
process per real, physical clock? If it's imperative that something
answers on a particular IP address then it seems more optimal to have
one of the still running ntp processes take over that IP?
Where have I gone wrong?
Good luck with whatever you pick! Please do share your final solution -
I have the same challenge! (My idea is simply that if an NTP machine
goes down, then it goes down...)
Ed W