Re: [chrony-users] Chrony forgets servers (specified by FQDN) when no DNS server


To: chrony-users@xxxxxxxxxxxxxxxxxxxx, Rob Janssen <rob@xxxxxxxxxxxxxxxxx>
Subject: Re: [chrony-users] Chrony forgets servers (specified by FQDN) when no DNS server
From: Stephen Satchell <list@xxxxxxxxxxxx>
Date: Wed, 20 Dec 2017 13:34:22 -0800

On 12/20/2017 11:51 AM, Rob Janssen wrote:

A time server that uses DNS based rules for reference servers should fail gracefully when the DNS does not return an IP address (anymore). So, when it does a lookup only once it should issue an error message about that server, and proceed its startup as if that server was never there in the configuration. When it is resolving DNS names on a regular basis (e.g. once per day), it could keep the server configuration and keep retrying the DNS lookup at that same interval and start using the server when the DNS lookup succeeds.
Not starting the service at all is only an option when all the DNS lookups have failed (i.e. there is no server) and there is no mechanism to re-try the lookups. When there is, it is much better to keep the service running. (after all, a network may not be available at boot time and may become available later)

I find this statement of behavior (treat NOSERV/NXDOMAIN as an excuse to forget a server/peer/pool) a bit astonishing, and very un-Unix-like.


Let's make some assumptions:

1. The daemon software has, in its data structures for server/peer/pool, the FQDN for each server and peer.
2. The daemon software, on NXDOMAIN or no answer, sets the IP address to zeros (0x00000000 for IPv4, and 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 for IPv6).
3. All information about the server/peer/pool entry is in the data structure, such as filter data.
4. The polling loop is able to fork a process to perform DNS lookups. (This may not necessarily be true with Windows.)

So the standard polling loop uses the poll timing specified in the server/peer/pool command for all servers, peers, and pools, initialized or not. If the poll interval has expired for a given server/peer/pool entry, it does this:

a. IP address zero: reset the poll interval to minpoll, and fork a process to do the DNS lookup. The forked process performs the lookup, and on success fills in the IP address and sets the first-time flag so the polling loop will pick it up in the next cycle.
b. IP address non-zero and first-time flag set: do what the server currently does with a new server or peer entry.
c. IP address non-zero and first-time flag not set: do what it does now.
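
The per-entry decision described here can be sketched as a small dispatch function. This is only an illustration, written in Python rather than chrony's C. The names `SourceEntry` and `next_action`, the action strings, and the choice to clear the first-time flag inside the function are all assumptions made for the sketch, not chronyd internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceEntry:
    """Hypothetical per-source record; not chronyd's real structure."""
    fqdn: str
    ip: Optional[str] = None   # None stands in for the all-zeros address
    first_time: bool = False   # set by the resolver after a successful lookup
    minpoll: int = 6           # illustrative value, log2 of seconds
    poll: int = 6


def next_action(entry: SourceEntry) -> str:
    """Decide what the polling loop does when this entry's poll interval expires."""
    if entry.ip is None:
        # Unresolved: fall back to minpoll and ask for a DNS lookup.
        entry.poll = entry.minpoll
        return "resolve"
    if entry.first_time:
        # Freshly resolved: treat it like a newly added source.
        # Clearing the flag here is an assumption of this sketch.
        entry.first_time = False
        return "initialize"
    # Established source: normal polling.
    return "poll"
```

An entry that starts unresolved returns "resolve" until a lookup fills in `ip` and sets `first_time`; it then returns "initialize" once and "poll" from then on.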

Forking a process means that the daemon's polling loop doesn't lock up the daemon on the DNS lookup when there is no DNS available, or when it takes a double-handful of seconds to get NOSERV or NXDOMAIN. (If a process is already forked for an entry, then don't fork it again; wait for the forked process to die.) If/when the forked process gets a successful A or AAAA record, it sets it in the data structure for the entry so that the polling loop will pick it up on the next poll interval expiration.
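
A rough sketch of the "at most one lookup in flight per entry" idea follows. Note the substitution: it uses a Python worker thread pool instead of fork(), which keeps the example short but is not the mechanism proposed above. `LookupManager`, `system_resolve`, and the injected `resolve` callable are hypothetical names for illustration.

```python
import socket
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


def system_resolve(fqdn: str) -> Optional[str]:
    """Return one address for fqdn, or None if the lookup fails."""
    try:
        infos = socket.getaddrinfo(fqdn, 123, type=socket.SOCK_DGRAM)
    except socket.gaierror:
        return None
    return infos[0][4][0] if infos else None


class LookupManager:
    """Runs lookups off the polling thread, one per name at a time."""

    def __init__(self, resolve: Callable[[str], Optional[str]] = system_resolve):
        self._resolve = resolve
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._pending: Dict[str, Future] = {}

    def request(self, fqdn: str) -> bool:
        """Start a lookup unless one is still uncollected; True if started."""
        if fqdn in self._pending:
            return False
        self._pending[fqdn] = self._pool.submit(self._resolve, fqdn)
        return True

    def collect(self) -> Dict[str, Optional[str]]:
        """Return finished lookups without waiting for unfinished ones."""
        finished = [name for name, fut in self._pending.items() if fut.done()]
        results: Dict[str, Optional[str]] = {}
        for name in finished:
            fut = self._pending.pop(name)
            # A resolver that raised is treated the same as "no address".
            results[name] = None if fut.exception() else fut.result()
        return results

    def shutdown(self) -> None:
        """Wait for outstanding lookups and stop the workers."""
        self._pool.shutdown(wait=True)
```

The polling loop would call `request()` for an unresolved entry and `collect()` once per cycle; a slow or dead resolver then only delays that one entry and never stalls the loop.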

Also note that it eliminates special start-up code. The config file parser fills in the data structure for each server/peer with zero IP address, and the polling loop handles the lookup and initialization. This also works with chronyc(1): it causes chronyd(8) to build the new data structure, and the polling loop does the rest. When you use chronyc(1) to remove a server or peer, chronyd(8) just removes the data structure for that entry. Poof.
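
To make the "no special start-up code" point concrete, here is a minimal sketch of a parser that only records names and leaves every address unset. It is a deliberate simplification: `parse_sources` and the dictionary layout are hypothetical, only `#` comments are handled, and real chrony.conf parsing does far more.

```python
from typing import Dict

SOURCE_DIRECTIVES = ("server", "peer", "pool")


def parse_sources(conf_text: str) -> Dict[str, dict]:
    """Collect server/peer/pool lines into a table keyed by name, all unresolved."""
    table: Dict[str, dict] = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (simplified comment handling)
        words = line.split()
        if words[0] in SOURCE_DIRECTIVES and len(words) >= 2:
            # No DNS lookup here: ip=None plays the role of the zero address.
            table[words[1]] = {"kind": words[0], "options": words[2:], "ip": None}
    return table
```

Adding a source at run time is then the same insertion the parser performs, and removing one is a plain `table.pop(name)`.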

And that's how I would remove chrony's current astonishing behavior in the face of DNS not being there at start-up. Like in my power-fail situation, where the edge router with chronyd(8) comes up before the CSU/DSU to the network. Enterprise users might be surprised to learn about this astonishing forgetfulness of chronyd(8) in the face of a temporary failure.


How to handle entries where the NTP server has gone away?

Keep a TTL timer, set by an entry in the configuration file. (A reasonable default would be 24 hours.) When "reach" is not 0x00, reset the TTL timer. When the TTL timer expires, clear the filter variables, set the poll to minpoll, zero the IP address, and reset the TTL timer.
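
The TTL rule can be written down in a few lines. Again this is a sketch under assumptions: `SourceState`, `update_ttl`, and the `samples` list standing in for the filter variables are hypothetical, and the clock value is passed in by the caller.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DEFAULT_TTL = 24 * 3600  # seconds; the 24-hour default suggested above


@dataclass
class SourceState:
    """Hypothetical per-source state; field names are illustrative only."""
    ip: Optional[str]
    minpoll: int = 6
    poll: int = 6
    reach: int = 0                                       # reachability register
    samples: List[float] = field(default_factory=list)   # stand-in for filter data
    ttl: int = DEFAULT_TTL
    ttl_deadline: float = 0.0


def update_ttl(src: SourceState, now: float) -> bool:
    """Apply the TTL rule once; return True if the entry was reset for re-lookup."""
    if src.reach != 0:
        # The source has answered recently: push the deadline out again.
        src.ttl_deadline = now + src.ttl
        return False
    if now < src.ttl_deadline:
        return False  # unreachable, but not yet for a full TTL
    # Unreachable for a full TTL: forget what was learned and start over.
    src.samples.clear()
    src.poll = src.minpoll
    src.ip = None  # zero address, so the polling loop resolves the name again
    src.ttl_deadline = now + src.ttl
    return True
```

Calling `update_ttl` once per polling cycle is enough; an entry that is reset falls back into the unresolved state and gets a fresh DNS lookup.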

The rationale for this method of handling extended tempfail is the same rationale used for SMTP daemons: wait somewhat impatiently for the remote server to come back, and if it doesn't come back in a reasonable time then bounce the mail.

From the standpoint of NTP protocol, a server that is out of service for an extended time may have different properties when it comes back on-line. (Replaced, for example.) So the filter variables would contain bogus data, particularly in a pool situation where you were originally talking to a "close" server, and now switched to a "far" server.

(And, it eliminates the need for a separate "pool" command, which would help some distribution sources (<cough> Red Hat) who use "server" when they mean "pool" in their default configurations.)


If this should be moved to chrony-dev, I can do that.

