[chrony-users] Two Issues


Hi Folks --

New subscriber here.  This is my first post to the list.

I've been using chrony for more years than I can remember, and for most of that time it has worked flawlessly. However, after shuffling some hardware about over the past few months, I've come up against two problems. One of them may not be a chrony problem per se, but chrony is at least where I see the effects.

There are several systems on my SOHO LAN; but two of them in particular are of interest here:

The first is a simple fileserver, running Debian Stable (v7.8 / i386), with all current updates. The installed version of chrony is 1.24-3.1+deb7u2. Basic hardware is an Asus A7N8X-E Deluxe, with an AMD Athlon XP processor and 1 GB of RAM.

The second system is a desktop workstation, running Kubuntu 14.04.2 LTS (AMD-64), with all current updates. The installed version of chrony is 1.29-1. Basic hardware is a Dell OptiPlex 960 (Intel Core2 Duo w/ 4GB RAM).

The same problems seem to afflict both systems; but I am primarily focused on the server, as that is where the more problematic effects are seen.


Problem #1:

Upon re-booting either system, chrony remains "offline", with no hosts listed in response to a "chronyc sources -v" command. This is despite the fact that both systems are on a full-time (DSL) internet connection. I strongly suspect that this is due to some sort of race condition, where the chronyd daemon is being started before the host has fully established a working network connection (which is done via DHCP to the LAN's router/firewall). A semi-reliable -- but *VERY* temporary -- "fix" for this is to open a console, log into the system as root (via an SSH connection in the case of the server), then manually issue the following command:

    invoke-rc.d chrony restart

At which point, chrony starts running more-or-less normally.

The trouble with this (besides the obvious inconvenience) is that I'm not always around to issue that command immediately after a re-boot (such as after a power-failure-initiated automatic shutdown/restart sequence; note that I also run "apcupsd" on both systems), and sometimes I simply forget to do so (which also happens with depressing regularity <~>). In either case, the system clock can become WILDLY "off", and recovery can take halfway to forever if chrony does NOT automatically decide to do a "burst" process (a decision which seems to depend on the flip of a coin). And no, I cannot manually force the "burst" process, for reasons explained below under "Problem #2".

Just such a situation occurred recently, possibly exacerbated by the fact that we just went through the Daylight Saving Time transition. The end result is that the system clock on the server is "out" by something like 8-1/2 hours. Yes, based on the "System Time" reports from the "chronyc tracking" command, it is S-L-O-W-L-Y correcting itself; but at the current snail's pace, it will take something like a week to fully catch up. (BTW... This leads to a "side-issue" question, which I'll get to below.)

So...  If someone can give me a pointer on how to either:

A. - Delay the point in the start-up process where chronyd is invoked, until after the network connection is fully sorted out, OR...

B. - Modify the chronyd start-up process so that it initially starts "offline", but then after a short delay (say, one minute) re-initiates itself in "online" mode (and hopefully finds all the defined NTP sources, etc.).

That would be VERY helpful.

Solution "A" seems simplest, at least at first blush; but something I read in the list archives (while unsuccessfully searching for mentions of problems similar to mine) suggested that there are some arcane (but important) reasons to start chronyd as early in the boot process as possible. So... I dunno.
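
A rough sketch of option B, offered as one possible approach rather than a known-good fix: poll until the network answers, then issue the same restart command used above. The helper name wait_until, the 60-second limit, and the router address 192.168.1.1 are all hypothetical placeholders, and where such a script would be hooked into the boot sequence is left open.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds (returns 0),
# giving up with status 1 once roughly TIMEOUT seconds have been spent waiting.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Hypothetical usage (192.168.1.1 stands in for the LAN router's address):
#   wait_until 60 2 ping -c 1 192.168.1.1 && invoke-rc.d chrony restart
```

The restart only fires once the probe command has succeeded, so nothing happens if the network never comes up within the time limit.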


Problem #2:

In an effort to speed up the "recovery" process from such gross errors, I've tried to manually issue certain chronyc commands (such as "offline", "online", "burst"). But each time, I get only "501 Not authorized" in response. I've tried entering the chrony password, as contained in the chrony.keys file, but to no avail. Now part of the problem MAY be that I am not entering the password correctly (i.e., in a manner that chrony recognizes; I'm confident that I *am* getting the password itself right), because the instructions on using the password command are anything but clear (at least to me <~>). When I enter "chronyc password abcxyz" (where "abcxyz" is the actual password; obviously, I've faked it here) on the command line, I *DO* get a "200 OK" response. But subsequent efforts to use commands requiring authentication still fail with a "501 Not authorized" message.

So, I'm more-or-less stumped on that one.
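
One possible explanation, stated as an assumption rather than a confirmed diagnosis: each "chronyc ..." invocation from the shell is a separate session, so a password accepted by one invocation would not carry over to the next. If so, sending the password and the privileged commands through a single chronyc session might behave differently. The sketch below only builds that command list; the helper name chronyc_script is hypothetical, and "abcxyz" is the placeholder password from above.

```shell
#!/bin/sh
# chronyc_script PASSWORD COMMAND...
# Print a command list for one chronyc session: the password line first,
# then each remaining argument on its own line.
chronyc_script() {
    pass=$1
    shift
    printf 'password %s\n' "$pass"
    for cmd in "$@"; do
        printf '%s\n' "$cmd"
    done
}

# Hypothetical usage (needs a running chronyd, so it is left commented out):
#   chronyc_script abcxyz online "burst 2/10" | chronyc
```

Piping the list into chronyc keeps the password and the commands that need it inside the same session.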


Now, for that "side issue"... Is there a way (perhaps via a parameter in "chrony.conf"?) to make the automatic self-correction process happen at a significantly faster rate, even when "burst" or "makestep" are NOT being used? As noted above, the way things are now, my server's System Clock won't be "right" for upwards of a week. The "chronyc tracking" command is showing that the "slewing" rate has only been "fudged" by an extremely small margin ("Frequency: 33.515 ppm fast"). Now, I'm certainly no clock/timekeeping expert; but I nonetheless have a hard time imagining how things could be seriously fouled up if that rate were perhaps ten (or even a hundred) times higher.

I would think that, ideally, the self-correction rate would be (at least loosely) a function of the current error magnitude. For errors of only a few minutes or so, such very slow correction rates are not much of a practical problem, because the clock will still be brought back to "correct" fairly soon. But when there is a several-hour discrepancy, then a bigger hammer is needed. FWIW, and all that.
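
To put rough numbers on this, here is a small sketch of the arithmetic, assuming (as a simplification) that the clock is slewed at a constant rate expressed in ppm. The function names are made up for illustration; the 8-1/2 hour offset and the one-week window are the figures quoted above.

```shell
#!/bin/sh
# slew_seconds OFFSET_SECONDS RATE_PPM
# Whole seconds needed to remove OFFSET_SECONDS at a constant slew of RATE_PPM.
slew_seconds() {
    echo $(( $1 * 1000000 / $2 ))
}

# ppm_needed OFFSET_SECONDS WINDOW_SECONDS
# Constant slew rate (whole ppm, rounded down) that removes OFFSET_SECONDS
# within WINDOW_SECONDS.
ppm_needed() {
    echo $(( $1 * 1000000 / $2 ))
}

# 8-1/2 hours is 30600 seconds; one week is 604800 seconds.
ppm_needed 30600 604800
```

For those figures the result is about 50595 ppm, i.e. the clock would have to run roughly 5% off real time for the whole week.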

Any help on any of these issues will be appreciated.


--

Weary1
