Re: [chrony-users] Two Issues


William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |  www.theory.physics.ubc.ca/

On Wed, 11 Mar 2015, Weary1 wrote:

Hi Folks --

New subscriber here.  This is my first post to the list.

I've been using chrony for more years than I can remember. And for most of that time, it has worked flawlessly. However, after shuffling some hardware about over the past few months, I've come up against two problems. One of them may not be specifically a chrony problem, per se; but at the least, that is where I see the effects.

There are several systems on my SOHO LAN; but two of them in particular are of interest here:

The first is a simple fileserver, running Debian Stable (v7.8 / i386), with all current updates. The installed version of chrony is 1.24-3.1+deb7u2. Basic hardware is an Asus A7N8X-E Deluxe, with an AMD Athlon XP processor and 1 GB of RAM.

The second system is a desktop workstation, running Kubuntu 14.04.2 LTS (AMD-64), with all current updates. The installed version of chrony is 1.29-1. Basic hardware is a Dell OptiPlex 960 (Intel Core2 Duo w/ 4GB RAM).

The same problems seem to afflict both systems; but I am primarily focused on the server, as that is where the more problematic effects are seen.


Problem #1:

Upon re-booting either system, chrony remains "offline", with no hosts listed in response to a "chronyc sources -v" command. This is despite the fact that both systems are on a full-time (DSL) internet connection. I strongly suspect that this is due to some sort of race condition, where the chronyd daemon is being started before the host has fully established a working network connection (which is done via DHCP to the LAN's router/firewall). A semi-reliable -- but *VERY* temporary -- "fix" for this is to open a console, log into the system as root (via an SSH connection in the case of the server), then manually issue the following command:

   invoke-rc.d chrony restart

At which point, chrony starts running more-or-less normally.

You could also put a command into, say, rc.local to do the same thing. I am not
sure of the Debian startup script, and whether you can easily make sure that
chrony starts after the network has come up. Under systemd you can put in a
dependency which says that chrony is supposed to start after the network. But
you probably do not have systemd, and how to create such a dependency without
it I do not know.

Note also that you do not have to restart chrony; you could instead issue
commands via chronyc to tell it to put the sources online. But restarting it
would work.
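
A minimal sketch of such an rc.local helper, assuming the network is "up" once
a default route exists and that the command password is the placeholder
"abcxyz". The function names, the 60-second limit and the CHRONYC override
variable are illustrative and are not part of chrony or of the Debian package.
The password and the "online" command are sent in a single chronyc session on
standard input, since chronyc forgets the password when it exits.

```shell
#!/bin/sh
# Hypothetical helper functions for /etc/rc.local (illustrative names only).

CHRONYC=${CHRONYC:-chronyc}   # can be overridden to test without chronyd

# Retry a command once per second until it succeeds or attempts run out.
# Usage: retry_until ATTEMPTS COMMAND [ARGS...]
retry_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}

# Succeeds once the kernel routing table contains a default route.
have_default_route() {
    ip route 2>/dev/null | grep -q '^default'
}

# Send the password and "online" to chronyc in one session.
# Usage: bring_chrony_online PASSWORD
bring_chrony_online() {
    "$CHRONYC" <<EOF
password $1
online
quit
EOF
}

# Example call (placeholder password):
# retry_until 60 have_default_route && bring_chrony_online abcxyz
```

The last, commented-out line is the part that would actually go into rc.local.
Keeping the password in a script means that script should be readable by root
only.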


The trouble with this (besides the obvious inconvenience) is that if I'm not around to manually issue that command immediately after the re-boot (such as after a power-failure-initiated automatic shutdown/restart sequence; note that I also run "apcupsd" on both systems), or if I simply forget to do so (which also happens with depressing regularity <~>), the system clock can become WILDLY "off", and recovery can take halfway to forever if chrony does NOT automatically decide to do a "burst" process (a decision which seems to depend on the flip of a coin). And no, I cannot manually force the "burst" process, for reasons explained below under "Problem #2".

As I said, put it into a script and call that from, for example, rc.local so
that it is run automatically.


Just such a situation occurred recently, which was possibly further exacerbated by the fact that we just went through the Daylight Savings Time transition. The end result is, the system clock on the server is "out" by something like 8-1/2 hours. Yes, based on the "System Time" reports from the

How in the world did that happen? That cannot be due to any clock drift,
because even at 100 PPM that would take 10 years to accumulate.
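
The arithmetic behind that estimate can be checked with a few lines of shell
and awk. This is only a sketch: it assumes a constant frequency error and a
365.25-day year, and the function name is made up for the example.

```shell
# Years needed for a constant frequency error to accumulate a given offset.
# Usage: years_to_drift OFFSET_HOURS ERROR_PPM
years_to_drift() {
    awk -v hours="$1" -v ppm="$2" 'BEGIN {
        seconds = hours * 3600 * 1e6 / ppm      # elapsed time in seconds
        printf "%.1f\n", seconds / (365.25 * 86400)
    }'
}

years_to_drift 8.5 100    # prints 9.7
```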

"chronyc tracking" command, it is S-L-O-W-L-Y correcting itself; but at the

Look at the initstepslew directive. It tells chrony to step the clock at
startup if it is off by more than a threshold you configure, and to slew it
only if the error is smaller than that.
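
As an illustration only, a chrony.conf fragment using it might look like the
following. The server names and the 30-second threshold are placeholders, not
values taken from the poster's setup.

```
# /etc/chrony/chrony.conf (fragment; server names are placeholders)
server ntp1.example.net
server ntp2.example.net

# At startup, measure the offset against the listed servers. Step the
# clock if it is more than 30 seconds off, otherwise slew it.
initstepslew 30 ntp1.example.net ntp2.example.net
```

Since the check is made when chronyd starts, it can only help if the servers
are reachable at that moment.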


current snail's pace rate, it will take something like a week to fully catch up. (BTW... This leads to a "side-issue" question, which I'll get to below.)

So...  If someone can give me a pointer on how to either:

A. - Delay the point in the start-up process where chronyd is invoked, until after the network connection is fully sorted out, OR...

B. - Modify the chronyd start-up process so that it initially starts "offline", but then after a short delay (say, one minute) re-initiates itself in "online" mode (and hopefully finds all the defined NTP sources, etc.).

That would be VERY helpful.

Solution "A" seems simplest, at least at first blush; but something I read in the list archives (while unsuccessfully searching for mentions of problems similar to mine) suggested that there are some arcane (but important) reasons to start chronyd as early in the boot process as possible. So... I dunno.


Problem #2:

In an effort to speed up the "recovery" process from such gross errors, I've tried to manually issue certain chronyc commands (such as "offline", "online", "burst"). But each time, I get only "501 Not authorized" in response. I've tried entering the chrony password, as contained in the chrony.keys file, but to no avail. Now part of the problem MAY be that I am

You have to make sure that chrony.conf refers to the key number (the
commandkey directive) and that the keys file lists the password under that
same key number.
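
For illustration, the two pieces would line up roughly like this. The key
number 1 and the password "abcxyz" are placeholders, and the paths are the
ones Debian normally uses.

```
# /etc/chrony/chrony.conf (fragment)
keyfile /etc/chrony/chrony.keys
commandkey 1

# /etc/chrony/chrony.keys
1 abcxyz
```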

not entering the password correctly (i.e., in a manner that chrony recognizes; I'm confident that I *am* getting the password itself right), because the instructions on using the password command are anything but clear (at least to me <~>). When I enter "chronyc password abcxyz" (where "abcxyz" is the actual password; obviously, I've faked it here) on the command line, I *DO* get a "200 OK" response. But subsequent efforts to use commands requiring authentication still fail with a "501 Not authorized" message.

That would just start chronyc, run the password command and exit, and the
password would be forgotten. Enter the password and the other commands in the
same chronyc session:

chronyc
chronyc> password abcxyz
200 OK
chronyc> Other commands
....
chronyc> quit


So, I'm more-or-less stumped on that one.


Now, for that "side issue"... Is there a way (perhaps via a parameter in "chrony.conf"?) to make the automatic self-correction process happen at a significantly faster rate, even when "burst" or "makestep" are NOT being used? As noted above, the way things are now, my server's System Clock won't be "right" for upwards of a week. The "chronyc tracking" command is showing that the "slewing" rate has only been "fudged" by an extremely small margin ("Frequency: 33.515 ppm fast"). Now, I'm certainly no clock/timekeeping expert; but I nonetheless have a hard time imagining how things could be seriously fouled up if that rate were perhaps ten (or even a hundred) times higher.

Uh, chrony does speed up the slewing to very high rates. But if the clock is
off by 8 hours it is silly to try to slew it into the correct time. Use the
initstepslew directive or the makestep command to get it into time quickly.
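
A sketch of issuing makestep non-interactively, assuming the placeholder
password "abcxyz". The function name and the CHRONYC override variable are
made up for the example. The password and the command go to one chronyc
session on standard input, so the authentication is still in effect when
makestep runs.

```shell
#!/bin/sh
CHRONYC=${CHRONYC:-chronyc}   # can be overridden to test without chronyd

# Ask chronyd to step the clock by the currently measured offset.
# Usage: step_clock_now PASSWORD
step_clock_now() {
    "$CHRONYC" <<EOF
password $1
makestep
quit
EOF
}

# Example call (placeholder password):
# step_clock_now abcxyz
```

This is only useful once chronyd has sources online and has made some
measurements. Until then it has no offset to correct.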


I would think that, ideally, the self-correction rate would be (at least loosely) a function of the current error magnitude. For errors of only a few minutes or so, such very slow correction rates are not much of a practical problem, because the clock will still be brought back to "correct" fairly soon. But when there is a several-hour discrepancy, then a bigger hammer is needed. FWIW, and all that.

It is.

Note that ntpd has a max slew rate of 500 PPM, steps if the clock is out by
more than 128 ms, and by default gives up (quits) if it is off by more than
1000 s (about 17 minutes).
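
To put the 500 PPM figure in perspective, here is a rough calculation of how
long a pure slew at that rate would need to remove an 8.5-hour error. The
function name is invented for the example, and a constant slew rate is
assumed.

```shell
# Days needed to remove an offset by slewing at a constant rate.
# Usage: slew_days OFFSET_HOURS SLEW_PPM
slew_days() {
    awk -v hours="$1" -v ppm="$2" 'BEGIN {
        printf "%.0f\n", hours * 3600 * 1e6 / ppm / 86400
    }'
}

slew_days 8.5 500    # prints 708
```

That is nearly two years, which is why a step is the sensible way to fix an
error of that size.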



Any help on any of these issues will be appreciated.


--

Weary1

--
To unsubscribe email chrony-users-request@xxxxxxxxxxxxxxxxxxxx with "unsubscribe" in the subject. For help email chrony-users-request@xxxxxxxxxxxxxxxxxxxx with "help" in the subject.
Trouble?  Email listmaster@xxxxxxxxxxxxxxxxxxxx.

