Re: [chrony-users] Silent Failure -- Enhancement Request

> (..just in case anybody else using Prometheus is reading this :)

Not related to the original request, but related to Prometheus monitoring of time synchronization in general:
We took a low-level approach that catches a very broad range of problems: we monitor node_timex_frequency_adjustment_ratio from the standard Prometheus node exporter. If this value does not change for a certain period (90 minutes in our case), nothing is governing the kernel clock. This catches everything from chrony being down to servers being unavailable, and it also works out of the box for other time synchronization systems like plain old ntpd.
When this alert goes off, it's usually pretty straightforward to find the root cause.
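
For anyone who wants to try the idea, here is a minimal sketch of the "no change within the window" check in Python. It is only an illustration of the logic, not our actual alert rule: the function name, the sample format (timestamp, value) and the choice to stay silent when there is no data at all are assumptions made for this example. In Prometheus itself, something along the lines of changes(node_timex_frequency_adjustment_ratio[90m]) == 0 would probably express the same condition, but treat that expression as an untested suggestion too.

```python
from typing import Sequence, Tuple

# One scraped sample: (unix timestamp in seconds, metric value).
# This layout is an assumption for the sketch, not a Prometheus API type.
Sample = Tuple[float, float]


def clock_is_ungoverned(
    samples: Sequence[Sample],
    now: float,
    window_s: float = 90 * 60,  # 90 minutes, as in the post
) -> bool:
    """Return True if the metric did not change during the last window_s seconds."""
    # Keep only the values scraped inside the look-back window.
    recent = [value for ts, value in samples if now - window_s <= ts <= now]
    if not recent:
        # No samples at all is a different problem (e.g. the exporter is down),
        # so this check stays silent and leaves that case to another alert.
        return False
    # If every value equals the first one, nothing adjusted the frequency.
    return all(value == recent[0] for value in recent)
```

Called with the samples of node_timex_frequency_adjustment_ratio for one host, the function returns True when the value has been flat for the whole window, which is the condition we alert on.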

Regards,
  Carsten