Re: [chrony-users] High skew values


To: chrony-users@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [chrony-users] High skew values
From: Bill Unruh <unruh@xxxxxxxxxxxxxx>
Date: Sat, 27 Jul 2013 14:50:42 -0700 (PDT)

On Sat, 27 Jul 2013, Arnon Weinberg wrote:


This is now solved (see below).


Actually, you did not tell us how you solved it. Or was it just yelling at the
provider so that they put you onto a better virtual machine?

On 2013-07-26 18:59, Bill Unruh wrote:
 If you can figure out what the "average" drift is, you could use adjtimex to
 adjust the system clock's rate to take that out.
No, I can't. As you correctly point out below, this is impossible for such a high drift.
adjtimex --tick=13000
adjtimex: Invalid argument
for this kernel:
   USER_HZ = 100 (nominally 100 ticks per second)
   9000 <= tick <= 11000
   -32768000 <= frequency <= 32768000
and indeed the system log does occasionally include:
chronyd[463]: Required tick 13194 outside allowed range (9000 .. 11000)
> What I don't understand is this: chrony logs the following in
> /var/log/messages:
> > chronyd[490]: System clock wrong by 15.124741 seconds, adjustment
> > started
> It does this (saying the clock is wrong by about 5-20s) even when the
> clock is wrong by hours.
 I think that this is the "least squares" offset.
Alright, that's very different from what I thought it was.
 It sends out a packet with a local timestamp. The remote server timestamps
 the packet when it is received and when it is sent out again, and your
 machine timestamps it when it comes back. The measured offset is the
 difference between the means of the local timestamps and the remote
 timestamps. chrony then takes the last N offsets (compensated for changes it
 has made in the drift rate of the clock) and does a least squares fit to find
 the best estimate of the drift error and offset error. It also tests whether
 the deviations from the least squares fit look roughly random. If not, it
 makes N smaller and tries again until N is 3. In your system N seems to hang
 around 3 a lot.
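
The procedure described above can be sketched roughly as follows. This is only an illustration, not chrony's actual code: the function names are made up for this example, the sign convention (remote minus local) is an assumption, and the fit is a plain unweighted least squares line. The compensation for earlier rate changes and the randomness test on the residuals are left out.

```python
def ntp_offset(t1, t2, t3, t4):
    """Measured offset from one packet exchange.

    t1: local time when the packet was sent
    t2: remote time when the packet was received
    t3: remote time when the reply was sent
    t4: local time when the reply came back
    Assumed convention: positive means the remote clock is ahead.
    """
    remote_mean = (t2 + t3) / 2.0
    local_mean = (t1 + t4) / 2.0
    return remote_mean - local_mean


def least_squares_fit(times, offsets):
    """Fit offsets = intercept + slope * time by ordinary least squares.

    The slope estimates the drift (frequency) error, the intercept
    estimates the offset at time zero. Needs at least two distinct times.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_t
    return slope, intercept
```

With only the last 3 samples in the fit, as on this machine, a single bad measurement moves the estimated slope a lot, which is one way to see why an unevenly drifting clock is hard to track.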
Thank you Bill for a *very* clear explanation. I think I finally understand what you meant earlier - this system has 2 problems: Very uneven drift and very high drift. The uneven drift causes the "Can't synchronise: no majority" errors, and the high drift causes the "Required tick outside allowed range" errors. So chrony cannot set an accurate adjustment nor a quick enough adjustment to compensate.
 That is beyond the ability of chrony (or anything) to correct. The max drift
 rate that can be compensated is 6 sec/minute. Jumping the clock is your only
 option.
If I was going to live with this system as-is, then you would be right.
For anyone else reading this: An easy way to diagnose a sick machine is to use something like:
adjtimex -c=10 -i=10
                                     --- current ---   -- suggested --
cmos time     system-cmos  error_ppm   tick      freq    tick      freq
1374906355      -0.660995
1374906367      -3.003384  -234238.9  11000         0
1374906379      -5.041906  -203852.2  11000         0   13038   3421012
1374906390      -6.129622  -108771.6  11000         0   12087   4691487
1374906402      -8.383569  -225394.7  11000         0   13253   6206387
1374906415     -11.768661  -338509.2  11000         0   14385    603062
1374906428     -15.120310  -335164.9  11000         0   14351   4253587
1374906440     -17.387367  -226705.7  11000         0   13267    373175
1374906453     -20.810277  -342291.0  11000         0   14422   5963612
1374906465     -23.090249  -227997.2  11000         0   13279   6370600
If those suggested "tick" values on the right are >11000 (i.e., drift >6s per minute), then the timer is too broken for chrony to fix.
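
The 6s-per-minute figure follows from the tick range the kernel reported above (USER_HZ = 100, so a nominal tick of 10000 microseconds, and an allowed range of 9000 to 11000). A small sketch of that arithmetic, with function and constant names invented for this example:

```python
USER_HZ = 100                         # as reported by adjtimex above
NOMINAL_TICK = 1_000_000 // USER_HZ   # 10000 microseconds per tick
TICK_MIN, TICK_MAX = 9000, 11000      # allowed range reported by the kernel


def tick_to_drift_seconds_per_minute(tick):
    """Rate change, in seconds per minute, that a given tick value represents."""
    return (tick - NOMINAL_TICK) / NOMINAL_TICK * 60.0


def tick_is_correctable(tick):
    """True if the kernel would accept this tick value."""
    return TICK_MIN <= tick <= TICK_MAX
```

The upper limit of 11000 works out to 6 seconds per minute, while the tick of 13194 from the chronyd log message would correspond to roughly 19 seconds per minute, well outside what the kernel accepts.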
So when Bill tells you that your machine is very sick, listen to him. :-)
This is a "hardware" issue (in the case of a virtual machine, something more elaborate) that needs to be fixed - in my case, by the hosting service provider.
Also, for the record, although virtual machines do suffer from much more drift problems than physical machines, there is a difference between an "inaccurate" clock (typical of virtual machines), and a "broken" clock (not so typical). Most virtual machines are not "broken" and chrony works just fine.

