Re: [chrony-dev] 1.24-pre1-- sudden change in rtc slope



On Tue, 22 Dec 2009, Miroslav Lichvar wrote:

On Mon, Dec 21, 2009 at 01:19:24PM -0800, Bill Unruh wrote:
Strange behaviour of the rtc slope with the new version of chrony (1.24-pre1). I have
installed it across a number of machines, and on many of them the
rtc slope has suddenly changed hugely. For example, on one machine the rtc slope
was 7.5 with the old chrony 1.23; it is now 206 PPM. Nothing was done to the system
except change the version of chrony that was running.

Is the slope stable or just much more noisy? I would find it very
strange if the new chrony caused RTC to run suddenly slower or faster.

I will keep investigating to make sure it is not operator error.



Probably the only related change between 1.23 and 1.24-pre1 is
commit 9c9530; you can try reverting it to see if the slope changes back.

On the machines where I have enabled rtc tracking I saw no such
changes in the slope. The only problem I'm seeing is the HPET
emulation.

A machine without HPET running 1.24-pre1:

  10.340139 1       10.340180        16.346  16   7  240
  10.344105 1       10.344135        16.346  17   7  240
  10.347974 1       10.347976        16.331  17   7  240
  10.351923 1       10.351929        16.331  17   7  240
  10.355886 1       10.355886        16.331  18   8  240
  10.359840 1       10.359840        16.332  19   8  240
  10.363786 1       10.363790        16.331  20   9  240

A machine with HPET and synced to PPS refclock:

  -2.089519 1       -2.087215         3.294  12   6  120
  -2.091335 1       -2.089193         2.106  13   6  120
  -2.092934 1       -2.089439         0.622  13   5  120
  -2.094780 1       -2.094780       -14.679   8   4  120
  -2.097859 1       -2.095693       -14.683   9   5   60
  -2.099156 1       -2.096917       -15.169  10   5   60
  -2.105748 1       -2.101276       -18.610  11   5  120
  -2.092038 1       -2.100617       -15.167  12   6  120
  -2.093754 1       -2.101713       -14.246  13   6  120
  -2.095239 1       -2.095196        -0.680   9   4  120
  -2.098260 1       -2.097954        -4.709   9   4   60
  -2.099887 1       -2.095276         3.832   9   4   60

I think this means it's not just noise added by the RTC emulation,
but something with a pattern that the linear regression fails to
smooth out. Ideally the code would be able to detect it and either use
more points in the regression or use longer intervals between
measurements.
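
To put a number on the difference between the two logs quoted above, here is a rough Python sketch that compares the spread of the slope values. It assumes the fourth column of each log line is the estimated RTC slope in ppm (the column is not labelled in this message); the values are copied from the listings, and `spread` is just a helper name made up for this illustration.

```python
# Slope column (assumed to be in ppm) copied from the two logs quoted above.
no_hpet = [16.346, 16.346, 16.331, 16.331, 16.331, 16.332, 16.331]
hpet_pps = [3.294, 2.106, 0.622, -14.679, -14.683, -15.169,
            -18.610, -15.167, -14.246, -0.680, -4.709, 3.832]


def spread(values):
    """Return the peak-to-peak range (max minus min) of a list of numbers."""
    return max(values) - min(values)


print(f"no HPET:    {spread(no_hpet):.3f} ppm")
print(f"HPET + PPS: {spread(hpet_pps):.3f} ppm")
```

Under that assumption the slope estimate wanders over about 0.015 ppm on the machine without HPET and about 22.4 ppm on the machine with HPET.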

Is that with the usual 4 min between refclock readings? You can see it in the
"fast" readings. Curnow decreases the number of points used in the regression
(not increases) if the clock is found to have biases (i.e. the number of zero
transitions is less than one would expect statistically), on the assumption
that the clock's rate has changed. That is, he tries to follow the rate
fluctuations, not smooth them out. Whether that is the best idea for the rtc
is not clear. Actually the whole rtc philosophy is suspect. He tries to make
an accurate measurement of the rate of the rtc, but of course with the machine
switched off, the rtc is much colder than when it is on, and thus runs at a very
different rate. I suspect that the rate following of the rtc in chrony is
useful for rate corrections between power-ons of the computer to no better than
10 PPM (roughly 0.86 s per day).
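
The "zero transitions" check described above can be sketched as a runs test on the residuals of a straight-line fit. This is only an illustration in Python, not chrony's code: it uses the textbook Wald-Wolfowitz normal approximation, and the names `fit_line`, `count_runs`, `too_few_runs` and the threshold `z_limit` are all made up for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def count_runs(signs):
    """Count runs of consecutive equal values (one more than the sign changes)."""
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def too_few_runs(xs, ys, z_limit=2.0):
    """True if the fit residuals change sign much less often than chance predicts."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    signs = [r > 0 for r in residuals if r != 0]  # ignore exact zeros
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        return False  # nothing to test
    # Mean and variance of the run count for randomly ordered signs.
    mean = 2 * n_pos * n_neg / n + 1
    var = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n * n * (n - 1))
    if var <= 0:
        return False
    z = (count_runs(signs) - mean) / math.sqrt(var)
    return z < -z_limit


xs = list(range(12))
curved = [x * x for x in xs]                             # rate changes steadily
noisy = [x + (0.1 if x % 2 else -0.1) for x in xs]       # constant rate, jitter

print(too_few_runs(xs, curved))  # True: residuals form only 3 runs
print(too_few_runs(xs, noisy))   # False: residuals alternate in sign
```

When the test fires, the residuals are systematically on one side of the line for long stretches, which is what a rate change (or any slow pattern) looks like. Dropping the oldest points is then one response, as described above; using more points or longer intervals would be the opposite response to the same signal.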



Ideas?



--
William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |  www.theory.physics.ubc.ca/
