[chrony-users] interpolation of offsets in chrony - a robust approach



While I was reading the docs I came across these parameters:

maxsamples [samples]

    The maxsamples directive sets the default maximum number of samples that chronyd should keep for each source. This setting can be overridden for individual sources in the server and refclock directives. The default value is 0, which disables the configurable limit. The useful range is 4 to 64.

    As a special case, setting maxsamples to 1 disables frequency tracking in order to make the sources immediately selectable with only one sample. This can be useful when chronyd is started with the -q or -Q option.

 

minsamples [samples]

    The minsamples directive sets the default minimum number of samples that chronyd should keep for each source. This setting can be overridden for individual sources in the server and refclock directives. The default value is 6. The useful range is 4 to 64.

    Forcing chronyd to keep more samples than it would normally keep reduces noise in the estimated frequency and offset, but slows down the response to changes in the frequency and offset of the clock. The offsets in the tracking and sourcestats reports (and the tracking.log and statistics.log files) may be smaller than the actual offsets.

 

Maybe I am way off here, but these descriptions suggest that chrony fits the retained samples with a linear (or some other) model and then uses the fitted values rather than the raw offsets. Is that correct?

 

The offset data is obviously noisy. In addition, on my own machines I have observed occasional outliers on the order of 10x larger than the usual offsets.

 

An ordinary least-squares linear regression is not the best way to process this kind of data, because a single large outlier can pull the fitted line away from the bulk of the samples. A robust analysis method is a better fit. There is a simple and effective one for obtaining the "best fit" slope of a dataset, called the Theil-Sen estimator. There is a great Wikipedia entry for it if you are not familiar with the technique (not sure if links are allowed so I did not include it). In a nutshell, the slope is computed for every pair of points in the dataset, and the median of those slopes is taken as the estimate. It is straightforward to use this to obtain a good estimate of the true offset for any time within the time interval of the dataset, and to make a prediction into the future. Because it tolerates outliers and fits noisy data well, it seems like it would be a perfect candidate for a more robust offset estimator in chrony.
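To make the idea concrete, here is a minimal sketch in Python. It is not chrony code: the function names `theil_sen` and `predict` are made up for illustration, and taking the intercept as the median of the residuals `offset - slope * time` is one common choice that I am assuming here, not the only one.

```python
from itertools import combinations
from statistics import median


def theil_sen(times, offsets):
    """Return (slope, intercept) of a robust line through the samples.

    Hypothetical helper for illustration; not part of chrony.
    """
    points = list(zip(times, offsets))
    # Slope of every pair of samples with distinct timestamps.
    slopes = [
        (y2 - y1) / (t2 - t1)
        for (t1, y1), (t2, y2) in combinations(points, 2)
        if t2 != t1
    ]
    if not slopes:
        raise ValueError("need at least two samples with distinct times")
    slope = median(slopes)
    # Assumed intercept rule: median of the residuals y - slope * t.
    intercept = median(y - slope * t for t, y in points)
    return slope, intercept


def predict(slope, intercept, t):
    """Estimate the offset at time t, inside or beyond the sample window."""
    return intercept + slope * t


# Samples on the line offset = 2*t + 1, except for one large outlier at t = 3.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
offsets = [1.0, 3.0, 5.0, 50.0, 9.0, 11.0]
slope, intercept = theil_sen(times, offsets)
future = predict(slope, intercept, 10.0)
```

In this made-up dataset, 10 of the 15 pairwise slopes do not involve the outlier, so the median slope is unaffected by it.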

 

Normally this is regarded as an O(N^2) problem, because the slope must be calculated for all pairs in the dataset. But to implement this in chrony, it seems to me that you only need to compute about N new slopes as each new offset is obtained. The previously computed pairwise slopes do not change, so only the slopes between the single new offset and the existing, retained samples need to be calculated. So the overhead would not be large, especially since the number of retained samples is small (the useful range quoted above tops out at 64).

 

Would it be worth looking into implementing this estimation method in chrony for predicting the current and future offsets?

 

 

-Charlie


