RE: [chrony-users] kernel PPS troubleshooting



Hi All,
Sorry for the delayed response.  I have collected 36 hours of data with the following sources:
refclock PPS /dev/pps1 refid PPSi
refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect

Since our application will run disconnected from real NTP servers, I marked the three NTP servers as noselect so that I could track the GPS and PPS against them without actually using them in the selection algorithm.

> Miroslav said:
>If a source disappears for 8 polling intervals, chronyd will select another source even if it's much worse. I agree that could be improved. With NMEA sources it's usually better to use the noselect option or don't configure it at all.
Since we will not have access to a network time source and will be relying on GPSD/NMEA to get us into the correct ballpark at system startup, is there another configuration option we can try so that chronyd does not snap back to GPS so quickly?

The three attached plots are:
4hr_offsets:  Hours 0-4, offsets straight from statistics.log
4hr_offsets_PPSadjusted:  Hours 0-4, adjusted offsets assuming PPS was always 0 and using the most recent PPS value to adjust the actual offset in statistics.log
Syncsource_PPSadjusted:   Hours 2-4, same data as PPSadjusted but with background highlighted according to active sync source from tracking.log
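
The PPS adjustment described for the second plot could be sketched roughly as follows. This is only an illustration, not the script used for the attached plots: the function name and the (timestamp, source, offset) tuple layout are hypothetical, and it assumes the offsets have already been extracted from statistics.log and sorted by time.

```python
def pps_adjust(samples, pps_source="PPSi"):
    """Re-reference every offset to the most recent PPS offset.

    samples: iterable of (timestamp, source, offset), sorted by timestamp.
    Returns a list of (timestamp, source, adjusted_offset); by construction
    the PPS source itself always comes out as 0.
    """
    adjusted = []
    last_pps = None
    for timestamp, source, offset in samples:
        if source == pps_source:
            last_pps = offset
        if last_pps is None:
            # No PPS value seen yet, so there is nothing to adjust against.
            continue
        adjusted.append((timestamp, source, offset - last_pps))
    return adjusted
```

Samples logged before the first PPS entry are dropped here; keeping them unadjusted would be an equally reasonable choice.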

Looking through refclocks.log, it seems that even with both PPS and GPS present and both producing filtered samples, PPS samples are often dropped completely after a GPS filtered entry, and they stay missing until one or more subsequent GPS filtered entries have been logged.
{14 GPSi samples and 14 PPSi samples}
2013-11-27 23:08:38.999883 PPSi   15 N 1  2.455370e-04  1.161940e-04  2.265e-04
2013-11-27 23:08:36.999489 PPSi    - N -       -        5.107210e-04  1.854e-04
2013-11-27 23:08:39.600949 GPSi   15 N 0 -6.007421e-01 -7.094921e-02  2.206e-02
2013-11-27 23:08:33.249250 GPSi    - N -       -       -1.925024e-02  6.892e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:08:55.532367 GPSi   15 N 0 -5.323654e-01 -2.367523e-03  2.179e-02
2013-11-27 23:08:46.365687 GPSi    - N -       -       -3.568759e-02  7.070e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:09:43.590657 GPSi   15 N 0 -5.906571e-01 -6.065711e-02  2.146e-02
2013-11-27 23:09:37.901101 GPSi    - N -       -       -7.110153e-02  6.716e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:10:00.489102 GPSi   15 N 0 -4.891029e-01  4.089708e-02  2.124e-02
2013-11-27 23:09:52.357123 GPSi    - N -       -       -2.712306e-02  6.472e-03
2013-11-27 23:10:00.000461 PPSi    0 N 1 -5.952060e-04 -4.616970e-04  1.896e-04
2013-11-27 23:10:01.561675 GPSi    0 N 0 -5.618044e-01 -3.167506e-02  2.047e-02
{14 GPSi samples, 14 PPSi samples for 3 more rounds, before dropping PPS samples again}
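
One way to find these gaps automatically is sketched below. It is a minimal Python sketch under assumptions taken from the excerpt above: each refclocks.log line has nine whitespace-separated fields, the third field is the refid, and a "-" in the fourth field marks a filtered entry. The function name is hypothetical.

```python
def polls_missing_pps(lines, pps="PPSi", gps="GPSi"):
    """Return timestamps of GPS filtered entries that were logged without
    any raw PPS sample since the previous GPS filtered entry."""
    missing = []
    pps_raw = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 9 or fields[2] not in (pps, gps):
            continue  # header, comment, or some other refclock
        refid, sample_no = fields[2], fields[3]
        if refid == pps and sample_no != "-":
            pps_raw += 1  # raw PPS sample
        elif refid == gps and sample_no == "-":
            # A GPS filtered entry closes one polling round.
            if pps_raw == 0:
                missing.append(f"{fields[0]} {fields[1]}")
            pps_raw = 0
    return missing
```

It relies on the order of the lines in the file rather than on the timestamps, because a filtered entry carries an earlier timestamp than the last raw sample that fed it.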

I also have the full console output with debugging enabled and am trying to figure out how best to parse and analyze it.  One thing I noticed in comparison with my previous run is that all of the ignored PPS samples are now coming from line 465 in refclock.c:
refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored second=0.999999657 sync=0 dist=1.500000000
and not line 440 like they were on the previous run:
refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546
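
As a starting point for parsing the debug output, something like the following could tally the ignored pulses by source line. It is a rough Python sketch that assumes every message of interest has exactly the shape of the two lines above; the pattern and function name are hypothetical and would need adjusting for other debug messages.

```python
import re
from collections import Counter

# Matches e.g. "refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored k=v ..."
IGNORED = re.compile(
    r"^(?P<file>[\w.]+):(?P<line>\d+):\((?P<func>\w+)\)"
    r"\[(?P<stamp>[^\]]+)\] refclock pulse ignored (?P<fields>.*)$"
)

def summarize_ignored_pulses(lines):
    """Count ignored pulses per (file, line) and collect their key=value fields."""
    counts = Counter()
    records = []
    for raw in lines:
        m = IGNORED.match(raw.strip())
        if not m:
            continue  # not an "ignored pulse" message
        values = {
            key: float(val)
            for key, val in (p.split("=", 1) for p in m["fields"].split() if "=" in p)
        }
        counts[(m["file"], int(m["line"]))] += 1
        records.append((m["stamp"], int(m["line"]), values))
    return counts, records
```

The counts show at a glance which check rejected the pulses, and the records keep the numeric fields (second, sync, dist, offdiff, and so on) for plotting against time.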

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:unruh@xxxxxxxxxxxxxx] 
Sent: Friday, November 29, 2013 11:48 AM
To: chrony-users@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:

> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
>> On Fri, 29 Nov 2013, Bill Unruh wrote:
>> By the way, does the kernel PPS do median filtering before passing on 
>> the times to chrony? (Ie, taking the median of say the past 16 inputs 
>> and throwing away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the 
> PPS driver fetches the latest PPS sample from the kernel once per 
> second and the refclock poll (16 seconds by default) runs the median 
> filter.

Ah. OK.

>
>> Anyway, it should not be switching sources unless the deviation of 
>> the selected source exceeds the variance of the alternative (or 
>> unless the source has disappeared for a suitable number of poll 
>> intervals, probably related to how long one would expect to wait for 
>> the drift rate variance to make the system clock deviate by more than 
>> the second source's variance. Ie, you are far better off letting a 
>> clock drift unconstrained for a while than to jump to source which has a huge (factors of a 1000) worse variance.
>
> The selection algorithm prefers sources with shortest distance (with 
> refclock that's the measured dispersion + configured delay). If there 
> are more sources with similar distance they will be combined together.
>
> If a source disappears for 8 polling intervals, chronyd will select 
> another source even if it's much worse. I agree that could be 
> improved. With NMEA sources it's usually better to use the noselect 
> option or don't configure it at all.

From the source code it looks as if it grabs a new source as soon as the current source disappears, but I did not look at the code very carefully.

If one has only NMEA and PPS, one needs the NMEA at least at startup to get the time to within half a second or so, but thereafter it probably should not be used unless the PPS disappears for quite a while (in which case the NMEA is liable to be not very good either).

Certainly it would be good to find out what was happening with his clock hopping.




Attachment: 4hr_offsets.png
Description: 4hr_offsets.png

Attachment: 4hr_offsets_PPSadjusted.png
Description: 4hr_offsets_PPSadjusted.png

Attachment: syncsource_PPSadjusted.png
Description: syncsource_PPSadjusted.png


