Re: [chrony-dev] PPSAPI: kernel consumer



On Mon, 8 Feb 2010 09:16:48 -0800 (PST)
Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:

> On Mon, 8 Feb 2010, Alexander Gordeev wrote:
> 
> ...
> >
> > Simply because chrony is good. And this small activity is still very
> > important. In my case we run distributed simulation and we need
> > clocks of all involved computers to be synchronised very tightly.
> > All the time. In this scheme there is a master which generates PPS
> > signals and also runs ntpd to provide its local time to other
> > computers. But master's local clock slowly diverges from the
> > astronomical time. We can only sync it between experiments. And we
> > want it to happen fast. ntpd can handle this, but far too slowly. I'm
> > sure, chronyd will do much better.
> >
> > Also ntpd is completely unusable without the help of kernel
> > consumer in our setup. But chrony can be used and it doesn't need
> > any special support from the kernel. This is a huge advantage. And
> > it would be great if we don't have to configure two different time
> > daemons (BTW their packages are conflicting in Debian).
> >
> > If you consider adding this feature I can assure you we'll test it
> > thoroughly.
> 
> I think I would strongly advise against adding this feature. It is
> complex, it does not, if I understand things, fit with the chrony
> philosophy of operation at all, and it would make the code far more
> fragile.

I can't imagine how it could possibly break things, at least not until
I dive into chrony's code, and I'm not sure that will happen soon.

> >>> I like chrony very much because it does its job really
> >>> well. But the kernel implementation is so very simple and
> >>> straightforward and works well too. And this is a part of
> >>> PPSAPI. I mean that it would be great to have a choice.
> 
> I do not understand why you want "simple and straightforward" since
> others have already written chrony for you. It is simple and
> straightforward. Just run chronyd.

I actually wrote and even published this code about a month before
the PPS refclock driver was added to chrony:
http://ml.enneenne.com/pipermail/linuxpps/2009-October/003324.html
I refactored and tested it a lot before I decided that it was ready for
mainline inclusion.

Now about "simple and straightforward": you can see from the patch
http://marc.info/?l=linux-kernel&m=126523128109690&w=2
that the code that handles synchronization takes about 200 lines
(including comments). It is really very simple. hardpps() takes two
parameters: the first is converted directly to phase compensation and
the second directly to frequency compensation (the frequency
calibration interval can vary, but that is a detail). That is all! It
is very simple and thus predictable. We need code to be predictable
because we run a hard real-time environment for distributed simulation.
chrony is surely more complex. Also, my code outperforms chronyd, at
least in convergence speed. But please do not think that I want to
belittle chrony.
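
To make the two-parameter idea concrete, here is a minimal Python
sketch. It is not the code from the patch: the names PpsState and
hardpps_sketch, the nanosecond units and the sign conventions are all
assumptions made for illustration. It only shows the shape of the idea:
one timestamp yields a phase correction, the other a frequency
correction.

```python
from dataclasses import dataclass
from typing import Optional

NSEC_PER_SEC = 1_000_000_000


@dataclass
class PpsState:
    """Hypothetical discipline state; not the kernel's data structure."""
    phase_correction_ns: int = 0
    freq_correction_ppm: float = 0.0
    last_raw_ns: Optional[int] = None


def hardpps_sketch(state: PpsState, realtime_ns: int, raw_ns: int) -> PpsState:
    """Update phase and frequency corrections from one PPS pulse.

    realtime_ns: adjusted (REALTIME-like) timestamp of the pulse.
    raw_ns: unadjusted (MONOTONIC_RAW-like) timestamp of the same pulse.
    """
    # Phase: signed distance of the pulse from the nearest second boundary.
    frac = realtime_ns % NSEC_PER_SEC
    if frac >= NSEC_PER_SEC // 2:
        frac -= NSEC_PER_SEC
    state.phase_correction_ns = -frac

    # Frequency: how far the raw interval between two consecutive pulses
    # deviates from one second (assumes no pulse was missed in between).
    if state.last_raw_ns is not None:
        interval_ns = raw_ns - state.last_raw_ns
        error_ns = NSEC_PER_SEC - interval_ns
        state.freq_correction_ppm = error_ns * 1e6 / NSEC_PER_SEC
    state.last_raw_ns = raw_ns
    return state


state = PpsState()
hardpps_sketch(state, 5 * NSEC_PER_SEC + 250, 1_000)
hardpps_sketch(state, 6 * NSEC_PER_SEC - 100, 1_000 + NSEC_PER_SEC + 10_000)
print(state.phase_correction_ns, state.freq_correction_ppm)
```

In this sketch a raw clock that counts 10 microseconds too many per
second produces a negative frequency correction, i.e. the clock is
asked to slow down.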

> >>
> >> I haven't done any testing, but I suspect it will be very
> >> sensitive to noise, especially if you have removed the median
> >> filtering.
> >
> > I don't know what kind of noise it should prevent. But in the case
> > when there is little or no noise (in our case) median filtering
> > from the original implementation caused big trouble. It can be ok
> > only if used together with exponential filtering which I removed
> > for the sake of convergence speed. :)
> 
> All measurements have noise. All. And noise from the timekeeper,
> whether GPS or on board can often occur. Watching my GPS, every once
> in a while it will send out 10 pulses in a second on the PPS line--
> probably interference. Stuff happens. The timekeeping system should
> be robust enough to handle stuff.

OK. My code can cope with 10 pulses in a second. Next target?
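
How the patch guards against extra pulses is not described in this
message. Purely as an illustration, one hypothetical approach is a
minimum-interval gate that ignores any pulse arriving too soon after
the last accepted one. The function name accept_pulses and the 950 ms
threshold are assumptions, not taken from the kernel code.

```python
def accept_pulses(timestamps_ms, min_interval_ms=950):
    """Return the pulses that survive a simple minimum-interval gate.

    A pulse is dropped when it arrives less than min_interval_ms after
    the previously accepted pulse. Timestamps are integer milliseconds
    in increasing order.
    """
    accepted = []
    for t in timestamps_ms:
        if accepted and t - accepted[-1] < min_interval_ms:
            continue  # too close to the last good pulse: treat as noise
        accepted.append(t)
    return accepted


# Ten pulses inside the first second, then normal pulses at 1 s and 2 s.
burst = list(range(0, 1000, 100)) + [1000, 2000]
print(accept_pulses(burst))
```

A gate like this trusts the first pulse it sees, so a real
implementation would likely also need a way to recover if that first
pulse was itself spurious.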

> > Anyway, I'm going to look into chrony source to know how it can
> > achieve nearly the same results. And also I'm open to suggestions
> > on how to improve the code.
> >
> >> There seems to be a reason why ntpd doesn't use PPS discipline
> >> by default, see
> >> http://www.eecis.udel.edu/~mills/ntp/html/drivers/driver22.html
> >>
> >> "As the result, performance with minpoll configured at 4 (16s) is
> >> generally better than the kernel PPS discipline. However, fudge
> >> flag 3 can be used to enable the kernel PPS discipline if
> >> necessary."
> >
> > Well, maybe this can be true in the worst case but my tests show
> > that ntpd can never sync as tight as my kernel discipline.
> 
> Timekeeping code should be robust enough to handle "worst case". You
> never design a system to only handle best cases well.

This is just handwaving. What, then, is the worst possible PPS signal
that it should handle correctly?
Please be constructive, or it looks like trolling.

> >>> I doubt that adjtimex will ever be extended. In the kernel I use
> >>> MONOTONIC_RAW clock to calculate frequency adjustments and
> >>> REALTIME clock for the phase. Seems to me this is the most
> >>> straightforward way. But AFAIK MONOTONIC_RAW is Linux-specific.
> >>
> >> It was extended not so long ago with the ADJ_OFFSET_SS_READ mode.
> >> The extensions I'm proposing don't require modification of the
> >> struct, just one or two more modes.
> >
> > Well, good luck!
> >
> >> Can the MONOTONIC_RAW time be used to determine when a
> >> frequency change was applied?
> >
> > I don't think so. But it could be used to determine frequency
> > without any previous calculation. If you have two MONOTONIC_RAW
> > timestamps for any two PPS signals you can just subtract them and
> > divide by the number of seconds between them and you've got the new
> > frequency value.
> 
> And that is very noisy. There are millions of things which could
> introduce noise into that determination. It is precisely to beat down
> that random noise that chrony uses up to 64 past measurements to
> determine the frequency, and never less than 3.

My kernel discipline measures frequency using calibration intervals
from 4 to 256 seconds. The precision is quite good so far.
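
The arithmetic behind measuring frequency over a calibration interval
can be sketched as follows. This is an illustration under assumptions,
not the kernel code: the function name frequency_offset_ppm and the
nanosecond units are hypothetical, and the sketch assumes the number of
PPS pulses between the two raw timestamps is known exactly.

```python
NSEC_PER_SEC = 1_000_000_000


def frequency_offset_ppm(raw_start_ns, raw_end_ns, pulses_elapsed):
    """Frequency error of the raw clock over a calibration interval.

    raw_start_ns, raw_end_ns: MONOTONIC_RAW-style timestamps of two
    PPS pulses. pulses_elapsed: true seconds between them. A positive
    result means the raw clock runs fast.
    """
    if pulses_elapsed <= 0:
        raise ValueError("pulses_elapsed must be positive")
    expected_ns = pulses_elapsed * NSEC_PER_SEC
    measured_ns = raw_end_ns - raw_start_ns
    return (measured_ns - expected_ns) * 1e6 / expected_ns


# The same 1 microsecond of timestamp error matters far less over a
# long calibration interval than over a short one.
short = frequency_offset_ppm(0, 4 * NSEC_PER_SEC + 1_000, 4)
long_ = frequency_offset_ppm(0, 256 * NSEC_PER_SEC + 1_000, 256)
print(short, long_)
```

This is why a longer calibration interval helps against timestamp
noise: a fixed error in one timestamp is divided by the interval
length, so it contributes 64 times less at 256 seconds than at 4.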

> > MONOTONIC_RAW clock is completely untouched by the ntp subsystem in
> > the kernel. It's like rdtsc that was expected in the original
> > implementation but it doesn't suffer from CPU frequency changes and
> > other stuff.
> 
> How does it not suffer from CPU frequency changes?

Because it doesn't use tsc. It uses the default clocksource, lapic in
my case.
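
For readers who want to check this on their own machine, the sketch
below reads the active clocksource from sysfs and takes a
CLOCK_MONOTONIC_RAW reading from user space. The helper names
current_clocksource and raw_monotonic_ns are hypothetical. The sysfs
path is the usual Linux location, and both helpers return None where
the facility is not available.

```python
import time
from pathlib import Path

CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def current_clocksource(path=CLOCKSOURCE_PATH):
    """Return the name of the active clocksource, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def raw_monotonic_ns():
    """Return a CLOCK_MONOTONIC_RAW reading in ns, or None if unsupported."""
    clock_id = getattr(time, "CLOCK_MONOTONIC_RAW", None)
    if clock_id is None:
        return None
    return time.clock_gettime_ns(clock_id)


print("clocksource:", current_clocksource())
print("raw monotonic ns:", raw_monotonic_ns())
```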

-- 
  Alexander



