Re: [chrony-users] Request: Add an optional timeout option for 'chronyd -q ...'



The problem with stress testing is it can become like throwing a beer bottle
at a concrete wall. It was never intended to survive.
But I guess you could argue that chrony should not freeze if it gets a KoD.
However, you should never be running it in such a way that you get a KoD. It is an extreme
response of a remote system to egregious behavior. One could argue that the
right thing for chrony to do if it gets a KoD is to crash, because the
operator has badly misused it.

William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/

On Wed, 30 Nov 2016, Lonnie Abelbeck wrote:

Hi Bill,

Thanks for your comments.

I was stress testing, basically running this pair of commands in a loop every 10 seconds.
--
chronyd -q -u ntp "server $first iburst"
...
chronyd -u ntp
--

I take it this is an "unusual" situation, but if a hang can happen, it will happen when you don't want it to.

BTW, if I do this with a pre-populated /etc/chrony.conf
--
chronyd -q -u ntp
...
chronyd -u ntp
--
The hangs do not occur, since multiple sources are always involved, but it takes 6-7 seconds to complete instead of 4 with a single server. For reference, sntp does it in about 2 seconds.

Lonnie

BTW, my office NTP server (aggregates 2 external Stratum 1 and a local GPS network server) is an old Soekris net5501 (AMD Geode LX) and chrony results in much less jitter than ntpd did, very nice !




On Nov 30, 2016, at 8:07 PM, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:


On Wed, 30 Nov 2016, Lonnie Abelbeck wrote:

Hi,

We are in the process of moving from 'ntp' to 'chrony' for our open source project.

In a matter of a few hours, I have made the conversion, including testing by booting without a network connection, restarting chrony every 10 seconds 100+ times, etc.

I am very impressed with 'chrony' !

We did uncover one issue during testing, though.

First, a little background, previously we used this sequence at boot time to set a big jump, then maintain the clock:
----
first="$(awk '/^server / { print $2; nextfile; }' /etc/ntpd.conf)"
...
sntp -S -t4 $first
...
ntpd -g -c /etc/ntpd.conf
----

This is how we are doing it now with chrony:
----
first="$(awk '/^(server|pool) / { print $2; nextfile; }' /etc/chrony.conf)"
...
chronyd -q -u ntp "server $first iburst"
...
chronyd -u ntp
----

What we would like is a "-t timeout" option to be used with "chronyd -q ..." just like sntp's -t option.  I have seen cases where "chronyd -q ..." hangs without a timeout.
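
In the meantime, an upper bound like this could be sketched outside chronyd with a small wrapper. This is only an illustration: it assumes the `timeout` utility from GNU coreutils is installed (it exits with status 124 when the limit is hit), and `run_bounded` is a hypothetical helper name, not part of chrony.

```shell
#!/bin/sh
# Hypothetical helper (not part of chrony): run a command, but give up
# after a fixed number of seconds.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# time limit is reached.
run_bounded() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

At boot this might be called as `run_bounded 8 chronyd -q -u ntp "server $first iburst"`, so that a hang in the one-shot step cannot block the later `chronyd -u ntp` start. It bounds the wait from the outside and does not change how chronyd reacts to a KoD.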

The last thing we want is for chronyd to hang at boot time, and it can without such a timeout option. Here are some logs...

2016-11-30T18:50:24Z chronyd version 2.4.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER -SECHASH +ASYNCDNS -IPV6 -DEBUG)
2016-11-30T18:50:24Z Initial frequency -17.442 ppm
2016-11-30T18:50:26Z Received KoD RATE from 138.68.46.177, burst sampling stopped
(Hung for minutes or longer, a ^C was required to continue)

What are you doing to get a KoD? The initial burst should only be very brief,
with something like 3 samples. More is not really needed.



2016-11-30T18:43:30Z chronyd version 2.4.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER -SECHASH +ASYNCDNS -IPV6 -DEBUG)
2016-11-30T18:43:30Z Initial frequency -15.827 ppm
2016-11-30T18:43:37Z Received KoD RATE from 67.4.147.175, burst sampling stopped
2016-11-30T18:47:57Z No suitable source for synchronisation
(Worked)

2016-11-30T19:11:01Z chronyd version 2.4.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER -SECHASH +ASYNCDNS -IPV6 -DEBUG)
2016-11-30T19:11:01Z Initial frequency -17.331 ppm
2016-11-30T19:11:01Z Received KoD RATE from 50.116.52.97, burst sampling stopped
(Hung for minutes or longer, a ^C was required to continue)

All of the above are linked to KoD. The remote system is objecting strenuously
to the rate at which you are hitting them for samples.


Does adding a "-t timeout" option to be used with "chronyd -q ..." sound reasonable ?  We would probably use -t 8 (in seconds) for the upper bound.

Thanks,
Lonnie




--
To unsubscribe email chrony-users-request@xxxxxxxxxxxxxxxxxxxx
with "unsubscribe" in the subject.
For help email chrony-users-request@xxxxxxxxxxxxxxxxxxxx
with "help" in the subject.
Trouble?  Email listmaster@xxxxxxxxxxxxxxxxxxxx.



