Re: [chrony-dev] Chrony freezing



On Fri, 13 Nov 2009, Miroslav Lichvar wrote:

On Thu, Nov 12, 2009 at 01:15:13PM -0600, John Hasler wrote:
Bill writes:
Putting in a select() for the second read with a 1 second timeout
might be a way of getting around both kernel bugs.

I don't see any downside to adding a select().

Where do you propose we should add it? If it is right before the
second read, it will solve the problem with blocking forever, but
chronyd will still be unresponsive for the second which I'm worried
about more as it degrades NTP performance.

This is only used for reading the rtc and for determining the rate and offset
of the rtc. I.e., it does not impact ntp performance, except insofar as it makes
characterising the rtc more difficult. But most systems ignore chrony's rtc
support anyway, and start the system using hwclock to set the system time from
the rtc. And newer versions of hwclock can also roughly determine the rate and
offset of the rtc and correct for them. I.e., the rtc support in chrony is to
some extent becoming superfluous.


Note that read_from_device() is called only when select() says the
descriptor is ready, so adding a second select might not make the
results better.

The way the rtc driver works (supposedly) is that when a UIE interrupt occurs,
it signals the driver to make the rtc available for reading once. The second
read blocks until the next interrupt occurs. The problem is that when the UIE
is turned on, it immediately makes the device available for reading; it does
not wait for the second boundary. This leads to reads at random times. I put in
the second read, which was supposed to block until the next interrupt and then
unblock and read the device. Unfortunately sometimes it does not do that, and I
have no idea why. It blocks forever. This all seems to be tied in to the
alterations that were made in order to support the HPET, both in the BIOS
(which on HPET machines reroutes the rtc interrupt to the NMI) and in the
kernel rtc driver. It is a bit of a mess.
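
To make the two-read sequence concrete, here is a rough sketch using the
Linux rtc ioctls. It is not chrony's actual code; the function name
wait_for_second_boundary() is made up for illustration, and fd is assumed to
be an already opened rtc device such as /dev/rtc.

```c
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from chrony.
 * Turns on update interrupts, discards the first read (which may return
 * immediately rather than on a second boundary), then does a second read
 * that is supposed to block until the next update interrupt.
 * Returns 0 on success, -1 on failure. */
int wait_for_second_boundary(int fd)
{
    unsigned long data;
    int ok = 0;

    if (ioctl(fd, RTC_UIE_ON, 0) < 0)
        return -1;

    /* First read: may be satisfied at once, so its timing is useless. */
    if (read(fd, &data, sizeof data) == (ssize_t)sizeof data) {
        /* Second read: expected to return at the second boundary.
         * This is the read that can block forever on affected systems. */
        if (read(fd, &data, sizeof data) == (ssize_t)sizeof data)
            ok = 1;
    }

    ioctl(fd, RTC_UIE_OFF, 0);
    return ok ? 0 : -1;
}
```

Nothing in this sketch bounds how long the second read() may take, which is
exactly the failure mode described above.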

This eternal reblocking seems to occur only sometimes. (Actually it could be
that it is blocking more often, only that usually it unblocks eventually.)
I.e., the fact that the select has returned is only valid for the first read.
The second read again blocks until the second boundary, except sometimes it
does not. Putting in that second select would at least mean that chrony does
not block forever (which does terrible things for its ntp performance :-)
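
A minimal sketch of what guarding the second read with select() and a 1 second
timeout could look like. This is an illustration only, not a patch against
chrony; the name read_with_timeout() and the choice of reporting a timeout as
-1 with errno set to ETIMEDOUT are my assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from chrony.
 * Waits up to timeout_sec seconds for fd to become readable, then reads.
 * Returns the number of bytes read, or -1 on error. A timeout is reported
 * as -1 with errno set to ETIMEDOUT. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int ready;

    assert(buf != NULL);

    do {
        /* select() may modify both arguments, so reset them each time.
         * Note that a signal restarts the full timeout in this sketch. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;              /* select() itself failed */
    if (ready == 0) {
        errno = ETIMEDOUT;      /* no interrupt arrived in time */
        return -1;
    }
    return read(fd, buf, len);  /* descriptor is ready, read should not block */
}
```

The second read would then become something like
read_with_timeout(fd, &data, sizeof data, 1), and on ETIMEDOUT the sample is
simply dropped. As Miroslav points out, the caller is still stuck for up to the
timeout, so this only bounds the damage rather than removing it.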





--
William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |  www.theory.physics.ubc.ca/
