Re: [chrony-dev] Chrony freezing



On Thu, 12 Nov 2009, Miroslav Lichvar wrote:

On Wed, Nov 11, 2009 at 12:28:32PM -0800, Bill Unruh wrote:
rtc_linux.c:877 is exactly that second read from the rtc, which I have had
trouble with hanging in the past.

It looks like the rtc device stopped providing data, possibly a kernel
bug.

The problem is the routing of the interrupt that the rtc is supposed to
deliver on the rollover of the second. The routing of that interrupt has
become very confused with the new hpet. I have opened bugs with the kernel
people. In the meantime, chrony needs to be set up so that this bug does not
freeze it. I guess a select() before that second read() is needed to make
sure that the read does not hang chrony.


Thus, this section seems to be caught in a cleft stick. If one does only a
single read from the rtc, it returns immediately instead of on the second, as
it should, on most systems. If I do the double read, it sometimes hangs on
some systems. This particular hang seems to occur after I do a chronyc
cyclelogs, but only sometimes, not always; i.e., this problem with the rtc
seems to be sporadic and flakey.

Hm, I don't think there should be any double reading. Here, the second
read makes chrony unresponsive until the next second as the read
blocks. That's not good.

The problem is that when the UIE flag goes on in the driver for rtc, it
immediately unblocks, not waiting for the next second's rollover, and opens
the file for reading. Chrony assumes that at this unblocking the time is
at the rollover. In order to get around this bug, I put in the second read.
However, it seems that sometimes that second read blocks forever due to some
other kernel bug. Putting in a select() with a 1 second timeout before the
second read might be a way of getting around both kernel bugs.
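
To make the idea concrete, here is a minimal sketch of a read guarded by
select() with a timeout. It is not code from chrony: the helper name
read_with_timeout and its return convention are made up for illustration, and
it works on any file descriptor, not only the rtc device. It does not retry if
select() is interrupted by a signal; that case is reported as an error.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of chrony): wait up to timeout_sec seconds
 * for fd to become readable, then do a single read().
 * Returns the number of bytes read (> 0), 0 on end of file,
 * -1 on a select() or read() error, and -2 if the timeout expired. */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int ready;

    assert(buf != NULL && len > 0);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* select() returns 0 when nothing became readable within the timeout. */
    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;      /* select() failed (including EINTR) */
    if (ready == 0)
        return -2;      /* timed out: abandon the read instead of blocking */

    /* The descriptor is readable, so this read() should not block. */
    return read(fd, buf, len);
}
```

With a 1 second timeout, a return value of -2 would mean the rtc did not
deliver its interrupt in time, and the caller could drop that sample rather
than hang.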



Is it possible to do a timeout on a read so that if it has not returned in a
second say, that read is abandoned?

It's better to not use any blocking reads, read only when select()
says it's ready.

Yup.


Also, it would probably be a good idea to put a nortc keyword into
/etc/chrony.conf, so that if one has one of these flakey systems, one can
switch off all rtc use for that system.

Removing rtcfile command will disable using rtc.

Ideally, the kernel people would fix rtc so that it works even with the hpet
system. (The problem is that under hpet, the rtc interrupt is routed to the
non-maskable interrupt, I believe, and it seems that is difficult to use for
the rtc.)
However, that may take a while, and in the meantime chrony is left flakey on
the older kernels.

Is it better on recent kernels?

They have been working on the whole rtc thing but I am not sure what the
situation is.




--
William G. Unruh   |  Canadian Institute for|     Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research  |     Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology |     unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1     |      and Gravity       |  www.theory.physics.ubc.ca/
