Re: [chrony-dev] Patch: avoid infinite select() loop on chronyc
On Thu, Nov 30, 2017 at 09:48:29PM +0000, Gafton, Cristian wrote:
> When chronyc attempts to chat over the control socket with chronyd, if for whatever reason the first communication attempt fails, chronyc will end up in a permanent loop trying and failing to communicate. This is because in the submit_request() function the select() loop break-exit failsafes are protected by a new_attempt flag which is cleared on the first try and never updated again. The attached patch attempts one way of fixing this.
Interesting. I assume this is not easily reproducible and happens only
rarely. When the server doesn't respond, select() should time out,
which should increase the number of attempts and eventually exit the
loop.
The n_attempts variable is supposed to count the number of requests
that were sent to the server, not the number of select() calls as the
patch proposes. If a spurious packet is received, it should not
be counted as an attempt.
I suspect the real problem is with the timeout. If it were negative,
select() would fail with EINVAL and the loop would not terminate. I guess
this could happen if chronyd (or something else) stepped the system
clock forward right between the first gettimeofday() call before the
request is sent and the second gettimeofday() call before select().
Could that be what happened in your case?
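
To illustrate the suspected failure mode, here is a minimal sketch that passes a negative timeout to select(). The helper select_errno_for_timeout() is hypothetical and is not part of chronyc. It assumes a platform such as Linux, where select() rejects a negative struct timeval by returning -1 with errno set to EINVAL.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper, not taken from chrony: call select() with no
   descriptors and the given timeout.  Returns 0 if select() succeeded
   (here that means it timed out), or the errno value if it failed. */
int select_errno_for_timeout(long sec, long usec)
{
    struct timeval tv;

    tv.tv_sec = sec;
    tv.tv_usec = usec;

    if (select(0, NULL, NULL, NULL, &tv) < 0)
        return errno;

    return 0;
}
```

On such a platform, select_errno_for_timeout(-1, 0) is expected to return EINVAL immediately instead of waiting. A loop that only counts a new attempt when select() times out would then spin without making progress.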
I think the fix should be to check the timeout after its calculation.
If it is negative, a new attempt should be made.
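
A rough sketch of such a check, not the actual chronyc code: the helper remaining_timeout() and its parameters are assumptions made for illustration. It computes the time left until the request times out, using integer microseconds, and reports whether the result is usable. A caller would start a new attempt when it returns false instead of passing the value to select().

```c
#include <stdbool.h>
#include <sys/time.h>

/* Hypothetical helper, not taken from chrony: compute how much of
   timeout_us (microseconds) is left, given the time the request was
   sent and the current time.  Returns true and fills *tv if the
   remaining time is non-negative.  Returns false if it is negative,
   e.g. because the clock was stepped forward between the two
   gettimeofday() calls. */
bool remaining_timeout(const struct timeval *sent,
                       const struct timeval *now,
                       long long timeout_us, struct timeval *tv)
{
    long long elapsed_us, remaining_us;

    elapsed_us = (long long)(now->tv_sec - sent->tv_sec) * 1000000LL +
                 (long long)(now->tv_usec - sent->tv_usec);
    remaining_us = timeout_us - elapsed_us;

    if (remaining_us < 0)
        return false;   /* do not call select(); make a new attempt */

    tv->tv_sec = remaining_us / 1000000LL;
    tv->tv_usec = remaining_us % 1000000LL;
    return true;
}
```

For example, with a 1 second timeout and 0.5 seconds elapsed, the helper yields a 0.5 second timeout for select(). If the clock jumps 100 seconds forward in between, it returns false and the caller can count a new attempt.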