Re: [chrony-dev] chronyd not recovering after time stepped.
- To: chrony-dev@xxxxxxxxxxxxxxxxxxxx
- Subject: Re: [chrony-dev] chronyd not recovering after time stepped.
- From: Bryan Christianson <bryan@xxxxxxxxxxxxx>
- Date: Fri, 28 Aug 2015 20:32:13 +1200
> On 28/08/2015, at 7:57 pm, Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
>
> On Fri, Aug 28, 2015 at 04:49:07PM +1200, Bryan Christianson wrote:
>>> I think I figured it out. When the time is stepped there is a big spike in offset_sd. This causes the drift removal interval to be increased to a very large value (approx. 2 hours in the trace I just looked at) and then applied when the current drift cycle completes.
>
> Ok. That explains it.
>
>>> So for the next 2 hours, clock adjustments in start_adjust() are having a predicted offset of 30 ms or so. When the very long drift removal interval expires, then things go back to normal - offset_sd is now small and the system rapidly recovers.
>>>
>>> I think the drift removal cycle should be restarted if the newly calculated interval is significantly different from the value of current_drift_removal_interval
>>>
>>
>> Alternatively, and much simpler, we set an upper bound on drift_removal_interval of say 200sec
>
> I like the former better. Would it make sense to always restart the
> timer in set_sync_status() instead of waiting for the currently
> running one to expire? I guess it would simplify the code a bit, but
> there would be an extra adjtime() restart on every clock update. I'm
> not sure if it's worth it.
>
Yes, I also think restarting on every set_sync_status() call could add overhead, especially when the interval is small.
So what is "significantly different"? In the case I saw, the interval went up to 2 hours, which is clearly far too long to wait when the new value is probably less than 10 seconds.
Maybe we should only interrupt the current cycle if current / new > (default / minimum) && current > 60 secs.
Hopefully spikes do not occur frequently (there would be more fundamental problems if they did), and I think 60 secs is not a long time to wait for the current cycle to complete, at which point the new drift would be applied.
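As a rough sketch of that test (not a patch): the function name should_restart_cycle() and the EXAMPLE_* constants below are hypothetical, and the default and minimum interval values are placeholders rather than values taken from the chrony source. Only the 60 second floor comes from the condition above.

```c
#include <stdbool.h>

/* Placeholder values for illustration only; assumptions, not chrony's. */
#define EXAMPLE_DEFAULT_INTERVAL 4.0   /* seconds, assumed default */
#define EXAMPLE_MIN_INTERVAL     0.5   /* seconds, assumed minimum */
#define EXAMPLE_MIN_RUNNING      60.0  /* seconds, the "60 secs" floor */

/*
 * Returns true when the currently running drift removal cycle should be
 * interrupted in favour of the newly calculated interval:
 *   current / new > (default / minimum) && current > 60 secs
 */
bool
should_restart_cycle(double current_interval, double new_interval)
{
  double ratio_limit = EXAMPLE_DEFAULT_INTERVAL / EXAMPLE_MIN_INTERVAL;

  /* Guard against division by zero or a nonsensical new interval. */
  if (new_interval <= 0.0)
    return false;

  return current_interval / new_interval > ratio_limit &&
         current_interval > EXAMPLE_MIN_RUNNING;
}
```

With these placeholder values, the case I saw (current about 7200 secs, new about 10 secs) would trigger a restart, while a 30 sec cycle would be left to run out regardless of the ratio.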
Also - what caveats are there in stopping a scheduled event? Is it safe to call SCH_RemoveTimeout() at any time?
Regards
Bryan