Re: [chrony-dev] chronyd not recovering after time stepped.



> On 28/08/2015, at 7:57 pm, Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
> 
> On Fri, Aug 28, 2015 at 04:49:07PM +1200, Bryan Christianson wrote:
>>> I think I figured it out. When the time is stepped there is a big spike in offset_sd. This causes the drift removal interval to be increased to a very large value (approx. 2 hours in the trace I just looked at), which is then applied when the current drift cycle completes.
> 
> Ok. That explains it.
> 
>>> So for the next 2 hours, clock adjustments in start_adjust() are having a predicted offset of 30 ms or so. When the very long drift removal interval expires, then things go back to normal - offset_sd is now small and the system rapidly recovers.
>>> 
>>> I think the drift removal cycle should be restarted if the newly calculated interval is significantly different from the value of current_drift_removal_interval
>>> 
>> 
>> Alternatively, and much simpler, we set an upper bound on drift_removal_interval of, say, 200 sec
> 
> I like the former better. Would it make sense to always restart the
> timer in set_sync_status() instead of waiting for the currently
> running one to expire? I guess it would simplify the code a bit, but
> there would be an extra adjtime() restart on every clock update. I'm
> not sure if it's worth it.
> 

Yes - I also think restarting the timer on every set_sync_status() call could add overhead, especially when the interval is small.

So what is "significantly different"? In the case I saw, the interval went up to 2 hours, and that is clearly far too long to wait when the new value is probably less than 10 seconds.

Maybe we should only interrupt the current cycle if current / new > (default / minimum) && current > 60 secs, where current is the running drift removal interval and new is the newly calculated one.

Hopefully spikes do not occur frequently (there would be more fundamental problems if they did), and I think 60 secs is not a long time to wait for the current cycle to complete, at which point the new drift would be applied.
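
To make the proposed test concrete, here is a rough sketch of the check in C. It is only an illustration of the condition above, not a patch against the tree. The function name and the three constants are hypothetical, and the values 4.0 and 0.5 for the default and minimum intervals are assumptions that should be checked against the real definitions.

```c
#include <stdbool.h>

/* Hypothetical constants: names and values are assumptions for illustration */
#define DEFAULT_INTERVAL 4.0     /* assumed default drift removal interval (s) */
#define MIN_INTERVAL 0.5         /* assumed minimum drift removal interval (s) */
#define MAX_CYCLE_WAIT 60.0      /* longest cycle we are willing to wait out (s) */

/* Return true if the running drift removal cycle should be interrupted
   and restarted with the newly calculated interval. */
static bool
should_restart_cycle(double current_interval, double new_interval)
{
  /* Guard against division by zero or a nonsensical new interval */
  if (new_interval <= 0.0)
    return false;

  /* current / new > (default / minimum) && current > 60 secs */
  return current_interval / new_interval > DEFAULT_INTERVAL / MIN_INTERVAL &&
         current_interval > MAX_CYCLE_WAIT;
}
```

With these assumed values the ratio threshold is 8, so a running 7200 sec cycle with a new interval of 10 sec would be interrupted, while a running 50 sec cycle would be left to finish whatever the new value is.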

Also - what caveats are there in stopping a scheduled event? Is it safe to call SCH_RemoveTimeout() at any time, including on a timeout that has already fired?

Regards
Bryan


