RE: [chrony-users] initstepslew seems to break chronyd



Hello Miroslav,

Okay, I guess I have to explain a little more:
The machine and its peer are test VMs. Something in the VM environment (possibly a backup) causes timing problems almost every night.
As you can see in the measurements log, around 1:19 an offset of almost a second appeared out of the blue. After that, the two peers are free-running.
I will check whether adding orphan to the local line allows the machines to recover from the situation automatically.
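A minimal sketch of that change, assuming the rest of the configuration quoted below stays as it is and that the peer gets the same local line (orphan mode is meant to be configured identically on all hosts that poll each other):

```
# chrony.conf (sketch): only the local line differs from the quoted configuration
local stratum 11 orphan
server 10.4.142.10 minpoll 5 maxpoll 8 iburst prefer
peer   10.5.141.208 minpoll 5 maxpoll 8 iburst
```

The initstepslew line is left out here on purpose, so that the effect of orphan can be judged separately from it.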

Around 15:00 I started examining the problem and restarted chronyd several times. The restart at 15:18 was without initstepslew. One of the attempts before was with orphan added to local.

The part that really surprised me was that initstepslew not only failed to select the server, which is the better source, but also affected the daemon's long-term behaviour.

Regards,
Frank

-----Original Message-----
From: Miroslav Lichvar [mailto:mlichvar@xxxxxxxxxx] 
Sent: Friday, 1 February 2019 10:50
To: chrony-users@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [chrony-users] initstepslew seems to break chronyd

On Thu, Jan 31, 2019 at 04:48:00PM +0000, MUZZULINI Frank wrote:
> local stratum 11
> server 10.4.142.10 minpoll 5 maxpoll 8 iburst prefer
> peer   10.5.141.208 minpoll 5 maxpoll 8 iburst
> initstepslew 0 10.4.142.10 10.5.141.208
> 
> Both local time and the peer are about two seconds off.

Does that mean that the peer is synchronized to this host? It doesn't have more sources to detect falsetickers?

Also, please note that iburst doesn't work with peers, so the server would normally be selectable before the peer. In initstepslew the burst works with both sources, so there is a difference.

> When I start chronyd with this configuration, I end up with both sources being considered false tickers:
> # chronyc -n sources -v
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> ^x 10.4.142.10                   2   5   377    15  -2010ms[-2010ms] +/- 5384us
> =x 10.5.141.208                 11   7   377    60    +26ms[  +26ms] +/-  282us
> 
> However, when I comment out the initstepslew line and restart chronyd, it quickly selects the stratum 2 server as the relevant source and everything is fine.
> 
> Is this a bug?
> Any ideas what I could do to avoid this problem?
> I don't want to use makestep, since time jumps at run time could kill our application.

Makestep shouldn't make a difference. I'm not sure what is going on here. It would help to see the measurements and tracking logs, from both peers.

I'd also suggest trying the orphan option and, ideally, using a third source, so that a single falseticker doesn't break everything.
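As a rough sketch of the third-source idea: the address 192.0.2.1 below is a placeholder from the documentation range, not a host from this setup, and the poll settings are simply copied from the existing server line as an assumption.

```
# chrony.conf (sketch): one extra source added to the quoted configuration
# 192.0.2.1 is a hypothetical placeholder; replace it with a real, independent time source
server 192.0.2.1 minpoll 5 maxpoll 8 iburst
```

With three sources, two that agree can outvote one that is off, instead of every source being marked as a falseticker.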

--
Miroslav Lichvar


Attachment: measurements.log.bz2
Description: measurements.log.bz2

Attachment: statistics.log.bz2
Description: statistics.log.bz2

Attachment: tracking.log.bz2
Description: tracking.log.bz2


