We have chrony 3.2 clients on 5 sites, each synchronised with 3 sources:
- Source 1: a high-performance ASIC-based NTP server with server-side HW timestamping
- Source 2: a 'backup' software-based NTP server
- Source 3: a (shitty) Linux ntpd
Sources 1 and 2 are both provided by a stratum 1 time server deployed on each site.
All our clients run a recent kernel and have NICs that support at least TX HW timestamping, so timestamping should be at least hardware/kernel. In this mode, the peer delay with source 1 is normally around 12us (there is only 1 switch between the clients and the time server).
I recently investigated some servers on one site that struggle to use HW timestamping and spend most of their time in daemon/daemon mode instead of hardware/kernel. The weird part is that it only occurs with source 1: the same clients are stable in hardware/kernel with the less accurate sources.
So I have this problem only with my high-performance source, the 'slower' sources being fine.
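For anyone who wants to compare the timestamping mode per source, here is a minimal Python sketch. It assumes the text comes from `chronyc ntpdata`, which prints a `Remote address`, a `TX timestamping` and an `RX timestamping` line for each source. The function name and the sample addresses and values are made up for illustration; they are not taken from my systems.

```python
def timestamping_by_source(ntpdata_text):
    """Map each remote address to its (TX, RX) timestamping mode."""
    result = {}
    address = None
    tx_mode = None
    for line in ntpdata_text.splitlines():
        # Each report line looks like "Key   : value"; split on the first colon.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "Remote address":
            # Value looks like "192.0.2.1 (C0000201)"; keep only the address.
            address = value.split()[0]
            tx_mode = None
        elif key == "TX timestamping":
            tx_mode = value
        elif key == "RX timestamping" and address is not None:
            result[address] = (tx_mode, value)
    return result


# Hypothetical excerpt in the layout printed by `chronyc ntpdata`.
SAMPLE = """\
Remote address  : 192.0.2.1 (C0000201)
Peer delay      : 0.000175634 seconds
TX timestamping : Daemon
RX timestamping : Daemon
Remote address  : 192.0.2.2 (C0000202)
Peer delay      : 0.000253050 seconds
TX timestamping : Hardware
RX timestamping : Kernel
"""

for addr, (tx, rx) in timestamping_by_source(SAMPLE).items():
    print(f"{addr}: TX={tx} RX={rx}")
```

On a real client the text could be captured with something like `subprocess.run(["chronyc", "ntpdata"], capture_output=True, text=True).stdout`, run with enough privileges for chronyc to query the local daemon.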
Running chronyd in debug mode, I get messages such as:
2018-01-26T16:46:26Z sys_linux.c:755:(get_precise_phc_sample) ioctl(PTP_SYS_OFFSET_PRECISE) failed : Operation not supported
2018-01-26T16:46:26Z hwclock.c:171:(HCL_AccumulateSample) HW clock needs more samples
2018-01-26T16:46:26Z ntp_io_linux.c:653:(NIO_Linux_ProcessMessage) Received 90 (48) bytes from error queue for 10.214.16.11:123 fd=13 if=2 tss=0
2018-01-26T16:46:26Z ntp_io_linux.c:665:(NIO_Linux_ProcessMessage) Missing TX timestamp
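To see which source the missing TX timestamps belong to, the debug log can be tallied with a small Python sketch like the one below. It assumes each log message sits on a single line and that a "Missing TX timestamp" message refers to the most recent "error queue" message before it; that pairing is my reading of the log, not something confirmed from the chrony source. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "... bytes from error queue for 10.214.16.11:123 fd=13 ..."
ERRQUEUE_RE = re.compile(r"bytes from error queue for (\S+):\d+ ")


def missing_tx_by_source(log_text):
    """Count 'Missing TX timestamp' messages per remote address."""
    counts = Counter()
    last_addr = None
    for line in log_text.splitlines():
        match = ERRQUEUE_RE.search(line)
        if match:
            # Remember which remote address this error-queue message was for.
            last_addr = match.group(1)
        elif "Missing TX timestamp" in line and last_addr is not None:
            counts[last_addr] += 1
            last_addr = None
    return counts


# The two relevant messages from the log above, each on one line.
SAMPLE_LOG = (
    "2018-01-26T16:46:26Z ntp_io_linux.c:653:(NIO_Linux_ProcessMessage) "
    "Received 90 (48) bytes from error queue for 10.214.16.11:123 fd=13 if=2 tss=0\n"
    "2018-01-26T16:46:26Z ntp_io_linux.c:665:(NIO_Linux_ProcessMessage) "
    "Missing TX timestamp\n"
)

for addr, count in missing_tx_by_source(SAMPLE_LOG).items():
    print(addr, count)
```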
I tried running chronyd in debug mode on another site where I have no problems, to compare the logs. But when I run chronyd as root there (in debug mode or not), with the exact same command line that systemd uses to launch it, I get the exact same problem: timestamping is reported as daemon/daemon with source 1 and is fine with the others.