[chrony-users] bug report: duplicate server addresses cause chronyd to core

[ Thread Index | Date Index | More chrony.tuxfamily.org/chrony-users Archives ]


Hello, I don't know if this bug has already been filed, or if this is
improper place to report it.  If this is improper I apologize for the
extra spam.

I believe I have found a bug in chrony 1.27 which causes chronyd to
crash.  Using the attached config file, chrony.conf, and the command
line "chronyd -n -f chrony.conf", I get the output in syslog.txt
(attached) and then a segfault.

After sorting through the code, I was able to identify what I feel to be
the issue - duplicate IP addresses in the list of ntp servers.  The
backtrace in gdb.txt shows that the fault occurs in acquire.c.  It shows
that a timer from timer_queue executed transmit_timeout with a pointer
to a SourceRecord which has been deleted.

I believe it became invalid by this mechanism in acquire.c:

When there are duplicate ntp servers listed on the initstepslew line, 2
SourceRecords are created (sourceA and sourceB), and two timers are
created (timerA and timerB).  When ntp responses are received, only
sourceA is updated because of the way read_from_socket searches for a
matching record.  Eventually, the criteria for sourceA are met, causing
timerA to stop and n_completed_sources to increment.  timerB continues
to trigger, sending ntp poll messages to the ntp server.  Responses from
that server are assigned to sourceA, triggering the criteria for sourceA
and causing n_completed_sources to increment improperly.  Once this
happens enough times, n_complete_sources == number of servers and all
SourceRecords are deleted.  The next time timerB triggers, it attempts
to access sourceB, which was already been deleted, causing the core.

Attached is a patch I used to prevent duplicate IP addresses in the
acquisition list.  Using this patch, I no longer see the segfault.

thanks for your time
-victor

rtcsync
allow 0/0
makestep 0.1 30
cmdallow 127.0.0.1
# initstepslew 0 0.pool.ntp.org 1.pool.ntp.org 2.pool.ntp.org 3.pool.ntp.org 
# server 0.pool.ntp.org iburst minpoll 4 maxpoll 4
# server 1.pool.ntp.org iburst minpoll 4 maxpoll 4
# server 2.pool.ntp.org iburst minpoll 4 maxpoll 4
# server 3.pool.ntp.org iburst minpoll 4 maxpoll 4
# initstepslew 0 174.36.71.205

initstepslew 0 174.36.71.205 50.18.44.198 208.53.158.34 174.36.71.205
server 174.36.71.205 iburst minpoll 4 maxpoll 4
server 50.18.44.198 iburst minpoll 4 maxpoll 4
server 208.53.158.34 iburst minpoll 4 maxpoll 4
Apr 24 12:28:17 chronyd[20696]: chronyd version 1.27-pre1 starting
Apr 24 12:28:17 chronyd[20696]: Linux kernel major=3 minor=2 patch=0
Apr 24 12:28:17 chronyd[20696]: hz=100 shift_hz=7 freq_scale=1.00000000 nominal_tick=10000 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
Apr 24 12:28:17 chronyd[20696]: Initial frequency 13.225 ppm
Apr 24 12:28:17 chronyd[20696]: System's initial offset : 0.000009 seconds fast of true (step)
(gdb) bt
#0  0x4018dc90 in *__GI_raise (sig=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
#1  0x40193884 in *__GI_abort () at abort.c:92
#2  0x4018646c in *__GI___assert_fail (assertion=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>, 
    function=<value optimized out>) at assert.c:81
#3  0x00020c64 in probe_source (src=0x1) at acquire.c:303
#4  0x0000b014 in dispatch_timeouts () at sched.c:463
#5  SCH_MainLoop () at sched.c:582
#6  0x0000fa24 in main (argc=<value optimized out>, argv=0xbe9d6e80)
    at main.c:429
(gdb) p init_srcs_ip
$1 = {{
    addr = {
      in4 = 2921613261, 
      in6 = "\315G$\256\001\000\000\000\000\000\000\000\204d\235\276"
    }, 
    family = 1
  }, {
    addr = {
      in4 = 840051910, 
      in6 = "\306,\022\062\001\000\000\000\000\000\000\000\204d\235\276"
    }, 
    family = 1
  }, {
    addr = {
      in4 = 3493174818, 
      in6 = "\"\236\065\320\001\000\000\000\000\000\000\000\204d\235\276"
    }, 
    family = 1
  }, {
    addr = {
      in4 = 2921613261, 
      in6 = "\315G$\256\001\000\000\000\000\000\000\000\204d\235\276"
    }, 
    family = 1
  }, {
    addr = {
      in4 = 0, 
      in6 = '\000' <repeats 15 times>
    }, 
    family = 0
  }, {
    addr = {
      in4 = 0, 
      in6 = '\000' <repeats 15 times>
    }, 
    family = 0
  }, {
    addr = {
      in4 = 0, 
      in6 = '\000' <repeats 15 times>
    }, 
    family = 0
  }, {
    addr = {
      in4 = 0, 
      in6 = '\000' <repeats 15 times>
    }, 
    family = 0
  }}

diff -up chrony-1.27/acquire.c.orig chrony-1.27/acquire.c
--- src/acquire.c.orig	2013-04-24 13:51:38.000000000 -0400
+++ src/acquire.c	2013-04-24 13:52:37.000000000 -0400
@@ -757,6 +757,7 @@ ACQ_StartAcquisition(int n, IPAddr *ip_a
 {
 
   int i, ip4, ip6;
+  int k, duplicate_ip;
 
   saved_after_hook = after_hook;
   saved_after_hook_anything = anything;
@@ -765,18 +766,32 @@ ACQ_StartAcquisition(int n, IPAddr *ip_a
 
   n_started_sources = 0;
   n_completed_sources = 0;
-  n_sources = n;
+  n_sources = 0;
   sources = MallocArray(SourceRecord, n);
 
   for (i = ip4 = ip6 = 0; i < n; i++) {
-    sources[i].ip_addr = ip_addrs[i];
-    sources[i].n_samples = 0;
-    sources[i].n_total_samples = 0;
-    sources[i].n_dead_probes = 0;
-    if (ip_addrs[i].family == IPADDR_INET4)
-      ip4++;
-    else if (ip_addrs[i].family == IPADDR_INET6)
-      ip6++;
+    /* check for duplicate IP addresses and ignore them */
+    duplicate_ip = 0;
+    for (k = 0; k < i; k++)
+    {
+      duplicate_ip |= UTI_CompareIPs(&(sources[k].ip_addr),
+                                     &ip_addrs[i],
+                                     NULL) == 0;
+    }
+    if (!duplicate_ip) {
+      sources[n_sources].ip_addr = ip_addrs[i];
+      sources[n_sources].n_samples = 0;
+      sources[n_sources].n_total_samples = 0;
+      sources[n_sources].n_dead_probes = 0;
+      if (ip_addrs[i].family == IPADDR_INET4)
+        ip4++;
+      else if (ip_addrs[i].family == IPADDR_INET6)
+        ip6++;
+      n_sources++;
+    } else {
+      LOG(LOGS_WARN, LOGF_Acquire, "Ignoring duplicate source: %s",
+          UTI_IPToString(&ip_addrs[i]));
+    }
   }
 
   initialise_io((ip4 && ip6) ? IPADDR_UNSPEC : (ip4 ? IPADDR_INET4 : IPADDR_INET6));


Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/