Couchbase C client library libcouchbase / CCBC-539

libcouchbase client crashes on stats when server stops

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: 2.4.4
    • Affects Version: 2.4.3
    • Component: library
    • Security Level: Public
    • None
    • Environment: CentOS 6.6 x86_64, Couchbase-server-enterprise 3.0.1, libcouchbase2-libevent-2.4.3

    Description

      My application periodically retrieves the server stats via lcb_server_stats(). When one of the Couchbase servers is stopped or restarted ($ /etc/init.d/couchbase-server restart), the application often crashes.

      It seems that an LCB_NETWORK_ERROR is not properly handled while collecting the stats from the servers in the cluster.
      Program terminated with signal 11, Segmentation fault.
      Stack trace:
      #0 0x00007f25c7206e3c in vfprintf () from /lib64/libc.so.6
      #1 0x00007f25c7228619 in vsprintf () from /lib64/libc.so.6
      #2 0x00007f25c720e2c8 in sprintf () from /lib64/libc.so.6
      #3 0x00007f25c8ff4530 in stats_handler (pl=<value optimized out>, req=<value optimized out>, err=LCB_NETWORK_ERROR, arg=0x0)
      at src/operations/stats.c:45
      #4 0x00007f25c90006b4 in H_verbosity (pipeline=0x7f25a8a1c950, req=0x7f254c0c5ae0, res=0x7f25a8a1cba0, immerr=<value optimized out>)
      at src/handler.c:413
      #5 mcreq_dispatch_response (pipeline=0x7f25a8a1c950, req=0x7f254c0c5ae0, res=0x7f25a8a1cba0, immerr=<value optimized out>)
      at src/handler.c:564
      #6 0x00007f25c8ff0933 in bail_op (rq=0x2311100, op=0x7f254c0c9b10, err=LCB_ETIMEDOUT) at src/retryq.c:145
      #7 0x00007f25c8ff0e1d in rq_flush (rq=0x2311100, throttle=1) at src/retryq.c:210
      #8 0x00007f25c8fef104 in timer_callback (sock=<value optimized out>, which=<value optimized out>, arg=0x23d83a0)
      at src/lcbio/timer.c:45
      #9 0x00007f25c9234b44 in event_base_loop () from /usr/lib64/libevent-1.4.so.2
      #10 0x000000000048e42c in con_thr (arg=0x23d4738) at src/store_couchbase.c:2946
      #11 0x00007f25c755a9d1 in start_thread () from /lib64/libpthread.so.0
      #12 0x00007f25c72a786d in clone () from /lib64/libc.so.6

      The SEGV occurs at src/operations/stats.c:45:
      sprintf(epbuf, "%s:%s", mcserver_get_host(server), mcserver_get_port(server));

      Building the client library from source without optimisation provides a bit more insight:
      (gdb) p server
      $3 = (mc_SERVER *) 0x7f8d77ffe970
      (gdb) p *server
      $4 = {pipeline = {requests = {first = 0x0, last = 0x0}, parent = 0x112c138, flush_start = 0, index = 0,
      ctxqueued = {first = 0x0, last = 0x0}, buf_done_callback = 0, nbmgr = {sendq = {
      pending = {first = 0x0, last = 0x0}, pdus = {first = 0x0, last = 0x0},
      last_requested = 0x0, last_offset = 0, pdu_offset = 0, elempool = {active = {first = 0x0, last = 0x0},
      avail = {first = 0x0, last = 0x0}, basealloc = 0, maxblocks = 0, curblocks = 0, cacheblocks = 0x0, ncacheblocks = 0,
      mgr = 0x0}}, datapool = {active = {first = 0x0, last = 0x0}, avail = {first = 0x0, last = 0x0}, basealloc = 0,
      maxblocks = 0, curblocks = 0, cacheblocks = 0x0, ncacheblocks = 0, mgr = 0x0}, settings = {sndq_cacheblocks = 0,
      sndq_basealloc = 0, dea_cacheblocks = 0, dea_basealloc = 0, data_cacheblocks = 0, data_basealloc = 0}}, reqpool = {sendq = {
      pending = {first = 0x0, last = 0x0}, pdus = {first = 0x0, last = 0x0}, last_requested = 0x0, last_offset = 0,
      pdu_offset = 0, elempool = {active = {first = 0x0, last = 0x0}, avail = {first = 0x0, last = 0x0}, basealloc = 0,
      maxblocks = 0, curblocks = 0, cacheblocks = 0x0, ncacheblocks = 0, mgr = 0x0}}, datapool = {
      active = {first = 0x0, last = 0x0}, avail = {first = 0x0, last = 0x0}, basealloc = 0, maxblocks = 0,
      curblocks = 0, cacheblocks = 0x0, ncacheblocks = 0, mgr = 0x0}, settings = {sndq_cacheblocks = 0,
      sndq_basealloc = 0, dea_cacheblocks = 0, dea_basealloc = 0, data_cacheblocks = 0, data_basealloc = 0}}},
      datahost = 0x0, viewshost = 0x0, resthost = 0x0, instance = 0x112c130,
      settings = 0x0, state = 0, compsupport = 0, io_timer = 0x0, connctx = 0x0, connreq = {type = 0,
      u = {cs = 0x0, preq = 0x0, p_generic = 0x0}, dtor = 0}, curhost = 0x0}
      (gdb) p server->curhost
      $5 = (lcb_host_t *) 0x0

      With server->curhost being NULL, retrieving the host and port number for the sprintf() call cannot work, which explains the segmentation fault.
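
      The kind of guard that seems to be missing could look roughly like the sketch below. It only illustrates the idea and is not the actual fix in libcouchbase: example_host_t, example_server_t and format_endpoint() are hypothetical names, and the real mc_SERVER and lcb_host_t layouts may differ. The sketch checks for a NULL host before formatting and uses snprintf() so the buffer cannot overflow.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the library's host and server types;
 * the field names and sizes are illustrative assumptions only. */
typedef struct {
    char host[256];
    char port[8];
} example_host_t;

typedef struct {
    const example_host_t *curhost; /* NULL when no endpoint is known */
} example_server_t;

/* Write "host:port" into buf, or a placeholder when the endpoint is unknown.
 * Returns 1 if a real endpoint was written, 0 otherwise. */
static int format_endpoint(const example_server_t *server, char *buf, size_t nbuf)
{
    if (buf == NULL || nbuf == 0) {
        return 0;
    }
    if (server == NULL || server->curhost == NULL) {
        /* No host information: never dereference, report a placeholder. */
        snprintf(buf, nbuf, "%s", "<unknown>");
        return 0;
    }
    /* snprintf bounds the write to nbuf bytes, unlike sprintf. */
    snprintf(buf, nbuf, "%s:%s", server->curhost->host, server->curhost->port);
    return 1;
}
```

      A stats error handler written this way could still invoke the user's callback with the error code and the placeholder endpoint instead of crashing when the server object carries no host.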

          People

            mnunberg Mark Nunberg (Inactive)
            penacho Robert Groenenberg