Uploaded image for project: 'Couchbase C client library libcouchbase'
  1. Couchbase C client library libcouchbase
  2. CCBC-1445

Spurious segfaults due to query metrics

    XMLWordPrintable

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.2.0
    • 3.2.1
    • library
    • None
    • 1
    • SDK32: Eventing Mgmt, SDK34: CB++, OBO, Connect Prep

    Description

      It appears that under certain circumstances, lcb_QUERY_HANDLE's are left active after the associated instance is destroyed. This causes the query destructor which indirectly calls the metrics interface to fail and segfault. Interestingly, there is a related bug I've seen (but also cannot reliably reproduce) where executed queries never invoke the callback, even though wireshark shows the full response was received.

      * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x13c)
          frame #0: 0x000000010c40f875 couchbase_impl.node`record_http_op_latency(char const*, char const*, lcb_st*, unsigned long long) [inlined] record_op_latency(op=<unavailable>, svc=<unavailable>, settings=0x0000000000000000, start=<unavailable>) at metrics-internal.cc:41:9 [opt]
         38
         39  	void record_op_latency(const char *op, const char *svc, lcb_settings_st *settings, hrtime_t start)
         40  	{
      -> 41  	    if (settings->op_metrics_enabled && settings->meter) {
         42  	        lcbmetrics_TAG tags[2] = {{METRICS_SVC_TAG_NAME, svc ? svc : ""}, {METRICS_OP_TAG_NAME, op ? op : ""}};
         43  	        auto recorder = settings->meter->value_recorder_(settings->meter, METRICS_OPS_METER_NAME, tags, 2);
         44  	        if (recorder) {
      Target 0: (node) stopped.
      (lldb) bt
      * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x13c)
        * frame #0: 0x000000010c40f875 couchbase_impl.node`record_http_op_latency(char const*, char const*, lcb_st*, unsigned long long) [inlined] record_op_latency(op=<unavailable>, svc=<unavailable>, settings=0x0000000000000000, start=<unavailable>) at metrics-internal.cc:41:9 [opt]
          frame #1: 0x000000010c40f875 couchbase_impl.node`record_http_op_latency(op=0x0000000000000000, svc="query", instance=0x000000010314ed30, start=0) at metrics-internal.cc:62 [opt]
          frame #2: 0x000000010c41e7a2 couchbase_impl.node`lcb_QUERY_HANDLE_::invoke_row(this=0x0000000102f87880, resp=0x00007ffeefbff150, is_last=<unavailable>) at query_handle.cc:223:13 [opt]
          frame #3: 0x000000010c421061 couchbase_impl.node`lcb_QUERY_HANDLE_::~lcb_QUERY_HANDLE_(this=0x0000000102f87880) at query_handle.cc:555:9 [opt]
          frame #4: 0x000000010c4213be couchbase_impl.node`lcb_QUERY_HANDLE_::~lcb_QUERY_HANDLE_() [inlined] lcb_QUERY_HANDLE_::~lcb_QUERY_HANDLE_(this=0x0000000102f87880) at query_handle.cc:552:1 [opt]
          frame #5: 0x000000010c4213b9 couchbase_impl.node`lcb_QUERY_HANDLE_::~lcb_QUERY_HANDLE_(this=0x0000000102f87880) at query_handle.cc:552 [opt]
          frame #6: 0x000000010c40dbd1 couchbase_impl.node`timer_callback(sock=<unavailable>, which=<unavailable>, arg=0x0000000102f67e40) at timer.cc:43:5 [opt]
       
      (lldb) f 1
      frame #1: 0x000000010c40f875 couchbase_impl.node`record_http_op_latency(op=0x0000000000000000, svc="query", instance=0x000000010314ed30, start=0) at metrics-internal.cc:62 [opt]
         59
         60  	void record_http_op_latency(const char *op, const char *svc, lcb_INSTANCE *instance, hrtime_t start)
         61  	{
      -> 62  	    record_op_latency(op, svc, instance->settings, start);
         63  	}
      (lldb) p instance->settings
      (lcb_settings *) $0 = nullptr
       
      (lldb) p *instance
      (lcb_INSTANCE) $2 = {
        cmdq = {
          pipelines = nullptr
          scheds = 0xd000000010318118 ""
          ctxenter = 8
          npipelines = 0
          _npipelines_ex = 0
          seq = 0
          config = nullptr
          cqdata = 0x0000000000000000
          fallback = nullptr
        }
        cookie = 0x0000000000000000
        confmon = nullptr
        mc_nodes = nullptr
        ht_nodes = nullptr
        cur_configinfo = nullptr
        bs_state = nullptr
        callbacks = {
          v3callbacks = {
            [0] = 0x0000000000000000
            [1] = 0x0000000000000000
            [2] = 0x0008000000000000 (0x0008000000000000)
            [3] = 0x0000000000000002 (0x0000000000000002)
            [4] = 0x0000000000000000
            [5] = 0x000000010c3ed460 (couchbase_impl.node`iops_lcb_dtor(lcb_io_opt_st*) at plugin-libuv.c:40)
            [6] = 0x0000000000000000
            [7] = 0x0000000000000059 (0x0000000000000059)
            [8] = 0x000000010c3ed4f0 (couchbase_impl.node`wire_iops2(int, lcb_loopprocs_st*, lcb_timerprocs_st*, lcb_bsdprocs_st*, lcb_evprocs_st*, lcb_completion_procs*, lcb_iomodel_t*) at plugin-libuv.c:630)
            [9] = 0x00000001031252b0 (0x00000001031252b0)
            [10] = 0x0000000000000000
            [11] = 0x0000000000000000
            [12] = 0x0000000000000000
            [13] = 0x0000000000000000
            [14] = 0x0000000000000000
            [15] = 0x0000000000000000
            [16] = 0x0000000000000000
            [17] = 0x0000000000000000
            [18] = 0x0000000000000000
            [19] = 0x0000000000000000
            [20] = 0x0000000000000000
            [21] = 0x0000000000000000
            [22] = 0x0000000000000000
            [23] = 0x0000000000000000
            [24] = 0x0000000000000000
          }
          errmap = 0x0000000000000000
          bootstrap = 0x0000000000000000
          pktfwd = 0x0000000101ce4d40 (node`default_loop_struct)
          pktflushed = 0x0000000100000001 (node`_mh_execute_header + 1)
          open = 0x0000000000000001 (0x0000000000000001)
        }
        kv_timings = nullptr
        pendops = {
          items = {
            [0] = 0x000000010315f460 size=4294967295
            [1] = 0x0000000101ce4d40 size=4294967295
            [2] = 0x0000000000000009 size=0
          }
          count = 0
        }
        wait = 53045520
        memd_sockpool = 0x000000010313b320
        http_sockpool = nullptr
        last_error = LCB_SUCCESS
        settings = nullptr
        iotable = nullptr
        retryq = nullptr
        scratch = ""
        vbguess = 0x000000010c3e93a0
        n1ql_cache = 0x000000010323cba8
        dcpinfo = 0x0000000101ce4f68
        dtor_timer = 0x100000001031667d
        btype = 0x1031149c
        collcache = 0xd000000010316b30
        crypto = 0x000000010c3eae10 size=4294967295
        deferred_operations = 0x000000010314a6b0 size=4294967295
      }
      

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            brett19 Brett Lawson created issue -
            avsej Sergey Avseyev made changes -
            Field Original Value New Value
            Status New [ 10003 ] Open [ 1 ]
            avsej Sergey Avseyev made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            avsej Sergey Avseyev made changes -
            Attachment main.c [ 155008 ]
            avsej Sergey Avseyev made changes -
            Sprint SDK32: Eventing Mgmt [ 1590 ]
            ray.cardillo Ray Cardillo made changes -
            Sprint SDK32: Eventing Mgmt [ 1590 ] SDK32: Eventing Mgmt, SDK34: CB++, OBO [ 1590, 1717 ]
            avsej Sergey Avseyev made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            brett19 Brett Lawson made changes -
            Link This issue blocks JSCBC-915 [ JSCBC-915 ]

            People

              avsej Sergey Avseyev
              brett19 Brett Lawson
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty