  1. Couchbase C client library libcouchbase
  2. CCBC-1201

LCB crash following CB server restart for CB versions <6.5

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • .future
    • 3.0.0
    • None
    • 1

    Description

      Running against Couchbase Server 6.0.3 and 5.5.6 produces crashes that do not occur with 6.5 (or 7.0 builds).

      The error occurs in a Hybrid test (continually running KV sets/gets and View queries) during a Service restart scenario, in which we stop the CB server on each node and then start them all again after 5 seconds. The top-level errors reported on Jenkins are:

      03:19:49 [362.67 INFO] (SDKD log:137) SDK version changeset 79e57914fc713e0eb96c583076576a3e44d2fe1a
      *** Error in `/root/jenkins/workspace/sdk-lcb-situational/sdk-lcb-situational-release/c-sdk-alice-up/sdkd-cpp/sdkd_lcb': corrupted double-linked list: 0x00007f6b54077c60 ***

       

      Running this locally on my Mac, I see a different but probably related error:

      sdkd_lcb(96018,0x70000edcf000) malloc: *** error for object 0x7f9c1962001d: pointer being freed was not allocated

       

      Running with debug, I get the following stack trace:

       

      __pthread_kill 0x00007fff771cd2c2
      pthread_kill 0x00007fff77288bf1
      abort 0x00007fff771376a6
      malloc_vreport 0x00007fff77246077
      malloc_report 0x00007fff77245e38
      mcreq_wipe_packet mcreq.c:276
      mcreq_packet_done mcreq.c:888
      mcreqpktflush_callback(void*, unsigned int, void*) mcreq-flush-inl.h:66
      netbuf_end_flush2 netbuf.c:655
      mcreq_flush_done_ex(mc_pipeline_st*, unsigned int, unsigned int, unsigned long long) mcreq-flush-inl.h:93
      mcreq_flush_done(mc_pipeline_st*, unsigned int, unsigned int) mcreq-flush-inl.h:103
      lcb::Server::finalize_errored_ctx() mcserver.cc:1153
      lcb::Server::start_errored_ctx(lcb::Server::State) mcserver.cc:1122
      lcb::Server::socket_failed(lcb_STATUS) mcserver.cc:1062
      on_error(lcbio_CTX*, lcb_STATUS) mcserver.cc:1048
      err_handler ctx.c:44
      timer_callback timer.c:43
      event_process_active_single_queue 0x000000010a660c4a
      event_base_loop 0x000000010a65de76
      ::lcb_wait(lcb_INSTANCE *, lcb_WAITFLAGS) wait.cc:109
      CBSdkd::Handle::postsubmit(CBSdkd::ResultSet&, unsigned int) Handle.cpp:427
      CBSdkd::Handle::dsMutate(CBSdkd::Command, CBSdkd::Dataset const&, CBSdkd::ResultSet&, CBSdkd::ResultOptions const&) Handle.cpp:584
      CBSdkd::WorkerDispatch::processRequest(CBSdkd::Request const&) Worker.cpp:177
      CBSdkd::WorkerDispatch::selectLoop() Worker.cpp:308
      CBSdkd::WorkerDispatch::run() Worker.cpp:330
      CBSdkd::new_worker_thread(void*) Control.cpp:70
      _pthread_body 0x00007fff772862eb
      _pthread_start 0x00007fff77289249
      thread_start 0x00007fff7728540d

       

       

      AFAICT this is an LCB issue. Let me know if you know what's going on, or if you need any more details from me.
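
      Both reported messages ("corrupted double-linked list" and "pointer being freed was not allocated") are typical of heap misuse such as releasing the same allocation twice. The trace shows the abort happening while an errored connection context completes pending flushes and releases a packet. The sketch below is only a toy model of that class of bug, not a diagnosis of this ticket and not libcouchbase code: the `packet` type, the `PKT_*` flags, and every function name are hypothetical. It assumes a packet may be freed only after both its flush and its response handling have finished, and shows a guard that turns a repeated "flush done" event into a no-op instead of a second free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical lifecycle flags; names are illustrative, not libcouchbase's. */
enum {
    PKT_FLUSHED = 1 << 0, /* network layer no longer references the buffer */
    PKT_HANDLED = 1 << 1, /* response (or error) was delivered to the caller */
    PKT_DONE = PKT_FLUSHED | PKT_HANDLED
};

typedef struct {
    unsigned flags;
    char *payload;
} packet;

/* Counts how many packets were actually released. */
int packets_freed = 0;

packet *packet_new(size_t payload_size)
{
    packet *p = calloc(1, sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    p->payload = malloc(payload_size);
    if (p->payload == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/*
 * Record one lifecycle event for the packet held in *pp.
 * The packet is freed once both events have been seen, and *pp is reset
 * to NULL so later calls cannot touch freed memory.
 * Returns 1 if this call freed the packet, 0 otherwise.
 */
int packet_mark(packet **pp, unsigned event)
{
    packet *p = *pp;
    if (p == NULL || (p->flags & event) != 0) {
        return 0; /* already released, or duplicate event: ignore */
    }
    p->flags |= event;
    if ((p->flags & PKT_DONE) != PKT_DONE) {
        return 0; /* still waiting for the other event */
    }
    free(p->payload);
    free(p);
    *pp = NULL;
    packets_freed++;
    return 1;
}

/*
 * Replays a hypothetical teardown in which "flush done" is reported twice.
 * Returns how many times the packet was freed, or -1 if allocation failed.
 */
int simulate_error_teardown(void)
{
    int before = packets_freed;
    packet *pkt = packet_new(64);
    if (pkt == NULL) {
        return -1;
    }
    packet_mark(&pkt, PKT_FLUSHED); /* normal flush completion */
    packet_mark(&pkt, PKT_HANDLED); /* socket error fails the request */
    packet_mark(&pkt, PKT_FLUSHED); /* error path reports the flush again */
    return packets_freed - before;
}
```

      Calling `simulate_error_teardown()` frees the packet once even though the flush event arrives twice. Without the duplicate-event check and the pointer reset, the third call would operate on freed memory, which is the kind of misuse the allocator messages above complain about. Building the real test with AddressSanitizer would be one way to see where the offending free actually originates.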

          People

            avsej Sergey Avseyev
            will.broadbelt Will Broadbelt