Couchbase C client library libcouchbase, CCBC-1201

LCB crash following CB server restart for CB versions <6.5

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • .future
    • 3.0.0
    • None
    • 1

    Description

      LCB crashes when running against CB server 6.0.3 and 5.5.6. The crash does not occur with 6.5 (or 7.0 builds).

      The error occurs in a Hybrid test (continually running KV sets/gets and View queries) during a Service restart scenario, where we stop the CB server on each node and then start them all again after 5 seconds. The top-level error reported on Jenkins is:

      03:19:49 [362.67 INFO] (SDKD log:137) SDK version changeset 79e57914fc713e0eb96c583076576a3e44d2fe1a
      *** Error in `/root/jenkins/workspace/sdk-lcb-situational/sdk-lcb-situational-release/c-sdk-alice-up/sdkd-cpp/sdkd_lcb': corrupted double-linked list: 0x00007f6b54077c60 ***

      Running this locally on my Mac I see a different, but probably related error:

      sdkd_lcb(96018,0x70000edcf000) malloc: *** error for object 0x7f9c1962001d: pointer being freed was not allocated

       

      And running with debug I get a stack of:

       

      __pthread_kill 0x00007fff771cd2c2
      pthread_kill 0x00007fff77288bf1
      abort 0x00007fff771376a6
      malloc_vreport 0x00007fff77246077
      malloc_report 0x00007fff77245e38
      mcreq_wipe_packet mcreq.c:276
      mcreq_packet_done mcreq.c:888
      mcreqpktflush_callback(void*, unsigned int, void*) mcreq-flush-inl.h:66
      netbuf_end_flush2 netbuf.c:655
      mcreq_flush_done_ex(mc_pipeline_st*, unsigned int, unsigned int, unsigned long long) mcreq-flush-inl.h:93
      mcreq_flush_done(mc_pipeline_st*, unsigned int, unsigned int) mcreq-flush-inl.h:103
      lcb::Server::finalize_errored_ctx() mcserver.cc:1153
      lcb::Server::start_errored_ctx(lcb::Server::State) mcserver.cc:1122
      lcb::Server::socket_failed(lcb_STATUS) mcserver.cc:1062
      on_error(lcbio_CTX*, lcb_STATUS) mcserver.cc:1048
      err_handler ctx.c:44
      timer_callback timer.c:43
      event_process_active_single_queue 0x000000010a660c4a
      event_base_loop 0x000000010a65de76
      ::lcb_wait(lcb_INSTANCE *, lcb_WAITFLAGS) wait.cc:109
      CBSdkd::Handle::postsubmit(CBSdkd::ResultSet&, unsigned int) Handle.cpp:427
      CBSdkd::Handle::dsMutate(CBSdkd::Command, CBSdkd::Dataset const&, CBSdkd::ResultSet&, CBSdkd::ResultOptions const&) Handle.cpp:584
      CBSdkd::WorkerDispatch::processRequest(CBSdkd::Request const&) Worker.cpp:177
      CBSdkd::WorkerDispatch::selectLoop() Worker.cpp:308
      CBSdkd::WorkerDispatch::run() Worker.cpp:330
      CBSdkd::new_worker_thread(void*) Control.cpp:70
      _pthread_body 0x00007fff772862eb
      _pthread_start 0x00007fff77289249
      thread_start 0x00007fff7728540d

      As far as I can tell, this is an LCB issue. Let me know if you know what's going on, or if you need any more details from me.
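
      For anyone trying to reproduce the server side of this outside Jenkins, the restart step described above (stop the CB server on every node, wait 5 seconds, start them all again) could be sketched roughly as follows. This is only an illustration: the node names, the use of ssh as root, and the systemctl service commands are assumptions, not what the SDKD harness actually runs, and the client workload (KV sets/gets plus View queries) has to be running separately while this executes.

```python
import subprocess
import time

# Hypothetical node list and service commands; the report does not say how
# the test harness actually stops and starts the server on each node.
NODES = ["node1.example", "node2.example"]
STOP_CMD = "systemctl stop couchbase-server"
START_CMD = "systemctl start couchbase-server"


def ssh_command(node, remote_cmd):
    """Build the argv that runs remote_cmd on node over ssh (assumed access)."""
    return ["ssh", f"root@{node}", remote_cmd]


def restart_plan(nodes, pause_secs=5):
    """Return the ordered steps: stop every node, pause, then start every node."""
    steps = [("run", ssh_command(n, STOP_CMD)) for n in nodes]
    steps.append(("sleep", pause_secs))
    steps.extend(("run", ssh_command(n, START_CMD)) for n in nodes)
    return steps


def execute(steps, dry_run=True):
    """Carry out each step; with dry_run=True only print what would happen."""
    for kind, arg in steps:
        if kind == "sleep":
            print(f"sleep {arg}s")
            if not dry_run:
                time.sleep(arg)
        else:
            print(" ".join(arg))
            if not dry_run:
                subprocess.run(arg, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is stopped by accident.
    execute(restart_plan(NODES), dry_run=True)
```

      Calling execute(restart_plan(NODES), dry_run=False) would actually issue the commands. All nodes are stopped before any is started, so the client sees every connection fail at roughly the same time, which matches the socket_failed path in the stack above.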


          People

            avsej Sergey Avseyev
            will.broadbelt Will Broadbelt