Couchbase Server / MB-6010

Memcached constantly crashes (exits with 139) on the destination cluster during XDCR replication from the source cluster, after one node on the source is rebooted.


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 2.0
    • Affects Version/s: 2.0
    • Component/s: couchbase-bucket, XDCR
    • Security Level: Public
    • Labels: None

    Description

      Setup:
      -----------------
      1. Set up unidirectional replication from the source to the destination cluster.
      2. Loaded 2M items on the source; all 2M items were replicated to the destination cluster (OK).
      3. Start a mutating load on the source cluster (the mutations include expirations and deletes).
      4. Reboot one node on the source cluster.
      5. Items are being replicated as expected.
      6. Node on source cluster is warmed up and source cluster looks healthy.
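Step 1 of the setup can be scripted against the cluster's REST interface. The following is a minimal sketch, not the reporter's actual procedure: it only builds the two form-encoded requests (remote cluster reference, then replication) and sends nothing. The source host, bucket name, credentials, and reference name are hypothetical placeholders. The endpoint paths and form fields follow the Couchbase XDCR REST API as documented and should be checked against the server version in use.

```python
from urllib.parse import urlencode


def xdcr_setup_requests(source_host, dest_host, bucket="default",
                        user="Administrator", password="password",
                        ref_name="dest-cluster"):
    """Return (url, form_body) pairs for a unidirectional XDCR setup.

    All argument values are illustrative assumptions; nothing is sent.
    """
    base = f"http://{source_host}:8091"
    # 1) Register the destination cluster as a remote cluster reference.
    remote_ref = urlencode({
        "name": ref_name,
        "hostname": f"{dest_host}:8091",
        "username": user,
        "password": password,
    })
    # 2) Start continuous replication of the bucket to that reference.
    replication = urlencode({
        "fromBucket": bucket,
        "toCluster": ref_name,
        "toBucket": bucket,
        "replicationType": "continuous",
    })
    return [
        (f"{base}/pools/default/remoteClusters", remote_ref),
        (f"{base}/controller/createReplication", replication),
    ]


# "10.0.0.1" is a made-up source node; 10.3.3.34 is the destination node
# named in this report.
for url, body in xdcr_setup_requests("10.0.0.1", "10.3.3.34"):
    print("POST", url)
    print("    ", body)
```

Each pair would be sent as an authenticated POST to a source-cluster node; that part is left out so the sketch can be run without a cluster.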

      Output
      -----------------
      Memcached is crashing continuously on one node of the destination cluster (node 10.3.3.34).
      Memcached also crashed on the other 2 nodes of the destination cluster.
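The exit status 139 in the issue title follows the common shell convention of reporting a signal-terminated process as 128 plus the signal number. A small sketch of that arithmetic is shown below; the helper name is hypothetical, and the signal name printed depends on the platform's signal numbering.

```python
import signal


def describe_exit_status(status):
    """Translate a shell-style exit status into a readable description."""
    if status > 128:
        signum = status - 128  # shells report 128 + N for death by signal N
        try:
            return signal.Signals(signum).name
        except ValueError:
            return f"unknown signal {signum}"
    return f"exit code {status}"


print(describe_exit_status(139))  # on Linux: SIGSEGV (139 - 128 = 11)
```

A segmentation fault is consistent with the backtrace from node 10.3.3.34, where the crashing thread stops inside std::string::length.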

      Stack Trace from node 10.3.3.34
      -----------------
      Thread 2 (Thread 1853):
      #0 0x00007f106b18685c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
      #1 0x00007f1067115a17 in SyncObject::wait (this=0x55b96c0) at syncobject.hh:36
      #2 Dispatcher::run (this=0x55b96c0) at dispatcher.cc:89
      #3 0x00007f106711600b in launch_dispatcher_thread (arg=0x55b9714) at dispatcher.cc:28
      #4 0x00007f106b1819ca in start_thread () from /lib/libpthread.so.0
      #5 0x00007f106aedecdd in clone () from /lib/libc.so.6
      #6 0x0000000000000000 in ?? ()

      Thread 1 (Thread 1852):
      #0 std::string::length (db=<value optimized out>, docinfo=<value optimized out>, ctx=<value optimized out>)
      at /usr/include/c++/4.4/bits/basic_string.h:630
      #1 CouchKVStore::getMultiCb (db=<value optimized out>, docinfo=<value optimized out>, ctx=<value optimized out>)
      at couch-kvstore/couch-kvstore.cc:1720
      #2 0x00007f1066eda5e2 in lookup_callback (rq=<value optimized out>, k=<value optimized out>, v=<value optimized out>) at src/couch_db.c:583
      #3 0x00007f1066ed9859 in btree_lookup_inner (rq=0x7f1065abe410, diskpos=<value optimized out>, current=22, end=26) at src/btree_read.c:73
      #4 0x00007f1066ed975a in btree_lookup_inner (rq=0x7f1065abe410, diskpos=<value optimized out>, current=18, end=41) at src/btree_read.c:47
      #5 0x00007f1066ed975a in btree_lookup_inner (rq=0x7f1065abe410, diskpos=<value optimized out>, current=0, end=121) at src/btree_read.c:47
      #6 0x00007f1066edadb8 in iterate_docinfos (db=0x5572310, sequence=<value optimized out>, numDocs=121, options=<value optimized out>,
      callback=<value optimized out>, ctx=<value optimized out>) at src/couch_db.c:688
      #7 couchstore_docinfos_by_sequence (db=0x5572310, sequence=<value optimized out>, numDocs=121, options=<value optimized out>,
      callback=<value optimized out>, ctx=<value optimized out>) at src/couch_db.c:729
      #8 0x00007f10671b7d63 in CouchKVStore::getMulti (this=<value optimized out>, vb=343, itms=<value optimized out>)
      at couch-kvstore/couch-kvstore.cc:464
      #9 0x00007f10671b1896 in batchWarmupCallback (vbId=<value optimized out>, fetches=<value optimized out>, arg=0x7f1065abe840)
      at couch-kvstore/couch-kvstore.cc:176
      #10 0x00007f1067186de6 in MutationLogHarvester::apply (this=<value optimized out>, arg=<value optimized out>, mlc=<value optimized out>)
      at mutation_log.cc:652
      #11 0x00007f10671b2081 in CouchKVStore::warmup (this=<value optimized out>, lf=<value optimized out>, vbmap=<value optimized out>,
      cb=<value optimized out>, estimate=<value optimized out>) at couch-kvstore/couch-kvstore.cc:1780
      #12 0x00007f1067182480 in Warmup::loadingAccessLog (this=0x55740f0) at warmup.cc:458
      #13 0x00007f1067183262 in Warmup::step (this=0x55740f0, d=..., t=<value optimized out>) at warmup.cc:554
      #14 0x00007f1067184199 in WarmupStepper::callback(Dispatcher&, std::tr1::shared_ptr<Task>) () from /opt/couchbase/lib/memcached/ep.so
      #15 0x00007f10671165cf in Task::run (this=<value optimized out>, d=..., t=<value optimized out>) at dispatcher.hh:139
      #16 0x00007f10671157da in Dispatcher::run (this=0x55b9880) at dispatcher.cc:123
      #17 0x00007f106711600b in launch_dispatcher_thread (arg=0x55dfcc0) at dispatcher.cc:28
      #18 0x00007f106b1819ca in start_thread () from /lib/libpthread.so.0
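To make the call chain of the crashing thread easier to read, the frames can be reduced to their function names. This is a minimal sketch with a hypothetical helper; the regular expression assumes gdb's usual "#N [address in] function (args)" layout and is only checked against a few frames copied from the trace above.

```python
import re

# "#N", an optional "0x... in " address, then the function name up to "(".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+?)\s*\(")


def frame_functions(backtrace):
    """Return (frame_number, function_name) for each frame line found."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:  # continuation lines such as "at file.cc:123" are skipped
            frames.append((int(match.group(1)), match.group(2)))
    return frames


sample = """\
#0 std::string::length (db=<value optimized out>, docinfo=<value optimized out>, ctx=<value optimized out>)
at /usr/include/c++/4.4/bits/basic_string.h:630
#1 CouchKVStore::getMultiCb (db=<value optimized out>, docinfo=<value optimized out>, ctx=<value optimized out>)
at couch-kvstore/couch-kvstore.cc:1720
#8 0x00007f10671b7d63 in CouchKVStore::getMulti (this=<value optimized out>, vb=343, itms=<value optimized out>)
at couch-kvstore/couch-kvstore.cc:464
#12 0x00007f1067182480 in Warmup::loadingAccessLog (this=0x55740f0) at warmup.cc:458
"""

for number, function in frame_functions(sample):
    print(number, function)
```

Read from the bottom up, Thread 1 is in Warmup::loadingAccessLog, goes through CouchKVStore::getMulti and the getMultiCb callback, and stops in std::string::length, so the crash on this node happens while memcached is warming up from the access log.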

      Note: The core files on these nodes are at /opt/couchbase/var/lib/couchbase/

      Please let me know if you need additional information.


          People

            Liang Guo (Inactive)
            Ketaki Gangal (Inactive)