Couchbase Server / MB-40382

[Magma] - Memcached crashes seen on rebalance + collections data load + magma as a backend

    Description

      Script to reproduce (non-durable tests; only the first 9 tests were run)

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini -p rerun=False,crash_warning=True,quota_percent=95 -c conf/collections/collections_rebalance.conf -m rest'
      

      memcached CRITICAL error

      [ns_server:info,2020-07-09T19:35:25.033-07:00,babysitter_of_ns_1@cb.local:<0.115.0>:ns_port_server:log:224]memcached<0.115.0>: 2020-07-09T19:35:24.831504-07:00 CRITICAL *** Fatal error encountered during exception handling ***
      memcached<0.115.0>: 2020-07-09T19:35:24.831613-07:00 CRITICAL Caught unhandled std::exception-derived exception. what(): GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/magma-kvstore/magma-kvstore.cc: 1841
      memcached<0.115.0>: terminate called after throwing an instance of 'gsl::fail_fast'
      memcached<0.115.0>:   what():  GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/magma-kvstore/magma-kvstore.cc: 1841
       
      [ns_server:info,2020-07-09T19:35:25.747-07:00,babysitter_of_ns_1@cb.local:<0.115.0>:ns_port_server:log:224]memcached<0.115.0>: terminate called recursively
      memcached<0.115.0>: 2020-07-09T19:35:25.536226-07:00 CRITICAL *** Fatal error encountered during exception handling ***
      memcached<0.115.0>: 2020-07-09T19:35:25.543791-07:00 CRITICAL Breakpad caught a crash (Couchbase version 7.0.0-2577). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/d2574a61-6af0-49e8-60e44086-58f8563d.dmp before terminating.
      memcached<0.115.0>: 2020-07-09T19:35:25.543821-07:00 CRITICAL Stack backtrace of crashed thread:
      memcached<0.115.0>: 2020-07-09T19:35:25.562777-07:00 WARNING 3209: Slow operation. {"cid":"127.0.0.1:40751/0","duration":"711 ms","trace":"request=181989532451582:711797","command":"STAT","peer":"127.0.0.1:40751","bucket":"default","packet":{"bodylen":0,"cas":0,"datatype":"raw","extlen":0,"key":"<ud></ud>","keylen":0,"magic":"ClientRequest","opaque":0,"opcode":"STAT","vbucket":0}}
      memcached<0.115.0>: 2020-07-09T19:35:25.548533-07:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x13ce4d]
      memcached<0.115.0>: 2020-07-09T19:35:25.589556-07:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x1523ea]
      memcached<0.115.0>: 2020-07-09T19:35:25.589588-07:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x152728]
      memcached<0.115.0>: 2020-07-09T19:35:25.589613-07:00 CRITICAL     /lib64/libpthread.so.0() [0x7fa43481e000+0xf630]
      memcached<0.115.0>: 2020-07-09T19:35:25.589682-07:00 CRITICAL     /lib64/libc.so.6(gsignal+0x37) [0x7fa434450000+0x36387]
      memcached<0.115.0>: 2020-07-09T19:35:25.589742-07:00 CRITICAL     /lib64/libc.so.6(abort+0x148) [0x7fa434450000+0x37a78]
      memcached<0.115.0>: 2020-07-09T19:35:25.589831-07:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7fa434f53000+0x91195]
      memcached<0.115.0>: 2020-07-09T19:35:25.589857-07:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x14d8e2]
      memcached<0.115.0>: 2020-07-09T19:35:25.589941-07:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fa434f53000+0x8ef86]
      memcached<0.115.0>: 2020-07-09T19:35:25.590023-07:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fa434f53000+0x8efd1]
      memcached<0.115.0>: 2020-07-09T19:35:25.590103-07:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fa434f53000+0x8f213]
      memcached<0.115.0>: 2020-07-09T19:35:25.590134-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x221945]
      memcached<0.115.0>: 2020-07-09T19:35:25.590154-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x2222d2]
      memcached<0.115.0>: 2020-07-09T19:35:25.590175-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0xed66f]
      memcached<0.115.0>: 2020-07-09T19:35:25.590191-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0xed974]
      memcached<0.115.0>: 2020-07-09T19:35:25.590211-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x1a19f8]
      memcached<0.115.0>: 2020-07-09T19:35:25.590229-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x14ded3]
      memcached<0.115.0>: 2020-07-09T19:35:25.590244-07:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x14467f]
      memcached<0.115.0>: 2020-07-09T19:35:25.590262-07:00 CRITICAL     /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7fa4371d8000+0x10777]
      memcached<0.115.0>: 2020-07-09T19:35:25.590278-07:00 CRITICAL     /lib64/libpthread.so.0() [0x7fa43481e000+0x7ea5]
      memcached<0.115.0>: 2020-07-09T19:35:25.590359-07:00 CRITICAL     /lib64/libc.so.6(clone+0x6d) [0x7fa434450000+0xfe8dd]
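
      The bracketed values in the backtrace above appear to be a module load base plus an offset. Adding the two gives the absolute frame address, which can then be matched against a gdb backtrace of the same dump. A minimal Python sketch, assuming the `[0xBASE+0xOFFSET]` layout seen in these log lines; the name `frame_address` is hypothetical and not part of any Couchbase tooling:

```python
import re

# Matches the trailing "[0xBASE+0xOFFSET]" part of a backtrace log line.
FRAME_RE = re.compile(r"\[0x([0-9a-fA-F]+)\+0x([0-9a-fA-F]+)\]")


def frame_address(log_line):
    """Return base + offset for a backtrace log line, or None if absent."""
    match = FRAME_RE.search(log_line)
    if match is None:
        return None
    base, offset = (int(group, 16) for group in match.groups())
    return base + offset


# One of the libep.so frames copied from the log above.
line = ("memcached<0.115.0>: 2020-07-09T19:35:25.590134-07:00 CRITICAL     "
        "/opt/couchbase/bin/../lib/libep.so() [0x7fa4386b6000+0x221945]")
print(hex(frame_address(line)))  # 0x7fa4388d7945
```

      The result, 0x7fa4388d7945, is the address gdb reports for the fail_fast_assert frame (#7) in the backtrace of this dump.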
      

      Full backtrace (bt full) on 172.23.98.196 from d2574a61-6af0-49e8-60e44086-58f8563d.dmp

      (gdb) bt full
      #0  0x00007fa434486387 in raise () from /lib64/libc.so.6
      No symbol table info available.
      #1  0x00007fa434487a78 in abort () from /lib64/libc.so.6
      No symbol table info available.
      #2  0x00007fa434fe4195 in __gnu_cxx::__verbose_terminate_handler () at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/vterminate.cc:95
              terminating = false
              t = <optimized out>
      #3  0x000000000054d8e2 in backtrace_terminate_handler () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/utilities/terminate_handler.cc:86
      No locals.
      #4  0x00007fa434fe1f86 in __cxxabiv1::__terminate (handler=<optimized out>) at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
      No locals.
      #5  0x00007fa434fe1fd1 in std::terminate () at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
      No locals.
       
      #6  0x00007fa434fe2213 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x7fa3d4057460, tinfo=tinfo@entry=0x7fa438c145b0 <typeinfo for gsl::fail_fast>, dest=dest@entry=0x7fa43870dda0 <gsl::fail_fast::~fail_fast()>)
          at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_throw.cc:93
              globals = <optimized out>
              header = 0x7fa3d40573e0
      #7  0x00007fa4388d7945 in fail_fast_assert (message=0x7fa43897a318 "GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/magma-kvstore/magma-kvstore.cc: 1841", 
          cond=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/third_party/gsl-lite/include/gsl/gsl-lite.h:473
      No locals.
      #8  MagmaKVStore::compactDBInternal (this=this@entry=0x7fa3f607be00, ctx=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/magma-kvstore/magma-kvstore.cc:1841
              handle = {_M_t = {
                  _M_t = {<std::_Tuple_impl<0, KVFileHandle*, std::default_delete<KVFileHandle> >> = {<std::_Tuple_impl<1, std::default_delete<KVFileHandle> >> = {<std::_Head_base<1, std::default_delete<KVFileHandle>, true>> = {<std::default_delete<KVFileHandle>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, KVFileHandle*, false>> = {_M_head_impl = 0x7fa3d9a75b30}, <No data fields>}, <No data fields>}}}
              stats = {<std::_Optional_base<Collections::VB::PersistedStats>> = {_M_payload = {{_M_empty = {<No data fields>}, _M_payload = {itemCount = 0, highSeqno = 0, diskSize = 0}}, 
                    _M_engaged = false}}, <std::_Enable_copy_move<true, true, true, true, std::optional<Collections::VB::PersistedStats> >> = {<No data fields>}, <No data fields>}
              collectionKey = {<DocKeyInterface<StoredDocKeyT<std::allocator> >> = {<No data fields>}, keydata = {static npos = 18446744073709551615, 
                  _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fa3f97efd80 "\001"}, _M_string_length = 14, {_M_local_buf = "\001\000\000_collection\000\026", 
                    _M_allocated_capacity = 7812741926167248897}}}
              key = {keydata = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fa3f97efda0 "\001"}, _M_string_length = 14, {
                    _M_local_buf = "\001\000\000_collection\000", _M_allocated_capacity = 7812741926167248897}}}
              keyString = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fa3f97efd40 ""}, _M_string_length = 1, {
                  _M_local_buf = "\000\000ombstoX\332N6\244\177\000", _M_allocated_capacity = 8031170902088417280}}
              keySlice = {data = 0x7fa3f97efd80 "\001", len = 14}
              __for_range = @0x7fa3f97efcd0: {<std::_Vector_base<Collections::KVStore::DroppedCollection, std::allocator<Collections::KVStore::DroppedCollection> >> = {
                  _M_impl = {<std::allocator<Collections::KVStore::DroppedCollection>> = {<__gnu_cxx::new_allocator<Collections::KVStore::DroppedCollection>> = {<No data fields>}, <No data fields>}, _M_start = 0x7fa3da8ada50, 
                    _M_finish = 0x7fa3da8ada80, _M_end_of_storage = 0x7fa3da8ada80}}, <No data fields>}
              leb128 = {static maxSize = 5, encodedData = {_M_elems = "\002\000\000\000"}, encodedSize = 1 '\001'}
              prepareSlice = {data = 0x7fa3f97efc92 "\002", len = 1}
              vbid = {vbid = 834}
              dropped = {<std::_Vector_base<Collections::KVStore::DroppedCollection, std::allocator<Collections::KVStore::DroppedCollection> >> = {
                  _M_impl = {<std::allocator<Collections::KVStore::DroppedCollection>> = {<__gnu_cxx::new_allocator<Collections::KVStore::DroppedCollection>> = {<No data fields>}, <No data fields>}, _M_start = 0x7fa3da8ada50, 
                    _M_finish = 0x7fa3da8ada80, _M_end_of_storage = 0x7fa3da8ada80}}, <No data fields>}
              diskState = {status = {s = {_M_t = {
                      _M_t = {<std::_Tuple_impl<0, magma::Status::state*, std::default_delete<magma::Status::state> >> = {<std::_Tuple_impl<1, std::default_delete<magma::Status::state> >> = {<std::_Head_base<1, std::default_delete<magma::Status::state>, true>> = {<std::default_delete<magma::Status::state>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, magma::Status::state*, false>> = {
                            _M_head_impl = 0x0}, <No data fields>}, <No data fields>}}}}, vbstate = {static CurrentVersion = 3, maxDeletedSeqno = {counter = {_M_elems = "\001\000\000\000\000"}}, highSeqno = 25, purgeSeqno = 0, 
                  lastSnapStart = 25, lastSnapEnd = 25, maxCas = 1594348523498504192, hlcCasEpochSeqno = 1, mightContainXattrs = false, supportsNamespaces = true, version = 3, persistedCompletedSeqno = 0, persistedPreparedSeqno = 0, 
                  highPreparedSeqno = 0, maxVisibleSeqno = 25, onDiskPrepares = 0, checkpointType = Memory, transition = {failovers = {static npos = 18446744073709551615, 
                      _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fa3da8af160 <Address 0x7fa3da8af160 out of bounds>}, _M_string_length = 32, {
                        _M_local_buf = " \000\000\000\000\000\000\000\220\377~\371\243\177\000", _M_allocated_capacity = 32}}, replicationTopology = {m_type = nlohmann::detail::array, m_value = {object = 0x7fa3c968a340, 
                        array = 0x7fa3c968a340, string = 0x7fa3c968a340, boolean = 64, number_integer = 140341435474752, number_unsigned = 140341435474752, number_float = 6.9337881956120976e-310}}, state = vbucket_state_active}}, 
                kvstoreRev = 1}
              compactionCB = {__this = 0x7fa3f607be00, __ctx = {<std::__shared_ptr<compaction_ctx, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<compaction_ctx, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, 
                    _M_ptr = <optimized out>, _M_refcount = {_M_pi = 0x7fa3ea313580}}, <No data fields>}}
              collectionItemsDropped = 0
              localDbReqs = {<std::_Vector_base<MagmaKVStore::MagmaLocalReq, std::allocator<MagmaKVStore::MagmaLocalReq> >> = {
                  _M_impl = {<std::allocator<MagmaKVStore::MagmaLocalReq>> = {<__gnu_cxx::new_allocator<MagmaKVStore::MagmaLocalReq>> = {<No data fields>}, <No data fields>}, _M_start = 0x7fa3c8f3bf60, _M_finish = 0x7fa3c8f3bfa8, 
                    _M_end_of_storage = 0x7fa3c8f3bfa8}}, <No data fields>}
              status = {s = {_M_t = {
                    _M_t = {<std::_Tuple_impl<0, magma::Status::state*, std::default_delete<magma::Status::state> >> = {<std::_Tuple_impl<1, std::default_delete<magma::Status::state> >> = {<std::_Head_base<1, std::default_delete<magma::Status::state>, true>> = {<std::default_delete<magma::Status::state>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, magma::Status::state*, false>> = {_M_head_impl = 0x0}, <No data fields>}, <No data fields>}}}}
      #9  0x00007fa4388d82d2 in MagmaKVStore::compactDB (this=0x7fa3f607be00, ctx=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/magma-kvstore/magma-kvstore.cc:1748
              res = <optimized out>
      #10 0x00007fa4387a366f in EPBucket::compactInternal (this=0x7fa3f607f000, config=..., purgeSeqno=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/ep_bucket.cc:1142
              ctx = {<std::__shared_ptr<compaction_ctx, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<compaction_ctx, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7fa3ea313590, _M_refcount = {
                    _M_pi = 0x7fa3ea313580}}, <No data fields>}
              shard = <optimized out>
              result = <optimized out>
              vb = {<std::__shared_ptr<VBucket, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<VBucket, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7fa433104a00, _M_refcount = {
                    _M_pi = 0x7fa438831745 <KVShard::getBucket(Vbid) const+341>}}, <No data fields>}
      #11 0x00007fa4387a3974 in EPBucket::doCompact (this=0x7fa3f607f000, config=..., purgeSeqno=0, cookie=0x0) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/ep_bucket.cc:1192
              err = ENGINE_SUCCESS
              storeProp = {efficientVBDump = StorageProperties::Yes, efficientVBDeletion = StorageProperties::Yes, persistedDeletions = StorageProperties::No, efficientGet = StorageProperties::Yes, concWriteCompact = StorageProperties::Yes, 
                byIdScan = StorageProperties::No}
              vbid = {vbid = 834}
      #12 0x00007fa4388579f8 in CompactTask::run (this=0x7fa3eb67f890) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/tasks.cc:67
              phosphor_internal_category_enabled_57 = {_M_b = {_M_p = 0x0}, static is_always_lock_free = <error reading variable: No global symbol "std::atomic<std::atomic<phosphor::CategoryStatus> const*>::is_always_lock_free".>}
              phosphor_internal_category_enabled_temp_57 = <optimized out>
              phosphor_internal_tpi_57 = {category = 0x29d033 <Address 0x29d033 out of bounds>, name = 0x2bba53 <Address 0x2bba53 out of bounds>, type = phosphor::Complete, argument_names = {_M_elems = {
                    0x2bba5f <Address 0x2bba5f out of bounds>, 0x2ca70b <Address 0x2ca70b out of bounds>}}, argument_types = {_M_elems = {phosphor::is_uint, phosphor::is_none}}}
              phosphor_internal_guard_57 = {tpi = 0x7fa438c13340 <CompactTask::run()::phosphor_internal_tpi_57>, enabled = true, arg1 = 834, arg2 = {<No data fields>}, start = {__d = {__r = 181988680682066}}}
      #13 0x00007fa438803ed3 in GlobalTask::execute (this=0x7fa3eb67f890) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/globaltask.cc:73
              guard = {previous = 0x0}
      #14 0x00007fa4387fa67f in ExecutorThread::run (this=0x7fa4331f4500) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/executorthread.cc:188
              curTaskDescr = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
                  _M_p = 0x7fa42a320000 <Address 0x7fa42a320000 out of bounds>}, _M_string_length = 19, {_M_local_buf = "\023\000\000\000\000\000\000\000\215q\000\000\000\000\000", _M_allocated_capacity = 19}}
              woketime = <optimized out>
              scheduleOverhead = <optimized out>
              again = <optimized out>
              runtime = <optimized out>
              q = <optimized out>
              tick = 94 '^'
              guard = {engine = 0x0}
      #15 0x00007fa4371e8777 in run (this=0x7fa433e4a070) at /home/couchbase/jenkins/workspace/couchbase-server-unix/platform/src/cb_pthreads.cc:58
      No locals.
      #16 platform_thread_wrap (arg=0x7fa433e4a070) at /home/couchbase/jenkins/workspace/couchbase-server-unix/platform/src/cb_pthreads.cc:71
              context = {_M_t = {
                  _M_t = {<std::_Tuple_impl<0, CouchbaseThread*, std::default_delete<CouchbaseThread> >> = {<std::_Tuple_impl<1, std::default_delete<CouchbaseThread> >> = {<std::_Head_base<1, std::default_delete<CouchbaseThread>, true>> = {<std::default_delete<CouchbaseThread>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, CouchbaseThread*, false>> = {_M_head_impl = 0x7fa433e4a070}, <No data fields>}, <No data fields>}}}
      #17 0x00007fa434825ea5 in start_thread () from /lib64/libpthread.so.0
      No symbol table info available.
      #18 0x00007fa43454e8dd in clone () from /lib64/libc.so.6
      No symbol table info available.
      (gdb) 
      

      Many core dumps show the same stack trace. One of them is d2574a61-6af0-49e8-60e44086-58f8563d.dmp on 172.23.98.196.
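
      One way to check that several dumps share a stack trace is to compare the function names in each gdb backtrace. A minimal Python sketch, assuming plain `bt` text as input; the helper name `stack_signature` is hypothetical, and the sample frames are abbreviated from the backtrace above (arguments and paths shortened):

```python
import re

# Captures the function name of each gdb frame line, with or without
# the leading "0xADDR in " part.
FRAME_RE = re.compile(r"^\s*#\d+\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(",
                      re.MULTILINE)


def stack_signature(bt_text, depth=None):
    """Return the frame function names, top first, as a hashable tuple."""
    names = FRAME_RE.findall(bt_text)
    return tuple(names[:depth])


sample = """\
#7  0x00007fa4388d7945 in fail_fast_assert (message=..., cond=<optimized out>) at gsl-lite.h:473
#8  MagmaKVStore::compactDBInternal (this=..., ctx=...) at magma-kvstore.cc:1841
#9  0x00007fa4388d82d2 in MagmaKVStore::compactDB (this=..., ctx=...) at magma-kvstore.cc:1748
"""
print(stack_signature(sample))
```

      Two dumps whose signatures compare equal have the same sequence of frames. Limiting `depth` compares only the top of the stack.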

            People

              ritesh.agarwal Ritesh Agarwal
              Balakumaran.Gopal Balakumaran Gopal