Couchbase Server / MB-58352

getVBucketHLCNow as used by magma callback is not safe for shutdown/vbucket state changes

Details

    Description

      QE-TEST:

      ./sequoia -client 172.23.104.254:2375 -provider file:centos_third_cluster.yml -test tests/integration/7.6/test_7.6.yml -scope tests/integration/7.6/scope_7.6_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.6.0-1394 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      The core dump below was found on node 172.23.108.141. The first segfault occurred at 10:20 am. Dump name: a68938d3-cd55-41c6-994a9381-5641b8e9.dmp
      BackTrace:

      (gdb) bt
      #0  0x00000000007b5eaf in load (__m=std::memory_order_seq_cst, this=0xb80)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/hlc.cc:20
      #1  HLCT<std::chrono::_V2::system_clock>::peekHLC (this=0xb80)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/hlc.cc:27
      #2  0x000000000077bb74 in getHLCNow (this=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/vbucket.h:301
      #3  EventuallyPersistentEngine::getVBucketHlcNow(Vbid) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/ep_engine.cc:1773
      #4  0x000000000099a78f in getHistoryTimeNow (kvid=<optimized out>, engine=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/kvstore/magma-kvstore/magma-kvstore.cc:194
      #5  operator() (kvid=<optimized out>, __closure=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/kvstore/magma-kvstore/magma-kvstore.cc:631
      #6  __invoke_impl<std::chrono::duration<long int>, MagmaKVStore::MagmaKVStore(MagmaKVStoreConfig&)::<lambda(magma::Magma::KVStoreID)>&, short unsigned int> (__f=...) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60
      #7  __invoke_r<std::chrono::duration<long int>, MagmaKVStore::MagmaKVStore(MagmaKVStoreConfig&)::<lambda(magma::Magma::KVStoreID)>&, short unsigned int> (__fn=...) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:113
      #8  std::_Function_handler<std::chrono::duration<long int, std::ratio<1, 1> >(short unsigned int), MagmaKVStore::MagmaKVStore(MagmaKVStoreConfig&)::<lambda(magma::Magma::KVStoreID)> >::_M_invoke(const std::_Any_data &, <unknown type in /usr/lib/debug/opt/couchbase/bin/memcached-7.6.0-1394.x86_64.debug, CU 0x7bab33c, DIE 0x7d44256>) (__functor=..., __args#0=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:291
      #9  0x0000000000a6c078 in operator() (__args#0=<optimized out>, this=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:248
      #10 canTimeRetainHistory (kvid=<optimized out>, time=..., this=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:1950
      #11 operator() (time=..., __closure=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:306
      #12 __invoke_impl<bool, magma::Magma::Impl::createKVStore(magma::Magma::KVStoreID, magma::Magma::KVStoreRevision, magma::KVStoreHandle&)::<lambda()>::<lambda(std::chrono::seconds)>&, std::chrono::duration<long int, std::ratio<1, 1> > > (__f=...) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60
      #13 __invoke_r<bool, magma::Magma::Impl::createKVStore(magma::Magma::KVStoreID, magma::Magma::KVStoreRevision, magma::KVStoreHandle&)::<lambda()>::<lambda(std::chrono::seconds)>&, std::chrono::duration<long int, std::ratio<1, 1> > > (__fn=...) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:113
      #14 std::_Function_handler<bool(std::chrono::duration<long int, std::ratio<1, 1> >), magma::Magma::Impl::createKVStore(magma::Magma::KVStoreID, magma::Magma::KVStoreRevision, magma::KVStoreHandle&)::<lambda()>::<lambda(std::chrono::seconds)> >::_M_invoke(const std::_Any_data &, <unknown type in /usr/lib/debug/opt/couchbase/bin/memcached-7.6.0-1394.x86_64.debug, CU 0x8f64add, DIE 0x9097cf1>) (__functor=..., __args#0=<optimized out>)
          at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:291
      #15 0x0000000000b5e503 in operator() (__args#0=..., this=0x7f0c831fe890) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:248
      #16 magma::LSMTree::recomputeHistorySSTableState(std::shared_ptr<magma::TreeSnapshot>) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/lsm_tree.cc:2110
      #17 0x0000000000b5fc33 in magma::LSMTree::swapTreeSnapshotAndRecomputeHistory(std::shared_ptr<magma::TreeSnapshot>) ()
          at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:100
      #18 0x0000000000b5ffc0 in magma::LSMTree::resetAppendWriter() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/lsm_tree.h:641
      #19 0x0000000000b6018d in magma::LSMTree::resetTree (this=0x7f0c831fe009, this@entry=0x7f0c831fe010)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/lsm_tree.cc:330
      #20 0x0000000000b6e257 in magma::LSMTree::Close() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/lsm_tree.cc:305
      #21 0x0000000000b6ea25 in magma::LSMTree::~LSMTree (this=0x7f0c831fe010, __in_chrg=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/lsm_tree.cc:292
      #22 0x00000000005b0e9a in _M_release (this=0x7f0c831fe000) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:70
       
      #23 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f0c831fe000)
          at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:151
      #24 0x0000000000ae36c9 in ~__shared_count (this=<optimized out>, __in_chrg=<optimized out>)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:808
      #25 ~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:1183
      #26 reset (this=0x7f0c8309f218) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:1301
      #27 magma::KVStore::~KVStore() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:808
      #28 0x00000000005b0e9a in _M_release (this=0x7f0c8309f000) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:70
      #29 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f0c8309f000)
          at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:151
      #30 0x0000000000ab7b77 in ~__shared_count (this=0x7f0cdbff4598, __in_chrg=<optimized out>)
          at /opt/gcc-10.2.0/include/c++/10.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:779
      #31 ~__shared_ptr (this=0x7f0cdbff4590, __in_chrg=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:1183
      #32 ~shared_ptr (this=0x7f0cdbff4590, __in_chrg=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr.h:121
      #33 magma::KVStoreSet::KVStoreInstance::Destroy(unsigned int, std::function<void ()>) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore_set.cc:235
      #34 0x0000000000ab80de in magma::KVStoreSet::DestroyKVStore(unsigned short, unsigned int) () at /opt/gcc-10.2.0/include/c++/10.2.0/new:175
      #35 0x0000000000a73571 in magma::Magma::Impl::DeleteKVStore(unsigned short, unsigned int) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:354
      #36 0x0000000000a73920 in magma::Magma::DeleteKVStore (this=<optimized out>, kvID=kvID@entry=994, kvsRev=kvsRev@entry=1)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:371
      #37 0x00000000009c6f51 in MagmaMemoryTrackingProxy::DeleteKVStore(unsigned short, unsigned int) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/kvstore/magma-kvstore/magma-memory-tracking-proxy.cc:215
      #38 0x00000000009a1261 in MagmaKVStore::delVBucket(Vbid, std::unique_ptr<KVStoreRevision, std::default_delete<KVStoreRevision> >) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/include/memcached/vbucket.h:62
       
      #39 0x00000000008236d5 in VBucketMemoryAndDiskDeletionTask::run() ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/vbucket.h:433
      #40 0x0000000000bc8f8d in GlobalTask::execute(std::basic_string_view<char, std::char_traits<char> >) ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/globaltask.cc:104
       
       
       
      #41 0x0000000000bc71ba in operator() (__closure=0x7f0cdbff4e00)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:309
      #42 folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Data&) (p=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/server_build/tlm/deps/folly.exploded/include/folly/Function.h:363
      #43 0x0000000000bca116 in operator() (this=0x7f0cdbff4e00)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
      #44 CancellableCPUExecutor::add(GlobalTask*, folly::Function<void ()>)::{lambda()#1}::operator()() const ()
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
      #45 0x0000000000d2f0af in operator() (this=0x7f0cdbff50b0)
          at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/ThreadPoolExecutor.cpp:98
      #46 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
          at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/ThreadPoolExecutor.cpp:98
      #47 0x0000000000d17da7 in folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
          at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/CPUThreadPoolExecutor.cpp:306
      #48 0x0000000000d31c4a in __invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__f=<optimized out>, __t=<optimized out>, __f=<optimized out>,
          __t=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:100
      #49 __invoke<void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__fn=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95
      #50 __call<void, 0, 1> (__args=<optimized out>, this=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/functional:416
      #51 operator()<> (this=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/functional:499
      #52 folly::detail::function::FunctionTraits<void ()>::callSmall<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) (p=...)
          at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:363
      #53 0x0000000000bc3f54 in operator() (this=0x7f0d401e0300)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
      #54 operator() (__closure=0x7f0d401e0300) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
      #55 folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...)
          at /home/couchbase/jenkins/workspace/couchbase-server-unix/server_build/tlm/deps/folly.exploded/include/folly/Function.h:377
      #56 0x00007f0d41d36d40 in execute_native_thread_routine () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
      #57 0x00007f0d42738ea5 in start_thread () from /lib64/libpthread.so.0
      #58 0x00007f0d4147fb0d in clone () from /lib64/libc.so.6
      

      bt full
      https://gist.github.com/ankushsharma29/6bb59146faa83bb3f2e5f1bbd8103273
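
      The trace shows the magma history-time callback (getHistoryTimeNow, frame #4) running from inside DeleteKVStore for kvID=994, and frames #0 and #1 report this=0xb80. That is consistent with a member being read through a null VBucket pointer, although the dump alone does not prove it. Below is a minimal sketch of a lookup that tolerates a vbucket that has already been removed. All types and names in it (FakeVBucket, FakeVBucketMap, getHistoryTimeNowChecked) are hypothetical stand-ins for illustration, not the kv_engine or magma APIs, and the real fix may look quite different.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <memory>
#include <mutex>
#include <optional>
#include <unordered_map>

// Hypothetical stand-in for a vbucket that owns an HLC value.
struct FakeVBucket {
    std::atomic<uint64_t> hlcNanos{0};

    std::chrono::seconds getHLCNow() const {
        return std::chrono::duration_cast<std::chrono::seconds>(
                std::chrono::nanoseconds(hlcNanos.load()));
    }
};

// Hypothetical stand-in for the vbucket map; entries can disappear at any
// time (vbucket deletion, bucket shutdown).
class FakeVBucketMap {
public:
    void add(uint16_t id, uint64_t hlcNanos) {
        auto vb = std::make_shared<FakeVBucket>();
        vb->hlcNanos.store(hlcNanos);
        std::lock_guard<std::mutex> guard(lock);
        vbuckets[id] = std::move(vb);
    }

    void remove(uint16_t id) {
        std::lock_guard<std::mutex> guard(lock);
        vbuckets.erase(id);
    }

    // Returns an empty pointer when the vbucket is no longer present.
    std::shared_ptr<FakeVBucket> get(uint16_t id) const {
        std::lock_guard<std::mutex> guard(lock);
        auto it = vbuckets.find(id);
        if (it == vbuckets.end()) {
            return {};
        }
        return it->second;
    }

private:
    mutable std::mutex lock;
    std::unordered_map<uint16_t, std::shared_ptr<FakeVBucket>> vbuckets;
};

// Callback-side lookup: never dereferences a missing vbucket.
std::optional<std::chrono::seconds> getHistoryTimeNowChecked(
        const FakeVBucketMap& map, uint16_t kvid) {
    // Holding the shared_ptr keeps the vbucket alive for the whole call.
    auto vb = map.get(kvid);
    if (!vb) {
        // The vbucket was deleted; the caller has to pick a fallback.
        return std::nullopt;
    }
    return vb->getHLCNow();
}
```

      With this shape, a callback that fires after the vbucket has been removed gets an empty result instead of reading through a null pointer. What the storage layer should do with that empty result (for example, treat the history as expired or skip the check) is a design decision this sketch does not settle.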

      This is the first longevity run on 7.6, so there is no baseline yet. Marking regression as "No".

            People

              ankush.sharma Ankush Sharma