Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-35421

ThreadSanitizer: lock-order-inversion between StreamContainer -> Stream::streamMutex w/ SyncWrites

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • 6.5.0
    • couchbase-bucket
    • None
    • Triaged
    • No

    Description

      Summary

      TSan reports the below error on a 4-node cluster run (2 replicas) when issuing SyncWrites in a loop.

      Note this is similar to the deadlock in MB-35420 but this one seems simpler as only involves two mutexes instead of three.

      Steps to Reproduce

      1. Build with ThreadSanitizer:

        make -j8 EXTRA_CMAKE_OPTIONS="-D CB_THREADSANITIZER=1"
        

      2. Start cluster with 4 nodes, 2 replicas with TSan set to log:

        TSAN_OPTIONS="log_path=sanitizers.log second_deadlock_stack=1" ./cluster_run --nodes 4
        ./cluster_connect -n4 -r2
        

      3. Direct SyncWrites in a loop at vBucket:0

        kv_engine/engines/ep/management$ ./sync_repl.py 127.0.0.1:12000 Administrator asdasd default loop_setD key value 1000000 1 
        

      Expected Results
      Empty tsan.log files (no TSan issues reported).

      Actual Results
      Deadlock reported:

      WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=5841)
        Cycle in lock order graph: M738725681898217512 (0x000000000000) => M730000232498396792 (0x000000000000) => M738725681898217512
       
        Mutex M730000232498396792 acquired here while holding mutex M738725681898217512 in thread T36:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (ep.so+0x0000000bf33b)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (ep.so+0x0000000bf33b)
          #3 std::lock_guard<std::mutex>::lock_guard(std::mutex&) /usr/local/include/c++/7.3.0/bits/std_mutex.h:162 (ep.so+0x0000000bf33b)
          #4 ActiveStream::nextCheckpointItemTask() kv_engine/engines/ep/src/dcp/active_stream.cc:817 (ep.so+0x0000000bf33b)
          #5 ActiveStreamCheckpointProcessorTask::run() kv_engine/engines/ep/src/dcp/active_stream_checkpoint_processor_task.cc:56 (ep.so+0x0000000d0490)
          #6 ExecutorThread::run() kv_engine/engines/ep/src/executorthread.cc:153 (ep.so+0x0000001e1775)
          #7 launch_executor_thread kv_engine/engines/ep/src/executorthread.cc:34 (ep.so+0x0000001e1df6)
          #8 CouchbaseThread::run() platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #9 platform_thread_wrap platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #10 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Mutex M738725681898217512 previously acquired by the same thread here:
          #0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
          #1 cb_rw_reader_enter(pthread_rwlock_t*) platform/src/cb_pthreads.cc:195 (libplatform_so.so.0.1.0+0x00000000a43a)
          #2 cb::RWLock::readerLock() platform/include/platform/rwlock.h:87 (ep.so+0x0000000d030d)
          #3 cb::RWLock::lock_shared() platform/include/platform/rwlock.h:67 (ep.so+0x0000000d030d)
          #4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&) /usr/local/include/c++/7.3.0/shared_mutex:553 (ep.so+0x0000000d030d)
          #5 StreamContainer<std::shared_ptr<Stream> >::ReadLockedHandle::ReadLockedHandle(StreamContainer<std::shared_ptr<Stream> > const&) kv_engine/engines/ep/src/dcp/stream_container.h:213 (ep.so+0x0000000d030d)
          #6 StreamContainer<std::shared_ptr<Stream> >::rlock() const kv_engine/engines/ep/src/dcp/stream_container.h:273 (ep.so+0x0000000d030d)
          #7 ActiveStreamCheckpointProcessorTask::run() kv_engine/engines/ep/src/dcp/active_stream_checkpoint_processor_task.cc:54 (ep.so+0x0000000d030d)
          #8 ExecutorThread::run() kv_engine/engines/ep/src/executorthread.cc:153 (ep.so+0x0000001e1775)
          #9 launch_executor_thread kv_engine/engines/ep/src/executorthread.cc:34 (ep.so+0x0000001e1df6)
          #10 CouchbaseThread::run() platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #11 platform_thread_wrap platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #12 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Mutex M738725681898217512 acquired here while holding mutex M730000232498396792 in thread T7:
          #0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
          #1 cb_rw_reader_enter(pthread_rwlock_t*) platform/src/cb_pthreads.cc:195 (libplatform_so.so.0.1.0+0x00000000a43a)
          #2 cb::RWLock::readerLock() platform/include/platform/rwlock.h:87 (ep.so+0x000000131230)
          #3 cb::RWLock::lock_shared() platform/include/platform/rwlock.h:67 (ep.so+0x000000131230)
          #4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&) /usr/local/include/c++/7.3.0/shared_mutex:553 (ep.so+0x000000131230)
          #5 StreamContainer<std::shared_ptr<Stream> >::ReadLockedHandle::ReadLockedHandle(StreamContainer<std::shared_ptr<Stream> > const&) kv_engine/engines/ep/src/dcp/stream_container.h:213 (ep.so+0x000000131230)
          #6 StreamContainer<std::shared_ptr<Stream> >::rlock() const kv_engine/engines/ep/src/dcp/stream_container.h:273 (ep.so+0x000000121c1b)
          #7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1397 (ep.so+0x000000121c1b)
          #8 DcpConnMap::notifyVBConnections(Vbid, unsigned long) kv_engine/engines/ep/src/dcp/dcpconnmap.cc:428 (ep.so+0x0000000f9eb1)
          #9 KVBucket::notifyReplication(Vbid, long) kv_engine/engines/ep/src/kv_bucket.cc:2588 (ep.so+0x000000213a0d)
          #10 EPBucket::notifyNewSeqno(Vbid, VBNotifyCtx const&) kv_engine/engines/ep/src/ep_bucket.cc:1327 (ep.so+0x000000162ebb)
          #11 NotifyNewSeqnoCB::callback(Vbid const&, VBNotifyCtx const&) kv_engine/engines/ep/src/kv_bucket.h:838 (ep.so+0x000000229ab7)
          #12 VBucket::notifyNewSeqno(VBNotifyCtx const&) kv_engine/engines/ep/src/vbucket.cc:3627 (ep.so+0x00000027179d)
          #13 VBucket::commit(DocKey const&, unsigned long, boost::optional<long>, Collections::VB::Manifest::CachingReadHandle const&, void const*) kv_engine/engines/ep/src/vbucket.cc:855 (ep.so+0x00000028b284)
          #14 ActiveDurabilityMonitor::commit(DurabilityMonitor::SyncWrite const&) kv_engine/engines/ep/src/durability/active_durability_monitor.cc:984 (ep.so+0x00000014bf3f)
          #15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue() kv_engine/engines/ep/src/durability/active_durability_monitor.cc:653 (ep.so+0x00000014d496)
          #16 ActiveDurabilityMonitor::seqnoAckReceived(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) kv_engine/engines/ep/src/durability/active_durability_monitor.cc:540 (ep.so+0x00000014e116)
          #17 VBucket::seqnoAcknowledged(folly::SharedMutexImpl<false, void, std::atomic, false, true>::ReadHolder const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long) kv_engine/engines/ep/src/vbucket.cc:3832 (ep.so+0x000000271b6e)
          #18 ActiveStream::seqnoAck(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long) kv_engine/engines/ep/src/dcp/active_stream.cc:1774 (ep.so+0x0000000b69b5)
          #19 DcpProducer::seqno_acknowledged(unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1078 (ep.so+0x00000012d9d8)
          #20 EventuallyPersistentEngine::seqno_acknowledged(gsl::not_null<void const*>, unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/ep_engine.cc:1583 (ep.so+0x00000018ed0a)
          #21 dcpSeqnoAcknowledged(Cookie&, unsigned int, Vbid, unsigned long) kv_engine/daemon/protocol/mcbp/engine_wrapper.cc:749 (memcached+0x00000048eddd)
          #22 dcp_seqno_acknowledged_executor(Cookie&) kv_engine/daemon/protocol/mcbp/dcp_seqno_acknowledged_executor.cc:33 (memcached+0x000000533e02)
          #23 std::_Function_handler<void (Cookie&), void (*)(Cookie&)>::_M_invoke(std::_Any_data const&, Cookie&) /usr/local/include/c++/7.3.0/bits/std_function.h:316 (memcached+0x00000051fffd)
          #24 std::function<void (Cookie&)>::operator()(Cookie&) const /usr/local/include/c++/7.3.0/bits/std_function.h:706 (memcached+0x0000005203a3)
          #25 execute_client_request_packet(Cookie&, cb::mcbp::Request const&) kv_engine/daemon/mcbp_executors.cc:850 (memcached+0x00000051f8c2)
          #26 execute_request_packet(Cookie&, cb::mcbp::Request const&) kv_engine/daemon/mcbp_executors.cc:874 (memcached+0x00000051f9ea)
          #27 Cookie::execute() kv_engine/daemon/cookie.cc:124 (memcached+0x000000504438)
          #28 StateMachine::conn_execute() kv_engine/daemon/statemachine.cc:410 (memcached+0x0000005530c5)
          #29 StateMachine::execute() kv_engine/daemon/statemachine.cc:137 (memcached+0x000000554eea)
          #30 Connection::runStateMachinery() kv_engine/daemon/connection.cc:1340 (memcached+0x0000004ec21b)
          #31 Connection::runEventLoop(short) kv_engine/daemon/connection.cc:1414 (memcached+0x0000004ec2aa)
          #32 run_event_loop(Connection*, short) kv_engine/daemon/connections.cc:148 (memcached+0x000000502340)
          #33 event_handler(int, short, void*) kv_engine/daemon/memcached.cc:848 (memcached+0x00000043d8a8)
          #34 event_persist_closure /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1580 (libevent_core.so.2.1.8+0x000000017856)
          #35 event_process_active_single_queue /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1639 (libevent_core.so.2.1.8+0x000000017856)
          #36 CouchbaseThread::run() platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #37 platform_thread_wrap platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #38 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Mutex M730000232498396792 previously acquired by the same thread here:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (ep.so+0x0000000b6913)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (ep.so+0x0000000b6913)
          #3 std::lock_guard<std::mutex>::lock_guard(std::mutex&) /usr/local/include/c++/7.3.0/bits/std_mutex.h:162 (ep.so+0x0000000b6913)
          #4 ActiveStream::seqnoAck(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long) kv_engine/engines/ep/src/dcp/active_stream.cc:1766 (ep.so+0x0000000b6913)
          #5 DcpProducer::seqno_acknowledged(unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1078 (ep.so+0x00000012d9d8)
          #6 EventuallyPersistentEngine::seqno_acknowledged(gsl::not_null<void const*>, unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/ep_engine.cc:1583 (ep.so+0x00000018ed0a)
          #7 dcpSeqnoAcknowledged(Cookie&, unsigned int, Vbid, unsigned long) kv_engine/daemon/protocol/mcbp/engine_wrapper.cc:749 (memcached+0x00000048eddd)
          #8 dcp_seqno_acknowledged_executor(Cookie&) kv_engine/daemon/protocol/mcbp/dcp_seqno_acknowledged_executor.cc:33 (memcached+0x000000533e02)
          #9 std::_Function_handler<void (Cookie&), void (*)(Cookie&)>::_M_invoke(std::_Any_data const&, Cookie&) /usr/local/include/c++/7.3.0/bits/std_function.h:316 (memcached+0x00000051fffd)
          #10 std::function<void (Cookie&)>::operator()(Cookie&) const /usr/local/include/c++/7.3.0/bits/std_function.h:706 (memcached+0x0000005203a3)
          #11 execute_client_request_packet(Cookie&, cb::mcbp::Request const&) kv_engine/daemon/mcbp_executors.cc:850 (memcached+0x00000051f8c2)
          #12 execute_request_packet(Cookie&, cb::mcbp::Request const&) kv_engine/daemon/mcbp_executors.cc:874 (memcached+0x00000051f9ea)
          #13 Cookie::execute() kv_engine/daemon/cookie.cc:124 (memcached+0x000000504438)
          #14 StateMachine::conn_execute() kv_engine/daemon/statemachine.cc:410 (memcached+0x0000005530c5)
          #15 StateMachine::execute() kv_engine/daemon/statemachine.cc:137 (memcached+0x000000554eea)
          #16 Connection::runStateMachinery() kv_engine/daemon/connection.cc:1340 (memcached+0x0000004ec21b)
          #17 Connection::runEventLoop(short) kv_engine/daemon/connection.cc:1414 (memcached+0x0000004ec2aa)
          #18 run_event_loop(Connection*, short) kv_engine/daemon/connections.cc:148 (memcached+0x000000502340)
          #19 event_handler(int, short, void*) kv_engine/daemon/memcached.cc:848 (memcached+0x00000043d8a8)
          #20 event_persist_closure /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1580 (libevent_core.so.2.1.8+0x000000017856)
          #21 event_process_active_single_queue /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1639 (libevent_core.so.2.1.8+0x000000017856)
          #22 CouchbaseThread::run() platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #23 platform_thread_wrap platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #24 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Thread T36 'mc:auxIO_0' (tid=6082, running) created by thread T27 at:
          #0 pthread_create <null> (libtsan.so.0+0x00000002b6f0)
          #1 cb_create_named_thread(unsigned long*, void (*)(void*), void*, int, char const*) platform/src/cb_pthreads.cc:109 (libplatform_so.so.0.1.0+0x000000009fc5)
          #2 ExecutorThread::start() kv_engine/engines/ep/src/executorthread.cc:51 (ep.so+0x0000001e0f2d)
          #3 ExecutorPool::_adjustWorkers(task_type_t, unsigned long) kv_engine/engines/ep/src/executorpool.cc:527 (ep.so+0x0000001d6390)
          #4 ExecutorPool::_startWorkers() kv_engine/engines/ep/src/executorpool.cc:597 (ep.so+0x0000001d68a9)
          #5 ExecutorPool::_registerTaskable(Taskable&) kv_engine/engines/ep/src/executorpool.cc:483 (ep.so+0x0000001d4297)
          #6 ExecutorPool::registerTaskable(Taskable&) kv_engine/engines/ep/src/executorpool.cc:488 (ep.so+0x0000001d430c)
          #7 KVBucket::KVBucket(EventuallyPersistentEngine&) kv_engine/engines/ep/src/kv_bucket.cc:315 (ep.so+0x00000021a4d4)
          #8 EPBucket::EPBucket(EventuallyPersistentEngine&) kv_engine/engines/ep/src/ep_bucket.cc:231 (ep.so+0x0000001635e4)
          #9 std::_MakeUniq<EPBucket>::__single_object std::make_unique<EPBucket, EventuallyPersistentEngine&>(EventuallyPersistentEngine&) /usr/local/include/c++/7.3.0/bits/unique_ptr.h:825 (ep.so+0x000000186a70)
          #10 EventuallyPersistentEngine::makeBucket(Configuration&) kv_engine/engines/ep/src/ep_engine.cc:6078 (ep.so+0x000000186a70)
          #11 EventuallyPersistentEngine::initialize(char const*) kv_engine/engines/ep/src/ep_engine.cc:2022 (ep.so+0x0000001927e9)
          #12 CreateBucketThread::create() kv_engine/daemon/memcached.cc:1867 (memcached+0x000000437850)
          #13 CreateBucketThread::run() kv_engine/daemon/memcached.cc:1914 (memcached+0x000000437e1e)
          #14 Couchbase::Thread::thread_entry() platform/src/thread.cc:45 (libplatform_so.so.0.1.0+0x00000001d7e6)
          #15 Couchbase::StartThreadDelegator::run(Couchbase::Thread&) platform/src/thread.cc:59 (libplatform_so.so.0.1.0+0x00000001d8bf)
          #16 task_thread_main platform/src/thread.cc:65 (libplatform_so.so.0.1.0+0x00000001d8bf)
          #17 CouchbaseThread::run() platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #18 platform_thread_wrap platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #19 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Thread T7 'mc:worker_1' (tid=5949, running) created by main thread at:
          #0 pthread_create <null> (libtsan.so.0+0x00000002b6f0)
          #1 cb_create_named_thread(unsigned long*, void (*)(void*), void*, int, char const*) platform/src/cb_pthreads.cc:109 (libplatform_so.so.0.1.0+0x000000009fc5)
          #2 create_worker kv_engine/daemon/thread.cc:111 (memcached+0x0000004b9be8)
          #3 thread_init(unsigned long, event_base*, void (*)(int, short, void*)) kv_engine/daemon/thread.cc:460 (memcached+0x0000004bad7b)
          #4 memcached_main kv_engine/daemon/memcached.cc:2479 (memcached+0x000000442b42)
          #5 main kv_engine/daemon/main.cc:33 (memcached+0x000000433a75)
       
      SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (install/bin/../lib/libtsan.so.0+0x3bbbf) in pthread_mutex_lock
      

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              drigby Dave Rigby (Inactive)
              drigby Dave Rigby (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty