Couchbase Server / MB-35458

lock-order-inversion when handling seqno-ack (between StreamContainer and streamMutex)

Details

    • Untriaged
    • No
    • KV-Engine MH 2nd Beta

    Description

      Seen during a 4-node cluster_run (2 replicas) while running cbc-pillowfight with SyncWrites:

      cbc-pillowfight -U localhost:9000 -u Administrator -P asdasd --durability=majority -I 10000
      

      Lock-order inversion between StreamContainer and streamMutex:

      WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=3579)
        Cycle in lock order graph: M59245362996872056 (0x000000000000) => M62622787839639976 (0x000000000000) => M59245362996872056
       
        Mutex M62622787839639976 acquired here while holding mutex M59245362996872056 in thread T15:
          #0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
          ...
          #6 StreamContainer<std::shared_ptr<Stream> >::rlock() const kv_engine/engines/ep/src/dcp/stream_container.h:273 (ep.so+0x000000115be4)
          #7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1397 (ep.so+0x000000115be4)
          #8 DcpConnMap::notifyVBConnections(Vbid, unsigned long) kv_engine/engines/ep/src/dcp/dcpconnmap.cc:428 (ep.so+0x0000000e7d2b)
          #9 KVBucket::notifyReplication(Vbid, long) kv_engine/engines/ep/src/kv_bucket.cc:2592 (ep.so+0x00000020bbff)
          #10 EPBucket::notifyNewSeqno(Vbid, VBNotifyCtx const&) kv_engine/engines/ep/src/ep_bucket.cc:1327 (ep.so+0x00000015b999)
          #11 NotifyNewSeqnoCB::callback(Vbid const&, VBNotifyCtx const&) kv_engine/engines/ep/src/kv_bucket.h:838 (ep.so+0x000000223cd1)
          #12 VBucket::notifyNewSeqno(VBNotifyCtx const&) kv_engine/engines/ep/src/vbucket.cc:3627 (ep.so+0x000000286df5)
          #13 VBucket::commit(...) kv_engine/engines/ep/src/vbucket.cc:855 (ep.so+0x000000286df5)
          #14 ActiveDurabilityMonitor::commit(DurabilityMonitor::SyncWrite const&) kv_engine/engines/ep/src/durability/active_durability_monitor.cc:990 (ep.so+0x0000001448ee)
          #15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue() kv_engine/engines/ep/src/durability/active_durability_monitor.cc:659 (ep.so+0x000000145d41)
          #16 ActiveDurabilityMonitor::seqnoAckReceived(...) kv_engine/engines/ep/src/durability/active_durability_monitor.cc:546 (ep.so+0x000000146c23)
          #17 VBucket::seqnoAcknowledged(...) kv_engine/engines/ep/src/vbucket.cc:3832 (ep.so+0x00000026bfa2)
          #18 ActiveStream::seqnoAck(...) kv_engine/engines/ep/src/dcp/active_stream.cc:1774 (ep.so+0x0000000ab68c)
          #19 DcpProducer::seqno_acknowledged(unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1078 (ep.so+0x00000012456a)
          ...
       
        Mutex M59245362996872056 previously acquired by the same thread here:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (ep.so+0x0000000ab543)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (ep.so+0x0000000ab543)
          #3 std::lock_guard<std::mutex>::lock_guard(std::mutex&) /usr/local/include/c++/7.3.0/bits/std_mutex.h:162 (ep.so+0x0000000ab543)
          #4 ActiveStream::seqnoAck(...) kv_engine/engines/ep/src/dcp/active_stream.cc:1766 (ep.so+0x0000000ab543)
          #5 DcpProducer::seqno_acknowledged(unsigned int, Vbid, unsigned long) kv_engine/engines/ep/src/dcp/producer.cc:1078 (ep.so+0x00000012456a)
          ...
       
        Mutex M59245362996872056 acquired here while holding mutex M62622787839639976 in thread T15:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (ep.so+0x0000000b4f76)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (ep.so+0x0000000b4f76)
          #3 std::lock_guard<std::mutex>::lock_guard(std::mutex&) /usr/local/include/c++/7.3.0/bits/std_mutex.h:162 (ep.so+0x0000000b4f76)
          #4 ActiveStream::next() kv_engine/engines/ep/src/dcp/active_stream.cc:146 (ep.so+0x0000000b4f76)
          #5 DcpProducer::getNextItem() kv_engine/engines/ep/src/dcp/producer.cc:1545 (ep.so+0x000000117dfb)
          #6 DcpProducer::step(dcp_message_producers*) kv_engine/engines/ep/src/dcp/producer.cc:599 (ep.so+0x0000001232b7)
          ...
       
        Mutex M62622787839639976 previously acquired by the same thread here:
          #0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
          ...
          #4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&) /usr/local/include/c++/7.3.0/shared_mutex:553 (ep.so+0x000000117d59)
          #5 StreamContainer<std::shared_ptr<Stream> >::ResumableIterationHandle::ResumableIterationHandle(StreamContainer<std::shared_ptr<Stream> >&) kv_engine/engines/ep/src/dcp/stream_container.h:119 (ep.so+0x000000117d59)
          #6 StreamContainer<std::shared_ptr<Stream> >::startResumable() kv_engine/engines/ep/src/dcp/stream_container.h:269 (ep.so+0x000000117d59)
          #7 DcpProducer::getNextItem() kv_engine/engines/ep/src/dcp/producer.cc:1539 (ep.so+0x000000117d59)
          #8 DcpProducer::step(dcp_message_producers*) kv_engine/engines/ep/src/dcp/producer.cc:599 (ep.so+0x0000001232b7)
          ...
       
      SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (install/bin/../lib/libtsan.so.0+0x2c98b) in __interceptor_pthread_rwlock_rdlock
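
      To make the cycle in the report easier to follow, here is a minimal sketch of the two acquisition orders. It is not kv_engine code and not how ThreadSanitizer is implemented: LockOrderGraph, stepPath, seqnoAckAsReported and seqnoAckReordered are hypothetical names, and the locks are modelled as plain strings. The reordered variant only illustrates one common way to break such a cycle (releasing the stream mutex before notifying); whether that is safe for ActiveStream::seqnoAck is not established here.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records "acquired B while holding A" edges for a
// single thread and reports whether two locks were ever taken in both orders.
class LockOrderGraph {
public:
    void acquire(const std::string& lock) {
        // Every lock already held gains an edge to the newly acquired lock.
        for (const auto& h : held) {
            edges.emplace(h, lock);
        }
        held.push_back(lock);
    }

    void release(const std::string& lock) {
        auto it = std::find(held.begin(), held.end(), lock);
        assert(it != held.end() && "release of a lock that is not held");
        held.erase(it);
    }

    // True if some pair of locks has edges in both directions (A=>B and B=>A).
    bool hasInversion() const {
        for (const auto& e : edges) {
            if (edges.count(std::make_pair(e.second, e.first)) != 0) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::string> held;
    std::set<std::pair<std::string, std::string>> edges;
};

// Order seen under DcpProducer::step() -> getNextItem() -> ActiveStream::next():
// StreamContainer read lock first, then streamMutex.
void stepPath(LockOrderGraph& g) {
    g.acquire("StreamContainer");
    g.acquire("streamMutex");
    g.release("streamMutex");
    g.release("StreamContainer");
}

// Order seen under ActiveStream::seqnoAck() -> ... -> notifySeqnoAvailable():
// streamMutex first, then StreamContainer read lock.
void seqnoAckAsReported(LockOrderGraph& g) {
    g.acquire("streamMutex");
    g.acquire("StreamContainer");
    g.release("StreamContainer");
    g.release("streamMutex");
}

// Hypothetical reordering (an assumption, not the actual fix): drop
// streamMutex before the notification takes the StreamContainer lock.
void seqnoAckReordered(LockOrderGraph& g) {
    g.acquire("streamMutex");
    g.release("streamMutex");
    g.acquire("StreamContainer");
    g.release("StreamContainer");
}
```

      Running stepPath() and seqnoAckAsReported() against the same graph records both StreamContainer => streamMutex and streamMutex => StreamContainer, so hasInversion() returns true. Running stepPath() with seqnoAckReordered() records only the first edge, and hasInversion() returns false.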
      

            People

              drigby Dave Rigby (Inactive)