Couchbase Server / MB-19278

ep-engine: potential deadlock due to lock order inversion on ActiveStream::streamMutex

Details

    • Untriaged
    • Unknown

    Description

      ThreadSanitizer has identified a potential deadlock due to a cycle in the lock order graph:

        Cycle in lock order graph:
       
        M43515           => M36787       => M36848               => M43515
       [ActiveStream::     [TaskQueue::    [ExecutorThread::       [ActiveStream::
        streamMutex]        mutex]          currentTaskMutex]       streamMutex]
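
      To make the report above easier to read, here is a minimal sketch of what "a cycle in the lock order graph" means. It is not ThreadSanitizer's implementation; `LockOrderGraph`, `addEdge`, `hasCycle` and `reportedGraph` are hypothetical names introduced only for illustration. An edge A => B records that B was acquired while A was held, and a deadlock is possible once the edges form a loop.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical lock-order graph (illustration only, not TSan's code).
// An edge "held -> acquired" means `acquired` was locked while `held` was owned.
class LockOrderGraph {
public:
    void addEdge(const std::string& held, const std::string& acquired) {
        edges[held].insert(acquired);
        edges[acquired];  // make sure the target node exists as well
    }

    // True if following edges from some lock leads back to that same lock.
    bool hasCycle() const {
        std::set<std::string> visiting, done;
        for (const auto& node : edges) {
            if (visit(node.first, visiting, done)) {
                return true;
            }
        }
        return false;
    }

private:
    // Depth-first search; reaching a node that is still on the current
    // path means a back edge, i.e. a cycle.
    bool visit(const std::string& node,
               std::set<std::string>& visiting,
               std::set<std::string>& done) const {
        if (done.count(node)) {
            return false;
        }
        if (!visiting.insert(node).second) {
            return true;
        }
        auto it = edges.find(node);
        if (it != edges.end()) {
            for (const auto& next : it->second) {
                if (visit(next, visiting, done)) {
                    return true;
                }
            }
        }
        visiting.erase(node);
        done.insert(node);
        return false;
    }

    std::map<std::string, std::set<std::string>> edges;
};

// The three edges from the report, keyed by the mutex names given there.
LockOrderGraph reportedGraph() {
    LockOrderGraph g;
    g.addEdge("ActiveStream::streamMutex", "TaskQueue::mutex");
    g.addEdge("TaskQueue::mutex", "ExecutorThread::currentTaskMutex");
    g.addEdge("ExecutorThread::currentTaskMutex", "ActiveStream::streamMutex");
    return g;
}
```

      Calling `reportedGraph().hasCycle()` returns true. Dropping any one of the three edges breaks the loop, which is why removing a single acquisition can be enough to resolve this class of report.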
      

      The crux of the problem appears to be the acquisition of streamMutex in the destructor of ActiveStream. This is ultimately a Bad Idea: if other threads can still access an object while it is being destroyed, you are already in undefined behaviour, and taking the lock in the destructor does not fix that. The acquisition in question:

        Mutex M43515 acquired here while holding mutex M36848 in thread T31:
          #0 pthread_mutex_lock <null> (engine_testapp+0x000000474420)
          #1 cb_mutex_enter /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:85 (libplatform.so.0.1.0+0x000000003820)
          #2 Mutex::acquire() /home/daver/repos/couchbase/server/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e5ace)
          #3 LockHolder::lock() /home/daver/repos/couchbase/server/ep-engine/src/locks.h:71 (ep.so+0x000000084053)
          #4 LockHolder /home/daver/repos/couchbase/server/ep-engine/src/locks.h:48 (ep.so+0x000000083cc2)
          #5 ~ActiveStream /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.h:177 (ep.so+0x0000002a62d4)
          #6 ~ActiveStream /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.h:176 (ep.so+0x0000002a6417)
          #7 ~RCPtr /home/daver/repos/couchbase/server/ep-engine/src/atomic.h:348 (ep.so+0x000000286647)
          #8 ~DCPBackfill /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:114 (ep.so+0x0000002a69dd)
          #9 ~DCPBackfill /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:114 (ep.so+0x0000002a6a47)
          #10 SingleThreadedRCPtr<GlobalTask>::swap(GlobalTask*) /home/daver/repos/couchbase/server/ep-engine/src/atomic.h:483 (ep.so+0x000000111db5)
          #11 SingleThreadedRCPtr<GlobalTask>::reset(GlobalTask*) /home/daver/repos/couchbase/server/ep-engine/src/atomic.h:438 (ep.so+0x0000001e8930)
          #12 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:75 (ep.so+0x0000001e6a58)
          #13 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e65ca)
          #14 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)
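
      The trace above shows the last reference to the stream being dropped while the executor thread holds currentTaskMutex, so the destructor's lock creates the M36848 => M43515 edge. The following is a minimal sketch of two ways such an edge can be avoided, under stated assumptions: it uses `std::mutex` and `std::shared_ptr` in place of ep-engine's own Mutex and RCPtr types, and `Stream`, `Executor`, `resetTask` and `destructorRunsOutsideTaskMutex` are hypothetical names, not the ep-engine classes or the fix actually applied for this issue.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a stream guarded by its own mutex.
class Stream {
public:
    explicit Stream(std::function<void()> callback)
        : onDestroy(std::move(callback)) {}

    // Deliberately takes no lock: once the destructor runs, no other thread
    // may legitimately be using the object, so a lock would protect nothing.
    ~Stream() { onDestroy(); }

    void setActive(bool value) {
        std::lock_guard<std::mutex> guard(streamMutex);
        active = value;
    }

private:
    std::mutex streamMutex;
    bool active = false;
    std::function<void()> onDestroy;
};

// Hypothetical stand-in for an executor thread's current-task slot.
class Executor {
public:
    void resetTask(std::shared_ptr<Stream> next) {
        std::shared_ptr<Stream> previous;
        {
            std::lock_guard<std::mutex> guard(currentTaskMutex);
            holdingTaskMutex = true;
            previous.swap(currentTask);     // take the old reference out
            currentTask = std::move(next);  // install the new one
            holdingTaskMutex = false;
        }
        // `previous` is released at the end of this function, after
        // currentTaskMutex has been dropped, so any destructor it triggers
        // runs without that mutex held.
    }

    bool isHoldingTaskMutex() const { return holdingTaskMutex; }

private:
    std::mutex currentTaskMutex;
    std::shared_ptr<Stream> currentTask;
    bool holdingTaskMutex = false;  // demo bookkeeping, single-threaded use only
};

// Returns true if the stream was destroyed and the executor was not inside
// its critical section at that moment.
bool destructorRunsOutsideTaskMutex() {
    Executor executor;
    bool destroyed = false;
    bool destroyedUnderLock = false;
    executor.resetTask(std::make_shared<Stream>([&] {
        destroyed = true;
        destroyedUnderLock = executor.isHoldingTaskMutex();
    }));
    executor.resetTask(nullptr);  // drops the last reference to the stream
    return destroyed && !destroyedUnderLock;
}
```

      Either change on its own removes the edge in this sketch: a destructor that takes no lock cannot add an acquisition, and releasing the old reference after the critical section means the destructor never runs under currentTaskMutex. Whether either matches the real code paths in ep-engine is an assumption that would need checking against the source.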
      

      Note: This issue is related to MB-16949, which also exposed the ActiveStream lock-order issue; however, a slightly different variant was exposed there because of differences in the DCP management code.

            People

              drigby Dave Rigby (Inactive)