Couchbase Server / MB-16434

[ThreadSanitizer]: Lock order inversion in DCP consumer, causing a potential deadlock


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 3.1.2, 4.1.0, 4.5.0
    • 3.1.0
    • couchbase-bucket
    • Security Level: Public
    • None
    • Untriaged
    • Unknown
    • KV: Oct 4 - Oct 24

    Description

      A DCP consumer stream has two locks: "streamMutex", which guards stream transitions and the readyQ, and "bufMutex", which guards the consumer buffer.

      The two locks are acquired in opposite orders by two threads: one processes an incoming mutation, and the other processes a close stream request.
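
      A minimal sketch of the pattern described above, assuming standard C++ mutexes rather than the ep-engine Mutex/LockHolder classes. InvertedStreamSketch, mutationPath, closePath and recordInvertedOrders are hypothetical names, and which path takes which lock first is an assumption made for illustration. Both paths run on one thread here, so nothing blocks; the point is only that the two recorded orders are mirror images of each other.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the consumer stream. Only the two lock names
// come from the report; everything else is illustrative.
struct InvertedStreamSketch {
    std::mutex streamMutex;  // guards stream transitions and the readyQ
    std::mutex bufMutex;     // guards the consumer buffer
    std::vector<std::string> acquired;  // acquisition order, for inspection

    // Assumed order for the mutation path: buffer lock, then stream lock.
    void mutationPath() {
        std::lock_guard<std::mutex> first(bufMutex);
        acquired.push_back("bufMutex");
        std::lock_guard<std::mutex> second(streamMutex);
        acquired.push_back("streamMutex");
    }

    // Assumed order for the close path: stream lock, then buffer lock.
    void closePath() {
        std::lock_guard<std::mutex> first(streamMutex);
        acquired.push_back("streamMutex");
        std::lock_guard<std::mutex> second(bufMutex);
        acquired.push_back("bufMutex");
    }
};

// Runs both paths back to back on the calling thread and returns the
// recorded acquisition order.
inline std::vector<std::string> recordInvertedOrders() {
    InvertedStreamSketch s;
    s.mutationPath();
    s.closePath();
    return s.acquired;
}
```

      If the two paths ran concurrently, each thread could take its first lock and then wait forever for the lock the other thread holds. That is the cycle a lock order checker looks for, even when the run itself never hangs.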

      Below is the log from ThreadSanitizer.

      WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=12407)
      Cycle in lock order graph: M23927 (0x7d5000016940) => M23926 (0x7d5000016868) => M23927

      Mutex M23926 acquired here while holding mutex M23927 in main thread:
      #0 pthread_mutex_lock <null>:0 (engine_testapp+0x000000074b50)
      #1 cb_mutex_enter /root/manu/platform/src/cb_pthreads.c:115 (libplatform.so.0.1.0+0x000000003e58)
      #2 Mutex::acquire() /root/manu/ep-engine/src/mutex.cc:31 (ep.so+0x000000110499)
      #3 LockHolder::lock() /root/manu/ep-engine/src/locks.h:70 (ep.so+0x000000077f5c)
      #4 PassiveStream::processMutation(MutationResponse*) /root/manu/ep-engine/src/dcp/stream.cc:1310 (ep.so+0x00000007701d)
      #5 PassiveStream::messageReceived(DcpResponse*) /root/manu/ep-engine/src/dcp/stream.cc:1171 (ep.so+0x000000076aaf)
      #6 DcpConsumer::mutation(unsigned int, void const*, unsigned short, void const*, unsigned int, unsigned long, unsigned short, unsigned int, unsigned char, unsigned int, unsigned long, unsigned long, unsigned int, unsigned char, void const*, unsigned short) /root/manu/ep-engine/src/dcp/consumer.cc:279 (ep.so+0x00000005b09d)
      #7 EvpDcpMutation(engine_interface*, void const*, unsigned int, void const*, unsigned short, void const*, unsigned int, unsigned long, unsigned short, unsigned int, unsigned char, unsigned long, unsigned long, unsigned int, unsigned int, void const*, unsigned short, unsigned char) /root/manu/ep-engine/src/ep_engine.cc:1627 (ep.so+0x0000000b3d2a)
      #8 mock_dcp_mutation(engine_interface*, void const*, unsigned int, void const*, unsigned short, void const*, unsigned int, unsigned long, unsigned short, unsigned int, unsigned char, unsigned long, unsigned long, unsigned int, unsigned int, void const*, unsigned short, unsigned char) /root/manu/memcached/programs/engine_testapp/engine_testapp.cc:618 (engine_testapp+0x0000000bb595)
      #9 test_dcp_erroneous_mutations(engine_interface*, engine_interface_v1*) /root/manu/ep-engine/tests/ep_testsuite.cc:10040 (ep_testsuite.so+0x000000089615)
      #10 execute_test(test, char const*, char const*) /root/manu/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b944c)
      #11 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

      Hint: use TSAN_OPTIONS=second_deadlock_stack=1 to get more informative warning message

      Mutex M23927 acquired here while holding mutex M23926 in main thread:
      #0 pthread_mutex_lock <null>:0 (engine_testapp+0x000000074b50)
      #1 cb_mutex_enter /root/manu/platform/src/cb_pthreads.c:115 (libplatform.so.0.1.0+0x000000003e58)
      #2 Mutex::acquire() /root/manu/ep-engine/src/mutex.cc:31 (ep.so+0x000000110499)
      #3 LockHolder::lock() /root/manu/ep-engine/src/locks.h:70 (ep.so+0x000000075a53)
      #4 PassiveStream::setDead_UNLOCKED(end_stream_status_t) /root/manu/ep-engine/src/dcp/stream.cc:1029 (ep.so+0x00000007527a)
      #5 PassiveStream::setDead(end_stream_status_t) /root/manu/ep-engine/src/dcp/stream.cc:1051 (ep.so+0x000000075fba)
      #6 DcpConsumer::closeStream(unsigned int, unsigned short) /root/manu/ep-engine/src/dcp/consumer.cc:187 (ep.so+0x00000005a80c)
      #7 EvpDcpCloseStream(engine_interface*, void const*, unsigned int, unsigned short) /root/manu/ep-engine/src/ep_engine.cc:1519 (ep.so+0x0000000b378f)
      #8 mock_dcp_close_stream(engine_interface*, void const*, unsigned int, unsigned short) /root/manu/memcached/programs/engine_testapp/engine_testapp.cc:532 (engine_testapp+0x0000000bb19d)
      #9 test_dcp_erroneous_mutations(engine_interface*, engine_interface_v1*) /root/manu/ep-engine/tests/ep_testsuite.cc:10078 (ep_testsuite.so+0x00000008997b)
      #10 execute_test(test, char const*, char const*) /root/manu/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b944c)
      #11 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

      SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) ??:0 pthread_mutex_lock
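
      One common way to remove a cycle like the one reported above is to acquire both locks through std::lock, which uses a deadlock-avoidance algorithm so the order of its arguments does not matter. The sketch below assumes standard C++11 mutexes; it is not necessarily the change made for this issue, and OrderedStreamSketch, processMutation, setDead and runBothPaths are hypothetical names that only echo the functions in the stack traces.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the consumer stream. Only the two lock names
// come from the report; the members and methods are illustrative.
struct OrderedStreamSketch {
    std::mutex streamMutex;  // guards stream transitions and the readyQ
    std::mutex bufMutex;     // guards the consumer buffer
    int buffered = 0;        // pretend buffer occupancy
    bool dead = false;       // pretend stream state

    // Mutation path: take both locks together instead of one after the other.
    void processMutation() {
        std::unique_lock<std::mutex> buf(bufMutex, std::defer_lock);
        std::unique_lock<std::mutex> stream(streamMutex, std::defer_lock);
        std::lock(buf, stream);  // deadlock-free regardless of argument order
        if (!dead) {
            ++buffered;
        }
    }

    // Close path: lists the locks in the opposite order, which is still safe
    // because std::lock never waits on one lock while holding the other.
    void setDead() {
        std::unique_lock<std::mutex> stream(streamMutex, std::defer_lock);
        std::unique_lock<std::mutex> buf(bufMutex, std::defer_lock);
        std::lock(stream, buf);
        dead = true;
        buffered = 0;
    }
};

// Exercises both paths on the calling thread and returns the buffer
// occupancy after the stream has been marked dead.
inline int runBothPaths() {
    OrderedStreamSketch s;
    s.processMutation();
    s.processMutation();
    s.setDead();
    s.processMutation();  // ignored: the stream is already dead
    return s.buffered;
}
```

      The alternative is to fix a single global order (for example, always streamMutex before bufMutex) and restructure any path that violates it. That avoids the retry cost of std::lock but requires every caller to follow the convention.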

            People

              ericcooper Eric Cooper (Inactive)
              manu Manu Dhundi (Inactive)