Details
- Bug
- Resolution: Fixed
- Critical
- 6.5.0
- Triaged
- Yes
- KV-Engine Mad-Hatter GA
Description
As seen during the kv-engine-post-commit-Tsan job for patch http://review.couchbase.org/#/c/116358/ (MB-36372), there is a lock-order inversion between VBucket::stateLock and ConnMap::connsLock. KVBucket::setVBucketState acquires stateLock and then connsLock, whereas DcpConnMap::disconnect acquires connsLock and then stateLock:
ThreadSanitizer: lock-order-inversion (potential deadlock) (install/bin/../lib/libtsan.so.0+0x5b63d) AnnotateRWLockAcquired

Cycle in lock order graph: M454999369234753584 (0x000000000000) => M494281 (VBucket::stateLock) => M454999369234753584

Mutex M494281 (VBucket::stateLock) acquired here while holding mutex M454999369234753584 in thread T6:
    #0 AnnotateRWLockAcquired (libtsan.so.0+0x00000005b63d)
    ...
    #6 ActiveStream::setDead(end_stream_status_t) kv_engine/engines/ep/src/dcp/active_stream.cc:1256 (ep.so+0x0000000ac6b4)
    ...
    #9 DcpProducer::setDisconnect() kv_engine/engines/ep/src/dcp/producer.cc:1581 (ep.so+0x00000010af03)
    #10 DcpConnMap::disconnect(void const*) kv_engine/engines/ep/src/dcp/dcpconnmap.cc:330 (ep.so+0x0000000dc6d1)
    #11 EventuallyPersistentEngine::handleDisconnect(void const*) kv_engine/engines/ep/src/ep_engine.cc:6174 (ep.so+0x00000017562b)
    ...

Mutex M454999369234753584 (ConnMap::connsLock) previously acquired by the same thread here:
    #0 pthread_mutex_lock (libtsan.so.0+0x00000003876f)
    #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (memcached+0x00000043b06f)
    #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (memcached+0x00000043b06f)
    #3 std::lock_guard::lock_guard(std::mutex&) /usr/local/include/c++/7.3.0/bits/std_mutex.h:162 (ep.so+0x0000000dc3b5)
    #4 DcpConnMap::disconnect(void const*) kv_engine/engines/ep/src/dcp/dcpconnmap.cc:316 (ep.so+0x0000000dc3b5)
    #5 EventuallyPersistentEngine::handleDisconnect(void const*) kv_engine/engines/ep/src/ep_engine.cc:6174 (ep.so+0x00000017562b)
    ...

Mutex M454999369234753584 (ConnMap::connsLock) acquired here while holding mutex M494281 in thread T8:
    #0 pthread_mutex_lock (libtsan.so.0+0x00000003876f)
    ...
    #4 DcpConnMap::vbucketStateChanged(Vbid, vbucket_state_t, bool) kv_engine/engines/ep/src/dcp/dcpconnmap.cc:240 (ep.so+0x0000000d7b7e)
    #5 KVBucket::setVBucketState_UNLOCKED(std::shared_ptr&, vbucket_state_t, nlohmann::basic_json, std::allocator >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&, TransferVB, bool, std::unique_lock&, folly::SharedMutexImpl::WriteHolder&) kv_engine/engines/ep/src/kv_bucket.cc:910 (ep.so+0x0000002200e8)
    #6 KVBucket::setVBucketState(Vbid, vbucket_state_t, nlohmann::basic_json, std::allocator >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&, TransferVB, void const*) kv_engine/engines/ep/src/kv_bucket.cc:857 (ep.so+0x000000220c72)
    #7 EventuallyPersistentEngine::setVBucketState(...) kv_engine/engines/ep/src/ep_engine.cc:6505 (ep.so+0x000000175932)
    ...

Mutex M494281 (VBucket::stateLock) previously acquired by the same thread here:
    #0 AnnotateRWLockAcquired (libtsan.so.0+0x00000005b63d)
    ...
    #6 KVBucket::setVBucketState(Vbid, vbucket_state_t, nlohmann::basic_json, std::allocator >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&, TransferVB, void const*) kv_engine/engines/ep/src/kv_bucket.cc:856 (ep.so+0x000000220c41)
    #7 EventuallyPersistentEngine::setVBucketState(...) kv_engine/engines/ep/src/ep_engine.cc:6505 (ep.so+0x000000175932)
    ...

Link to TSan report: http://cv.jenkins.couchbase.com/job/kv_engine-master-post-commit-TSan/660/ThreadSanitizer/type.2130731106/
Note that there are other TSan-reported issues in that build, but they don't seem directly related. Subsequent builds also report the above error, so I'm fairly confident the aforementioned patch is the cause of this problem.
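
For illustration, the inversion in the report can be reduced to a few lines of C++. This is only a sketch: `Locks`, `Trace`, and the three functions are hypothetical stand-ins, not kv_engine code, and the last function shows one generic way to break the cycle (never holding both locks at once on the disconnect path). It is not necessarily what the merged fix does.

```cpp
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins for the two locks named in the TSan report.
struct Locks {
    std::mutex connsLock;         // models ConnMap::connsLock
    std::shared_mutex stateLock;  // models VBucket::stateLock (a reader/writer lock)
};

// Records the order in which a path takes the locks.
using Trace = std::vector<std::string>;

// Models the set-vbstate path (thread T8): stateLock first, then connsLock.
void setVBucketStatePath(Locks& l, Trace& t) {
    std::unique_lock<std::shared_mutex> state(l.stateLock);
    t.push_back("stateLock");
    std::lock_guard<std::mutex> conns(l.connsLock);
    t.push_back("connsLock");
}

// Models the disconnect path as reported (thread T6): connsLock first, then
// stateLock. Run concurrently with the function above, each thread can end up
// holding one lock while waiting for the other.
void disconnectPathInverted(Locks& l, Trace& t) {
    std::lock_guard<std::mutex> conns(l.connsLock);
    t.push_back("connsLock");
    std::shared_lock<std::shared_mutex> state(l.stateLock);
    t.push_back("stateLock");
}

// One possible remedy (an assumption, not the actual patch): finish the work
// that needs connsLock, release it, and only then take stateLock.
void disconnectPathNoNesting(Locks& l, Trace& t) {
    {
        std::lock_guard<std::mutex> conns(l.connsLock);
        t.push_back("connsLock");
        // ... look up / remove the connection here ...
    }
    t.push_back("connsLock released");
    std::shared_lock<std::shared_mutex> state(l.stateLock);
    t.push_back("stateLock");
}
```

With the non-nesting variant the lock-order graph has only the stateLock => connsLock edge, so there is no cycle for TSan to report. The trade-off is that the connection map may change between releasing connsLock and acquiring stateLock, so the real code would need to tolerate that.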
Issue Links
- relates to MB-36372 After 2 nodes rebalance in all the InsertRequest are getting RequestTimeoutException (Closed)
For Gerrit Dashboard: MB-36557

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 116860,4 | MB-36557: | master | kv_engine | ABANDONED | 0 | -1 |
| 117017,9 | MB-36557: Avoid lock-inversion at set-vbstate and conn-disconnect | mad-hatter | kv_engine | MERGED | +2 | +1 |
| 117128,1 | Merge remote-tracking branch 'couchbase/mad-hatter' | master | kv_engine | MERGED | +2 | +1 |