Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.0
-
Untriaged
-
1
-
Unknown
-
KV March-22
Description
Summary
During Magma testing by a customer, a crash was observed during KVBucket::deleteVBucket when attempting to dereference a null VBucketPtr.
Details
This appears to be a latent race condition, where the function performs an initial initial check for a null pointer (and returns early), but we do not check again after acquiring an exclusively locked VBucketPtr. This means that if another thread also is processing a DEL_VBUCKET command, then the state could change during the function's execution:
cb::engine_errc KVBucket::deleteVBucket(Vbid vbid, const CookieIface* c) { |
// Lock to prevent a race condition between a failed update and add |
// (and delete). |
VBucketPtr vb = vbMap.getBucket(vbid);
|
if (!vb) { |
return cb::engine_errc::not_my_vbucket; |
}
|
|
{
|
std::unique_lock<std::mutex> vbSetLh(vbsetMutex);
|
// Obtain a locked VBucket to ensure we interlock with other |
// threads that are manipulating the VB (particularly ones which may |
// try and change the disk revision e.g. deleteAll and compaction). |
auto lockedVB = getLockedVBucket(vbid);
|
vbMap.setState(*lockedVB, vbucket_state_dead, nullptr);
|
getRWUnderlying(vbid)->abortCompactionIfRunning(lockedVB.getLock(),
|
vbid);
|
engine.getDcpConnMap().vbucketStateChanged(vbid, vbucket_state_dead);
|
|
// Drop the VB to begin the delete, the last holder of the VB will |
// unknowingly trigger the destructor which schedules a deletion task. |
vbMap.dropVBucketAndSetupDeferredDeletion(vbid, c);
|
}
|
...
|
Note that we check for a null VBucketPtr initially at lines 1138-1133, however we do not re-check after taking a locked copy of the VBucketPtr at line 1140.
Steps to reproduce
- Start flushing a given vBucket to disk. This needs to be "slow" - i.e. not finish until step 3 has occurred.
- Issue a DEL_VBUCKET request - this will be blocked waiting to acquire the vb_mutexes element for this vBucket.
- Issue a second DEL_VBUCKET request (for the same vbid) - this will also be blocked waiting on the same mutex.
- Allow VBucket flush to complete.
After (4), the first DEL_VBUCKET request can run, and should successfully delete the vBucket. However then that finishes, the second DEL_VBUCKET request is allowed to run, which attempts to delete a VBucketPtr which has already been set to null -> crash.
Note that steps (2) and (3) can be triggered in a full cluster setup by ns_server attempting to delete a VBucket (e.g. as a result of rebalance-out), if the first DEL_VBUCKET call takes longer than ns_server's timeout (value TBD) - this causes ns_server to retry the operation and issue the second DEL_VBUCKET.