Details
- Bug
- Resolution: Fixed
- Major
- 7.1.0
- Triaged
- 1
- Yes
- KV 2021-Nov
Description
Introduced in http://review.couchbase.org/c/kv_engine/+/163330
Expelling checks after every vbucket whether a further reduction in memory usage is required:
    const auto vbuckets = bucket.getVBuckets().getVBucketsSortedByChkMgrMem();
    for (const auto& it : vbuckets) {
        const auto vbid = it.first;
        VBucketPtr vb = bucket.getVBucket(vbid);
        if (!vb) {
            continue;
        }

        const auto expelResult =
                vb->checkpointManager->expelUnreferencedCheckpointItems();
        EP_LOG_DEBUG(
                "Expelled {} unreferenced checkpoint items "
                "from {} "
                "and estimated to have recovered {} bytes.",
                expelResult.count,
                vbid,
                expelResult.memory);

        if (bucket.getRequiredCheckpointMemoryReduction() == 0) {
            // All done
            return ReductionRequired::No;
        }
    }

    size_t KVBucket::getRequiredCheckpointMemoryReduction() const {
        const auto checkpointMemoryRatio = getCheckpointMemoryRatio();
        const auto checkpointQuota = stats.getMaxDataSize() * checkpointMemoryRatio;
        const auto recoveryThreshold =
                checkpointQuota * getCheckpointMemoryRecoveryUpperMark();
        const auto usage = stats.getCheckpointManagerEstimatedMemUsage();

        if (usage < recoveryThreshold) {
            return 0;
        }

        const auto lowerRatio = getCheckpointMemoryRecoveryLowerMark();
        const auto lowerMark = checkpointQuota * lowerRatio;
        Expects(usage > lowerMark);

        const size_t amountOfMemoryToClear = usage - lowerMark;

        const auto toMB = [](size_t bytes) { return bytes / (1024 * 1024); };
        const auto upperRatio = getCheckpointMemoryRecoveryUpperMark();
        EP_LOG_DEBUG(
                "Triggering memory recovery as checkpoint memory usage ({} MB) "
                "exceeds the upper_mark ({}, "
                "{} MB) - total checkpoint quota {}, {} MB . Attempting to free {} "
                "MB of memory.",
                toMB(usage),
                upperRatio,
                toMB(checkpointQuota * upperRatio),
                checkpointMemoryRatio,
                toMB(checkpointQuota),
                toMB(amountOfMemoryToClear));

        return amountOfMemoryToClear;
    }

getRequiredCheckpointMemoryReduction boils down to:
    If checkpoint memory usage exceeds high mark:
        -> amount of memory to recover to reach the low mark
    else:
        -> 0
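The high/low mark rule summarised above can be tried out in isolation. The following is a minimal sketch, not the kv_engine code: `CheckpointMemConfig` and `requiredReduction` are hypothetical names, and the struct simply stands in for the values that `KVBucket` reads from its configuration and stats.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the config/stats values used by
// KVBucket::getRequiredCheckpointMemoryReduction().
struct CheckpointMemConfig {
    std::size_t maxDataSize; // bucket quota, in bytes
    double checkpointRatio;  // fraction of the quota given to checkpoints
    double upperMark;        // high mark, as a fraction of the checkpoint quota
    double lowerMark;        // low mark, as a fraction of the checkpoint quota
};

// Returns 0 while usage is below the high mark; otherwise returns the
// number of bytes that must be freed to bring usage down to the low mark.
inline std::size_t requiredReduction(const CheckpointMemConfig& cfg,
                                     std::size_t usage) {
    const double quota =
            static_cast<double>(cfg.maxDataSize) * cfg.checkpointRatio;
    const double high = quota * cfg.upperMark;
    const double low = quota * cfg.lowerMark;

    if (static_cast<double>(usage) < high) {
        return 0;
    }

    // Mirrors the Expects() precondition in the original function.
    assert(static_cast<double>(usage) > low);
    return usage - static_cast<std::size_t>(low);
}
```

With a quota of 1000 bytes, a checkpoint ratio of 0.5, a high mark of 0.75 and a low mark of 0.5 (all illustrative values), usage of 374 bytes yields 0 while usage of 375 bytes yields 125. The result jumps from nothing to the full distance down to the low mark as soon as the high mark is crossed.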
Because the check is repeated after every vbucket and returns 0 as soon as usage drops below the high mark, expelling will often stop slightly below the high mark instead of continuing to the low mark.
Anecdotally, this has been seen in cluster run to result in each run of the ClosedUnrefCheckpointRemoverTask expelling from a single vbucket and then ending. This leads to a lot of logging of:
    ClosedUnrefCheckpointRemoverTask:0 Triggering checkpoint memory recovery - attempting to free X MB
and a reduced rate of expelling (as the task needs to be retriggered/scheduled between each vbucket).
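The early stop can be reproduced with a small model. This is a sketch under stated assumptions, not the merged change: `Marks`, `reductionNeeded`, `expelRecheckingEachVBucket` and `expelUntilTarget` are hypothetical names, each vector entry is the number of bytes that expelling from one vbucket would free, and computing the target once up front is only one possible way to keep expelling until the low mark is reached.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Marks {
    std::size_t high; // usage at or above this triggers recovery
    std::size_t low;  // recovery aims to bring usage down to this
};

struct RunResult {
    std::size_t vbucketsVisited;
    std::size_t usageAfter;
};

// 0 below the high mark, otherwise the distance down to the low mark.
inline std::size_t reductionNeeded(std::size_t usage, Marks m) {
    assert(m.low <= m.high);
    return usage < m.high ? 0 : usage - m.low;
}

// Behaviour described in the ticket: the high-mark check is re-evaluated
// after every vbucket, so the loop ends once usage dips below the high mark.
inline RunResult expelRecheckingEachVBucket(
        std::size_t usage, Marks m, const std::vector<std::size_t>& expellable) {
    std::size_t visited = 0;
    for (const auto bytes : expellable) {
        usage -= std::min(bytes, usage);
        ++visited;
        if (reductionNeeded(usage, m) == 0) {
            break;
        }
    }
    return {visited, usage};
}

// Assumed alternative: fix the target once, then keep expelling until that
// many bytes have been freed or there are no vbuckets left.
inline RunResult expelUntilTarget(
        std::size_t usage, Marks m, const std::vector<std::size_t>& expellable) {
    std::size_t remaining = reductionNeeded(usage, m);
    std::size_t visited = 0;
    for (const auto bytes : expellable) {
        if (remaining == 0) {
            break;
        }
        const auto freed = std::min(bytes, usage);
        usage -= freed;
        remaining -= std::min(freed, remaining);
        ++visited;
    }
    return {visited, usage};
}
```

With a high mark of 375, a low mark of 250, starting usage of 400 and six vbuckets that each free 30 bytes (all illustrative values), the first variant visits one vbucket and stops at 370, while the second visits five and stops at 250.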
Issue Links
- relates to MB-49170 Replica item count lagging active in Magma insert test (Closed)
For Gerrit Dashboard: MB-49262

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
165623,6 | MB-49262: Ensure expelling continues until the low mark is reached | master | kv_engine | MERGED | +2 | +1 |