Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 7.1.0
- Triaged
- 1
- Yes
- KV 2021-Nov
Description
Introduced in http://review.couchbase.org/c/kv_engine/+/163330
Expelling checks after every vbucket whether further reduction in memory usage is required:
const auto vbuckets = bucket.getVBuckets().getVBucketsSortedByChkMgrMem();
for (const auto& it : vbuckets) {
    const auto vbid = it.first;
    VBucketPtr vb = bucket.getVBucket(vbid);
    if (!vb) {
        continue;
    }

    const auto expelResult =
            vb->checkpointManager->expelUnreferencedCheckpointItems();
    EP_LOG_DEBUG(
            "Expelled {} unreferenced checkpoint items "
            "from {} "
            "and estimated to have recovered {} bytes.",
            expelResult.count,
            vbid,
            expelResult.memory);

    if (bucket.getRequiredCheckpointMemoryReduction() == 0) {
        // All done
        return ReductionRequired::No;
    }
}
size_t KVBucket::getRequiredCheckpointMemoryReduction() const {
    const auto checkpointMemoryRatio = getCheckpointMemoryRatio();
    const auto checkpointQuota = stats.getMaxDataSize() * checkpointMemoryRatio;
    const auto recoveryThreshold =
            checkpointQuota * getCheckpointMemoryRecoveryUpperMark();
    const auto usage = stats.getCheckpointManagerEstimatedMemUsage();

    if (usage < recoveryThreshold) {
        return 0;
    }

    const auto lowerRatio = getCheckpointMemoryRecoveryLowerMark();
    const auto lowerMark = checkpointQuota * lowerRatio;
    Expects(usage > lowerMark);
    const size_t amountOfMemoryToClear = usage - lowerMark;

    const auto toMB = [](size_t bytes) { return bytes / (1024 * 1024); };
    const auto upperRatio = getCheckpointMemoryRecoveryUpperMark();
    EP_LOG_DEBUG(
            "Triggering memory recovery as checkpoint memory usage ({} MB) "
            "exceeds the upper_mark ({}, "
            "{} MB) - total checkpoint quota {}, {} MB . Attempting to free {} "
            "MB of memory.",
            toMB(usage),
            upperRatio,
            toMB(checkpointQuota * upperRatio),
            checkpointMemoryRatio,
            toMB(checkpointQuota),
            toMB(amountOfMemoryToClear));

    return amountOfMemoryToClear;
}
getRequiredCheckpointMemoryReduction boils down to:
If checkpoint memory usage exceeds high mark:
    -> amount of memory to recover to reach the low mark
else:
    -> 0
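For illustration, a minimal standalone sketch of that arithmetic. The quota, usage and mark values below are hypothetical, chosen only to show that the trigger is the high mark but the recovery target is the low mark; the real values come from the bucket configuration, not from this sketch.

// Hypothetical sketch of the high-mark / low-mark arithmetic above.
#include <cstddef>
#include <cstdio>

size_t requiredReduction(size_t checkpointQuota,
                         double upperMark,
                         double lowerMark,
                         size_t usage) {
    const auto recoveryThreshold =
            static_cast<size_t>(checkpointQuota * upperMark);
    if (usage < recoveryThreshold) {
        return 0; // below the high mark, nothing to recover
    }
    // Recover down to the low mark, not just below the high mark.
    const auto lowerBound = static_cast<size_t>(checkpointQuota * lowerMark);
    return usage - lowerBound;
}

int main() {
    const size_t quota = 100 * 1024 * 1024; // 100 MB checkpoint quota (example)
    const size_t usage = 95 * 1024 * 1024;  // 95 MB current CM usage (example)
    // Example marks: recovery triggers above 90% and aims for 60%.
    std::printf("reduction = %zu MB\n",
                requiredReduction(quota, 0.9, 0.6, usage) / (1024 * 1024));
    // Prints "reduction = 35 MB": 95 MB usage minus the 60 MB low mark.
    return 0;
}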
Checking after every vbucket means expelling will often stop as soon as usage drops slightly below the high mark, rather than continuing down to the low mark.
Anecdotally, this has been seen in cluster run to lead to each run of the ClosedUnrefCheckpointRemoverTask expelling from a single vbucket and then ending. This leads to a lot of logging of:
ClosedUnrefCheckpointRemoverTask:0 Triggering checkpoint memory recovery - attempting to free X MB
and a reduced rate of expelling (as the task needs to be re-triggered/scheduled between each vbucket).
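To make that early exit concrete, here is a small simulation of the control flow only. The vbucket count, per-vbucket checkpoint memory, quota and marks are made up for illustration; this is not kv_engine code.

// Hypothetical simulation of the per-vbucket early exit described above.
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const size_t quotaMB = 100;
    const double upperMark = 0.9; // example: recovery triggers above 90 MB
    const double lowerMark = 0.6; // example: recovery targets 60 MB

    // 32 vbuckets each holding 3 MB of checkpoint memory = 96 MB total.
    std::vector<size_t> vbCheckpointMB(32, 3);
    size_t usageMB = 96;

    size_t expelledFrom = 0;
    for (auto& vbMem : vbCheckpointMB) {
        // Expel everything unreferenced from this vbucket (simplified).
        usageMB -= vbMem;
        vbMem = 0;
        ++expelledFrom;

        // The check after *every* vbucket: stop as soon as usage is back
        // under the upper mark, even though the target is the lower mark.
        if (usageMB < quotaMB * upperMark) {
            break;
        }
    }

    std::printf("expelled from %zu of %zu vbuckets, usage now %zu MB "
                "(upper mark %.0f MB, lower mark %.0f MB)\n",
                expelledFrom, vbCheckpointMB.size(), usageMB,
                quotaMB * upperMark, quotaMB * lowerMark);
    // Prints: expelled from 3 of 32 vbuckets, usage now 87 MB
    // (upper mark 90 MB, lower mark 60 MB) - well above the 60 MB target.
    return 0;
}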
Attachments
Issue Links
- relates to MB-49170 Replica item count lagging active in Magma insert test (Closed)
Under heavy load, checkpoint memory usage may stay above the high mark during expelling; in that case this issue would have little impact.
Under low load, CM mem usage would hover near the high mark. In that case, the load may be low enough that expelling only to the high mark is still sufficient to avoid backpressure.
This may be more problematic in buckets with very small quotas - there may be relatively little headroom beyond the high mark before the full CM quota is hit, leading to more frequent tmp fails than otherwise expected.
Secondarily, this issue will likely lead to faster wrapping of memcached.log due to CM mem usage more frequently reaching the high mark and logging "Triggering checkpoint memory recovery" each time; this has been noted in local repros.
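As a rough illustration of the small-quota headroom point above, a hypothetical comparison (the upper mark and quota values are made up, not bucket defaults):

// Hypothetical headroom comparison for the small-quota concern above.
#include <cstdio>
#include <initializer_list>

int main() {
    const double upperMark = 0.9; // example recovery upper mark
    // Headroom left when expelling stops just under the high mark, i.e. the
    // gap between the high mark and the full checkpoint quota.
    for (double quotaMB : {2048.0, 64.0}) {
        const double headroomMB = quotaMB * (1.0 - upperMark);
        std::printf("CM quota %6.0f MB -> ~%5.1f MB headroom above the high mark\n",
                    quotaMB, headroomMB);
    }
    // A large CM quota keeps ~205 MB of slack, a small one only ~6 MB, so the
    // small bucket reaches the full checkpoint quota (and tmp fails) much sooner.
    return 0;
}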