Couchbase Server / MB-49262

Checkpoint expel stops before low mark

Details

    Description

      Introduced in http://review.couchbase.org/c/kv_engine/+/163330

      After expelling from each vbucket, the expel loop checks whether a further reduction in memory usage is required:

          const auto vbuckets = bucket.getVBuckets().getVBucketsSortedByChkMgrMem();
          for (const auto& it : vbuckets) {
              const auto vbid = it.first;
              VBucketPtr vb = bucket.getVBucket(vbid);
              if (!vb) {
                  continue;
              }
       
              const auto expelResult =
                      vb->checkpointManager->expelUnreferencedCheckpointItems();
              EP_LOG_DEBUG(
                      "Expelled {} unreferenced checkpoint items "
                      "from {} "
                      "and estimated to have recovered {} bytes.",
                      expelResult.count,
                      vbid,
                      expelResult.memory);
       
              if (bucket.getRequiredCheckpointMemoryReduction() == 0) {
                  // All done
                  return ReductionRequired::No;
              }
          }
      

      size_t KVBucket::getRequiredCheckpointMemoryReduction() const {
          const auto checkpointMemoryRatio = getCheckpointMemoryRatio();
          const auto checkpointQuota = stats.getMaxDataSize() * checkpointMemoryRatio;
          const auto recoveryThreshold =
                  checkpointQuota * getCheckpointMemoryRecoveryUpperMark();
          const auto usage = stats.getCheckpointManagerEstimatedMemUsage();
       
          if (usage < recoveryThreshold) {
              return 0;
          }
       
          const auto lowerRatio = getCheckpointMemoryRecoveryLowerMark();
          const auto lowerMark = checkpointQuota * lowerRatio;
          Expects(usage > lowerMark);
          const size_t amountOfMemoryToClear = usage - lowerMark;
       
          const auto toMB = [](size_t bytes) { return bytes / (1024 * 1024); };
          const auto upperRatio = getCheckpointMemoryRecoveryUpperMark();
          EP_LOG_DEBUG(
                  "Triggering memory recovery as checkpoint memory usage ({} MB) "
                  "exceeds the upper_mark ({}, "
                  "{} MB) - total checkpoint quota {}, {} MB . Attempting to free {} "
                  "MB of memory.",
                  toMB(usage),
                  upperRatio,
                  toMB(checkpointQuota * upperRatio),
                  checkpointMemoryRatio,
                  toMB(checkpointQuota),
                  toMB(amountOfMemoryToClear));
       
          return amountOfMemoryToClear;
      }
      

      getRequiredCheckpointMemoryReduction boils down to:

      If checkpoint memory usage exceeds high mark:
       -> amount of memory to recover to reach the low mark
      else:
       -> 0
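
      The summary above can be made concrete with a minimal standalone sketch. The struct, the function name and the whole-percentage marks are hypothetical simplifications and not kv_engine code; only the control flow is meant to match getRequiredCheckpointMemoryReduction.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the bucket's checkpoint settings. The marks are
// whole percentages of the quota so that the arithmetic stays exact.
struct CheckpointMarks {
    std::size_t quotaBytes;
    std::size_t upperMarkPercent;
    std::size_t lowerMarkPercent;
};

// Same shape as getRequiredCheckpointMemoryReduction(): returns 0 below the
// upper mark, otherwise the distance from the current usage down to the
// lower mark.
std::size_t requiredReduction(const CheckpointMarks& marks, std::size_t usage) {
    const std::size_t upper = marks.quotaBytes * marks.upperMarkPercent / 100;
    if (usage < upper) {
        return 0;
    }
    const std::size_t lower = marks.quotaBytes * marks.lowerMarkPercent / 100;
    assert(usage > lower); // stands in for Expects(usage > lowerMark)
    return usage - lower;
}
```

      With illustrative values of a 1000 byte quota, a 90% upper mark and a 60% lower mark, a usage of 950 yields 350, while a usage of 899 yields 0 even though it is still 299 bytes above the lower mark.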
      

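      Because getRequiredCheckpointMemoryReduction returns 0 as soon as usage is below the high mark, checking it after every vbucket means expelling will often stop slightly below the high mark instead of continuing down to the low mark.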

      Anecdotally, in a cluster run this has been seen to cause each run of the ClosedUnrefCheckpointRemoverTask to expel from a single vbucket and then end. This leads to a lot of logging of:

      ClosedUnrefCheckpointRemoverTask:0 Triggering checkpoint memory recovery - attempting to free X MB
      

      and a reduced rate of expelling (as the task needs to be retriggered/scheduled between each vbucket).
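
      The early stop can be reproduced with a small standalone simulation. Everything in it is hypothetical (the Sim struct, the function names and the byte values); it models the control flow only. The second function shows one possible alternative, computing the target once before the loop, and is an assumption for comparison rather than a description of how kv_engine resolves this issue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the bucket state; not kv_engine types.
struct Sim {
    std::size_t upper; // usage (bytes) at which recovery triggers
    std::size_t lower; // usage (bytes) that recovery aims for
    std::size_t usage; // current checkpoint memory usage (bytes)
};

// Same shape as getRequiredCheckpointMemoryReduction().
std::size_t requiredReduction(const Sim& s) {
    return s.usage < s.upper ? 0 : s.usage - s.lower;
}

// Behaviour described in this issue: the trigger condition is re-evaluated
// after every vbucket, so the loop ends once usage dips below the upper mark.
// Returns the number of vbuckets visited.
std::size_t expelRecheckingTrigger(Sim& s,
                                   const std::vector<std::size_t>& expellable) {
    std::size_t visited = 0;
    for (const auto bytes : expellable) {
        assert(bytes <= s.usage);
        s.usage -= bytes;
        ++visited;
        if (requiredReduction(s) == 0) {
            break;
        }
    }
    return visited;
}

// Assumed alternative: compute the target once, then keep expelling until
// that many bytes have been recovered. Returns the number of vbuckets visited.
std::size_t expelToTarget(Sim& s, const std::vector<std::size_t>& expellable) {
    const std::size_t target = requiredReduction(s);
    std::size_t recovered = 0;
    std::size_t visited = 0;
    for (const auto bytes : expellable) {
        if (recovered >= target) {
            break;
        }
        assert(bytes <= s.usage);
        s.usage -= bytes;
        recovered += bytes;
        ++visited;
    }
    return visited;
}
```

      Starting from a usage of 950 with marks at 900 and 600 and eight vbuckets that each free 60 bytes, the first loop visits one vbucket and stops at 890, whereas the second visits six and stops at 590, just under the low mark.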

            People

              james.harrison James Harrison