  1. Couchbase Server
  2. MB-59368

Ephemeral auto_delete bucket can trigger pager constantly with no memory reduction.


Details

    • Bug
    • Resolution: Fixed
    • Major
    • 7.6.0, 7.2.4
    • 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.1.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.2
    • couchbase-bucket
    • Untriaged
    • 0
    • Unknown
    • KV 2023-4

    Description

      This bug affects ephemeral buckets that use auto_delete.

      There is a mismatch between the quantities used to trigger memory recovery and those used to stop it. A bucket can be sized and utilised in such a way that memory recovery is constantly triggered only to be cancelled immediately, leading to noticeable CPU usage with no memory reduction. In this case the memory recovery mechanism is the ItemPager.

      The problem is that the functions used in the trigger code have separate persistent and ephemeral implementations, with the ephemeral code deriving adjusted values that aim to reflect active memory only (because replicas cannot be included in memory recovery). Later, the ItemPager uses unadjusted quantities, leading to early termination with 0 memory recovered.

      In the case which triggered this MB, a 27-node cluster has a small ephemeral bucket with a 110MiB quota and 38 replica and 37 active vbuckets (75 in total):

      • high-water mark = 93.5MiB
        • getPageableMemHighWatermark adjusted high-water mark = 46.12MiB
      • low-water mark = 82.5MiB
        • getPageableMemLowWatermark adjusted low-water mark = 40.7MiB
      • mem_used = 78MiB
        • getPageableMemCurrent is harder to confirm, as we don't track the two replica memory values used in the implementation. However, it has to be 46.12MiB or higher to trigger the ItemPager.

      So with mem_used sitting at 78MiB, the adjusted values are enough to trigger the ItemPager, but mem_used is below the unadjusted low-water mark, so the pager does nothing.
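The arithmetic behind the figures above can be checked with a short Python sketch. It is illustrative only and is not the kv_engine implementation: the scaling rule (watermark multiplied by the active share of vbuckets, 37/75) is an assumption inferred from the reported numbers, `pageable_watermark` is a hypothetical helper, and the 47MiB value for `pageable_current` is a hypothetical stand-in, since the real value was not recorded.

```python
MIB = 1024 * 1024


def pageable_watermark(watermark, active_vbs, replica_vbs):
    """Scale a watermark to the share held by active vbuckets.

    ASSUMPTION: this ratio is inferred from the numbers in this ticket,
    not taken from the getPageableMem* implementations.
    """
    return watermark * active_vbs / (active_vbs + replica_vbs)


# Figures reported in the ticket.
high_wat = 93.5 * MIB
low_wat = 82.5 * MIB
mem_used = 81_788_928  # 78MiB
active_vbs, replica_vbs = 37, 38

pageable_high = pageable_watermark(high_wat, active_vbs, replica_vbs)
pageable_low = pageable_watermark(low_wat, active_vbs, replica_vbs)

# ASSUMPTION: hypothetical value; the ticket only says it must be
# at least the adjusted high-water mark for the pager to trigger.
pageable_current = 47 * MIB

# Trigger check uses the adjusted (active-only) quantities.
pager_triggered = pageable_current >= pageable_high

# Stop check uses the unadjusted quantities.
pager_has_work = mem_used > low_wat

print(f"adjusted high-water mark: {pageable_high / MIB:.3f} MiB")
print(f"adjusted low-water mark:  {pageable_low / MIB:.3f} MiB")
print(f"triggered={pager_triggered}, has_work={pager_has_work}")
```

With these inputs the pager is triggered but finds nothing to do, which matches the behaviour described: each run terminates early and the trigger condition remains true.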

      This issue is reproducible if we recreate something similar to the above. In this case trinity code was used (cluster_run).

      • COUCHBASE_NUM_VBUCKETS=60
      • cluster_run -n2
      • 1x ephemeral bucket with auto_delete and 110MiB quota.
      • Use pillowfight to keep inserting items until mem_used reaches ~78MiB (e.g. 81,788,928 bytes)

      Reproduction can be confirmed by tracking the ep_num_pager_runs statistic, which increases rapidly.
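One way to quantify how fast the counter climbs is to take two snapshots of the stats output a known interval apart and compute the rate. The Python sketch below assumes the snapshots are plain text with one `name: value` pair per line (the layout cbstats-style tools typically print); the helper names and the sample values are hypothetical and are not captured output.

```python
import re

# Matches a line such as " ep_num_pager_runs:      1234".
# ASSUMPTION: stats are printed one per line as "name: integer".
_PAGER_RUNS = re.compile(r"^\s*ep_num_pager_runs:\s*(\d+)\s*$", re.MULTILINE)


def read_pager_runs(stats_text):
    """Extract the ep_num_pager_runs counter from a stats snapshot."""
    match = _PAGER_RUNS.search(stats_text)
    if match is None:
        raise ValueError("ep_num_pager_runs not found in stats output")
    return int(match.group(1))


def pager_runs_per_second(first_snapshot, second_snapshot, interval_seconds):
    """Rate of ItemPager runs between two snapshots taken interval_seconds apart."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = read_pager_runs(second_snapshot) - read_pager_runs(first_snapshot)
    return delta / interval_seconds


# Illustrative snapshots only; the numbers are made up.
before = " ep_num_items: 5\n ep_num_pager_runs:      100\n"
after = " ep_num_items: 5\n ep_num_pager_runs:      400\n"
rate = pager_runs_per_second(before, after, 10.0)
```

A sustained non-zero rate while mem_used stays flat would be consistent with the pager being triggered repeatedly without recovering memory.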


            People

              ashwin.govindarajulu Ashwin Govindarajulu
              jwalker Jim Walker
