  1. Couchbase Server
  2. MB-55537

Add bucket RSS-based memory condition trigger

Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • Morpheus
    • master
    • couchbase-bucket
    • None
    • 0

    Description

      We've seen cases where memcached's resident set size (RSS) is much higher than the total size of its live allocations. In the cases we've seen, we believe this is due to some combination of the following:

      • Memory fragmentation caused by KV-Engine features that can move or delete documents unexpectedly, such as:
        • item compression (moves Blobs around)
        • item compression + Magma BG fetched items being uncompressed (until MB-53859)
        • document deletion/expiration
        • possibly others
      • Incorrect per-bucket RSS due to MB-55268, which affects the rate at which the defragmenter task can run

      In KV-Engine, we have ways to reduce memory usage or apply back pressure to memory consumers, including:

      • item eviction until low_wat
      • cursor dropping from the checkpoint manager 
      • temp OOM error codes returned to the front end
      • pausing DCP backfills
      • ...

      Currently, these are triggered based on the size of active allocations but not on RSS, so they do not account for memory held due to fragmentation.
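
      To make the gap concrete, here is a minimal Python sketch with made-up numbers. The names `BucketMemory`, `should_reclaim_by_allocation`, and `fragmentation_overhead`, as well as the 85% high watermark, are illustrative assumptions and not KV-Engine identifiers or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketMemory:
    """Hypothetical snapshot of one bucket's memory counters, in bytes."""
    allocated: int  # bytes held by live allocations
    rss: int        # resident bytes attributed to the bucket
    max_size: int   # bucket quota
    high_wat: int   # allocation level at which reclamation starts


def should_reclaim_by_allocation(mem: BucketMemory) -> bool:
    """Allocation-only condition: RSS is never consulted."""
    return mem.allocated > mem.high_wat


def fragmentation_overhead(mem: BucketMemory) -> float:
    """Fraction of resident memory not backed by live allocations."""
    if mem.rss == 0:
        return 0.0
    return max(0, mem.rss - mem.allocated) / mem.rss


MiB = 1024 * 1024
# Assumed example: 1000 MiB quota, high watermark at 85% of the quota.
fragmented = BucketMemory(allocated=700 * MiB, rss=1200 * MiB,
                          max_size=1000 * MiB, high_wat=850 * MiB)

print(should_reclaim_by_allocation(fragmented))  # False: no trigger fires
print(fragmented.rss > fragmented.max_size)      # True: RSS is over quota
print(round(fragmentation_overhead(fragmented), 3))
```

      In this assumed scenario the bucket sits 200 MiB above its quota in resident memory, yet the allocation-based check sees nothing to do.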

      Proposed Solution

      We might want to extend these triggers so that they also fire when the bucket’s RSS exceeds some threshold, and certainly when it exceeds max_size.
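
      One possible shape for such a combined condition is sketched below in Python. This is an illustration under stated assumptions, not a proposed implementation: `memory_trigger`, `Trigger`, and the `rss_threshold` tunable are hypothetical names, and clamping the threshold to max_size is one reading of "certainly if higher than max_size".

```python
from enum import Enum
from typing import Optional


class Trigger(Enum):
    NONE = "none"
    ALLOCATION = "allocation"
    RSS = "rss"


def memory_trigger(allocated: int, rss: int, high_wat: int, max_size: int,
                   rss_threshold: Optional[int] = None) -> Trigger:
    """Return which condition, if any, should start memory recovery.

    rss_threshold is a hypothetical tunable. It is clamped to max_size so
    that RSS above the bucket quota always triggers, whatever value is set.
    """
    limit = max_size if rss_threshold is None else min(rss_threshold, max_size)
    if allocated > high_wat:
        return Trigger.ALLOCATION  # the existing allocation-based check
    if rss > limit:
        return Trigger.RSS         # the additional RSS-based check
    return Trigger.NONE


# Allocations are below the watermark, but RSS is above the quota.
print(memory_trigger(allocated=700, rss=1200, high_wat=850, max_size=1000))
```

      Returning the reason rather than a plain boolean would let callers choose a different response per condition, for example running eviction for allocation pressure but prioritising defragmentation for RSS pressure. That split is a design assumption, not something the ticket specifies.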

      In addition to the above measures, we might want to temporarily stop tasks that can add fragmentation, such as the item compression task, while the bucket is in that high-RSS state, and resume them once RSS returns to normal, to avoid further heap fragmentation for that bucket.
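
      The ticket does not define "normal". One way to avoid pausing and resuming a task on every small fluctuation is hysteresis with separate enter and exit levels, sketched below in Python. `RssPressureGate`, `enter_at`, and `exit_at` are hypothetical names introduced for illustration, and the hysteresis itself is an assumption rather than part of the proposal.

```python
class RssPressureGate:
    """Tracks whether a bucket is in a high-RSS state, with hysteresis.

    Hypothetical helper: the state is entered when RSS rises above
    `enter_at` and left only once RSS falls to `exit_at` or below.
    """

    def __init__(self, enter_at: int, exit_at: int) -> None:
        if exit_at > enter_at:
            raise ValueError("exit_at must not exceed enter_at")
        self.enter_at = enter_at
        self.exit_at = exit_at
        self.high_rss = False

    def update(self, rss: int) -> bool:
        """Record an RSS sample; return True if fragmenting tasks may run."""
        if self.high_rss:
            if rss <= self.exit_at:
                self.high_rss = False
        elif rss > self.enter_at:
            self.high_rss = True
        return not self.high_rss


gate = RssPressureGate(enter_at=1000, exit_at=900)
for sample in (800, 1050, 950, 890):
    # A scheduler could skip the item compression task when this is False.
    print(sample, gate.update(sample))
```

      With these sample values the task stays paused at 950, even though that is below the enter level, and resumes only at 890.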

            People

              owend Daniel Owen
              vesko.karaganev Vesko Karaganev