  1. Couchbase Server
  2. MB-55268

Incorrect memory stats accounting due to automatic jemalloc tcache selection


Details

    • Untriaged
    • 0
    • Unknown
    • KV 2023-2

    Description

      We can cause the ep_arena_global:allocated/kv_daemon_memory_allocated_bytes/non-bucket allocation size to grow to an arbitrary amount by running a mutation-heavy workload with item sizes <4k.

      I've managed to reproduce this on 7.1.1, 7.1.2, 7.1.3, and current master, with both Magma and Couchstore buckets.

      Repro
      To reproduce:
      1. Create a bucket (bucket configuration appears irrelevant)
      2. Run cbc-pillowfight -I 10000000 -m 4000 -M 4000 -U couchbase://localhost/test -u Administrator -P $PASSWORD -r 100
      3. Observe the kv_daemon_memory_allocated_bytes climb to ~7 GiB.

      Here, the non-bucket allocation is kv_daemon_memory_allocated_bytes. Note that KV memory usage (kv_mem_used_bytes) plus the non-bucket allocation exceeds the process memory usage, which cannot be correct because no swap is configured.
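
The consistency check described above can be sketched as a small helper. This is only an illustration: the function name and the sample values are hypothetical, and how the three stats are collected is left out. It assumes no swap is configured, so tracked allocations should never exceed the resident size of the process.

```python
GIB = 1024 ** 3


def accounting_excess(kv_mem_used_bytes,
                      kv_daemon_memory_allocated_bytes,
                      process_rss_bytes):
    """Return how many bytes the tracked allocations exceed the process RSS.

    A result of 0 means the stats are consistent with the process size.
    Hypothetical helper: assumes no swap, so allocated memory cannot
    legitimately be larger than resident memory.
    """
    tracked = kv_mem_used_bytes + kv_daemon_memory_allocated_bytes
    return max(0, tracked - process_rss_bytes)


# Illustrative values only (not measurements from this issue).
excess = accounting_excess(2 * GIB, 7 * GIB, 4 * GIB)
print(f"over-accounted by {excess / GIB:.1f} GiB")
```

A non-zero result does not say which stat is wrong, only that the two together cannot both be accurate.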

      Cause
      I've managed to track this bug down to https://review.couchbase.org/c/platform/+/174991 (introduced in 7.1.1). Reverting that change fixes the inaccurate stat.

      Before the change, we used to manually initialise and toggle jemalloc's tcache. The change removes that manual control and instead leaves it up to jemalloc. The change also suggests "possible some "drift" in the ArenaMalloc stats". 

      More work is necessary to determine exactly why this is manifesting in this way.

      Impact

      One of the ways in which we manage memory fragmentation in KV-Engine is by actively defragmenting item allocations. We do this using a background task called the DefragmenterTask which runs for each bucket. The rate at which this task runs is determined by the defragmenter_mode configuration parameter, which can be one of auto_pid (the default), auto_linear and static (the old default).

      The only place this stat is used is the DefragmenterTask, and only in the auto_pid (default) and auto_linear modes. These two modes use kv_daemon_memory_allocated_bytes and kv_daemon_memory_resident_bytes (both stats are affected) to estimate fragmentation and to increase or decrease the defragmentation rate.

      In effect, when this stat is incorrect, the defragmenter could run at an unexpected rate, allowing for greater than normal memory fragmentation.
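
To illustrate why a drifting stat matters here, the sketch below computes a fragmentation estimate from allocated and resident bytes. The formula (unused resident memory as a fraction of resident memory) is an assumed, commonly used definition and is not confirmed to be the exact calculation in KV-Engine. The function name and the values are hypothetical.

```python
GIB = 1024 ** 3


def fragmentation_ratio(allocated_bytes, resident_bytes):
    """Estimate fragmentation as the share of resident memory not allocated.

    Assumed definition for illustration only; clamped to 0.0 so that an
    over-reported allocated value cannot produce a negative ratio.
    """
    if resident_bytes <= 0:
        return 0.0
    return max(0.0, (resident_bytes - allocated_bytes) / resident_bytes)


# Accurate stats: 3 GiB allocated out of 4 GiB resident.
accurate = fragmentation_ratio(3 * GIB, 4 * GIB)

# Drifted stats: allocated over-reported beyond the resident size.
drifted = fragmentation_ratio(7 * GIB, 4 * GIB)

print(accurate, drifted)
```

Under this assumed definition, an over-reported allocated value hides fragmentation from any controller that consumes the ratio, which is consistent with the defragmenter running at an unexpected rate.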

      We're not aware of any other possible adverse effects and we're not using this stat for anything else internally.
      All other memory stats are accounted for independently and will have their correct values.

      Issue Resolution
      An allocation cache (tcache) shared between buckets resulted in stats drift, which caused higher-than-normal memory fragmentation. Buckets now use dedicated tcaches, and jemalloc has been changed to support the increased number of tcaches.

            People

              vesko.karaganev Vesko Karaganev