Details
- Improvement
- Resolution: Unresolved
- Major
- master
- None
- 0
Description
We've seen cases where the RSS of memcached is much higher than the size of its active allocations. In the cases we've seen, we believe this is due to a combination or subset of the following:
- memory fragmentation due to KV-Engine features which can result in documents being moved/deleted unexpectedly, like:
  - item compression (moves Blobs around)
  - item compression + Magma BG fetched items being uncompressed (until MB-53859)
  - document deletion/expiration
  - possibly others
- incorrect per-bucket RSS due to MB-55268, which affects the rate at which the defragmenter task can run
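To make the gap between RSS and allocated memory concrete, here is a minimal sketch of one way to express it as a ratio. The function name and the formula are illustrative assumptions. This ticket does not state how KV-Engine or its defragmenter actually measures fragmentation.

```python
def fragmentation_ratio(allocated_bytes: int, rss_bytes: int) -> float:
    """Fraction of resident memory that is not backed by active allocations.

    Hypothetical helper for illustration only; not a KV-Engine API.
    """
    if rss_bytes <= 0:
        return 0.0
    # Clamp at zero: stats sampled at different times can briefly show
    # allocated > rss, which should not yield a negative ratio.
    overhead = max(rss_bytes - allocated_bytes, 0)
    return overhead / rss_bytes


GiB = 1024 ** 3
# Example: 6 GiB in active allocations while 8 GiB is resident.
ratio = fragmentation_ratio(6 * GiB, 8 * GiB)
print(f"{ratio:.2f}")
```

A trigger that looks only at `allocated_bytes` sees 6 GiB in this example and misses the extra 2 GiB that the process still holds.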
In KV-Engine, we have ways to reduce memory usage or apply back pressure to memory consumers, including:
- item eviction until low_wat
- cursor dropping from the checkpoint manager
- temp OOM error codes returned to the front end
- pausing DCP backfills
- ...
Currently, these are triggered based on the size of active allocations but not on RSS, so they don't account for memory usage due to fragmentation.
Proposed Solution
We might want to extend these triggers to also run when the bucket's RSS is higher than some threshold, and certainly when it is higher than max_size.
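The extended trigger condition could look roughly like the sketch below. It assumes an accurate per-bucket RSS is available (which depends on MB-55268). `BucketMemory`, `alloc_threshold`, `rss_threshold`, and `needs_memory_recovery` are hypothetical names chosen for illustration, not existing KV-Engine identifiers. Only `max_size` comes from this ticket.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BucketMemory:
    allocated: int        # bytes in active allocations (what triggers use today)
    rss: int              # per-bucket resident set size, assumed accurate
    alloc_threshold: int  # hypothetical: existing allocation-based trigger level
    max_size: int         # bucket quota


def needs_memory_recovery(mem: BucketMemory,
                          rss_threshold: Optional[int] = None) -> bool:
    """Return True if memory-recovery measures should run for this bucket.

    Assumption: when no explicit RSS threshold is configured, fall back to
    max_size, the level at which the ticket says triggers should certainly run.
    """
    if rss_threshold is None:
        rss_threshold = mem.max_size
    over_allocated = mem.allocated > mem.alloc_threshold  # current behaviour
    over_rss = mem.rss > rss_threshold                    # proposed addition
    return over_allocated or over_rss


# Allocations are below their threshold, but RSS exceeds the quota.
fragmented = BucketMemory(allocated=700, rss=1100,
                          alloc_threshold=850, max_size=1000)
print(needs_memory_recovery(fragmented))
```

Combining the two checks with `or` leaves the existing allocation-based behaviour unchanged and adds RSS only as an extra reason to trigger.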
In addition to the above measures, we might want to temporarily stop tasks such as the item compression task from running while in that high memory usage state, to avoid further heap fragmentation for that bucket, until RSS goes back to normal.
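One possible way to pause such tasks "until RSS goes back to normal" is a gate with two thresholds, so that tasks do not flap on and off when RSS hovers around a single value. This is a sketch under assumptions: the class name, both threshold parameters, and the idea of using hysteresis are illustrative and not part of the proposal as written.

```python
class FragmentingTaskGate:
    """Decides whether fragmentation-prone tasks (e.g. item compression)
    may run for a bucket. Hypothetical helper, not a KV-Engine class."""

    def __init__(self, pause_above: int, resume_below: int) -> None:
        if resume_below > pause_above:
            raise ValueError("resume_below must not exceed pause_above")
        self.pause_above = pause_above    # enter the high-memory state above this RSS
        self.resume_below = resume_below  # leave it only once RSS drops below this
        self.paused = False

    def update(self, rss: int) -> bool:
        """Record the latest RSS sample; return True if tasks may run."""
        if self.paused:
            if rss < self.resume_below:
                self.paused = False
        elif rss > self.pause_above:
            self.paused = True
        return not self.paused


gate = FragmentingTaskGate(pause_above=1000, resume_below=900)
for sample in (950, 1001, 950, 899):
    print(sample, gate.update(sample))
```

With these illustrative values, the second sample of 950 keeps tasks paused even though it is below `pause_above`. Tasks resume only after RSS falls below `resume_below`.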
Issue Links
- is blocked by MB-55268 Incorrect memory stats accounting due to automatic jemalloc tcache selection (Closed)