Details
-
Bug
-
Resolution: Fixed
-
Critical
-
5.0.1, 5.1.3, 5.5.6, 6.0.5, 6.5.1, 6.6.1, 6.5.2, Cheshire-Cat
-
Triaged
-
1
-
Yes
-
KV-Engine 2021-Feb
Description
HashTable::clear(), as used during Bucket flush to remove all items from the HashTable, does not reset all statistics correctly. The following statistics retain their old values:
- numDeletedItems - used to calculate curr_items stat amongst others.
- numSystemItems - used to calculate curr_items stat amongst others.
- numPreparedSyncWrites - used to calculate curr_items stat amongst others.
- metaDataMemory - used by ItemPager to calculate pageable memory.
(Identified during investigation of MB-44452).
This issue dates back to 5.0.0, when numDeletedItems was added to HashTable, but wasn't reset - see http://review.couchbase.org/c/ep-engine/+/74130. When subsequent similar counters were added (numSystemItems, numPreparedSyncWrites) the same pattern was repeated.
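To illustrate the pattern described above, here is a minimal sketch of a clear() that cannot miss a newly added counter. It is not the kv_engine implementation: HashTableStats, SketchHashTable and allZero() are hypothetical names, the counters are plain integers rather than whatever types the real HashTable uses, and only the counter names are taken from this ticket.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for the statistics a HashTable tracks.
// Only the field names come from the ticket; the struct is illustrative.
struct HashTableStats {
    std::size_t numItems = 0;
    std::size_t numDeletedItems = 0;
    std::size_t numSystemItems = 0;
    std::size_t numPreparedSyncWrites = 0;
    std::size_t metaDataMemory = 0;

    // Reset every counter by assigning a value-initialised object, so a
    // counter added later is covered without touching this function.
    void reset() { *this = HashTableStats{}; }

    // True when every counter is back at its initial value.
    bool allZero() const {
        return numItems == 0 && numDeletedItems == 0 &&
               numSystemItems == 0 && numPreparedSyncWrites == 0 &&
               metaDataMemory == 0;
    }
};

// Hypothetical table wrapper; item storage is elided.
struct SketchHashTable {
    HashTableStats stats;

    void clear() {
        // (removal of the stored items would happen here)
        stats.reset();
    }
};
```

Grouping the counters and resetting them as a unit is one possible way to avoid repeating the omission each time a counter is added; the actual fix may well have taken a different form.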
Impact
If a bucket is flushed when any of the above numXXXItems counts is non-zero, then the value of curr_items after the flush operation will not start at zero. This will result in the item counts for that Bucket being biased by the cleared amount, essentially forever. To encounter this, one must:
1. Have at least one of the above counters be non-zero.
2. Issue a Flush.
For (1), a Persistent Bucket in a quiesced state should have 0 Deleted items, 0 System items and 0 prepared SyncWrites, so the issue shouldn't occur. However, if the bucket wasn't quiesced (SyncWrites in progress, deleted items being BG fetched) then they could be non-zero and the issue could be hit.
For Ephemeral buckets the likelihood is greater - all three item types are kept in-memory for extended periods of time.
As such, increasing the severity to Critical, given that many things in the system rely on accurate item counts.
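The bias described under Impact can be worked through with a small numeric sketch. The formula below (curr_items as the total item count minus the deleted, system and prepared counts) is an assumption made for illustration, as is the premise that the total count is the one value the faulty clear does reset; Counters, currItems and buggyClear are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical snapshot of the counters that feed curr_items.
struct Counters {
    std::int64_t numItems;
    std::int64_t numDeletedItems;
    std::int64_t numSystemItems;
    std::int64_t numPreparedSyncWrites;
};

// Assumed formula: live items are the total minus everything that is
// not a normal, committed, alive document.
std::int64_t currItems(const Counters& c) {
    return c.numItems - c.numDeletedItems - c.numSystemItems -
           c.numPreparedSyncWrites;
}

// Models the faulty flush: the total is zeroed, the other three
// counters keep their pre-flush values.
Counters buggyClear(Counters c) {
    c.numItems = 0;
    return c;
}
```

Under these assumptions, a table holding 100 items, of which 5 are deleted, reports 95 before the flush. After buggyClear the stale deleted count remains, so storing 10 new live items yields 10 - 5 = 5 rather than 10, and the offset of 5 persists for every later reading.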