Details
Bug
Resolution: Fixed
Major
7.2.5, 7.6.1
Triaged
0
No
Description
Flush stats (fdSz and hdrSz) are under-accounted in newPgOperator for complex pages with merge deltas: the stale sizes it reports after compaction omit part of the page's flushed data. Because too little is subtracted, the in-memory FlushDataSz and FlushHdrSz end up over-accounted.
Consequently, LSS cleaners compute fragmentation incorrectly and run less frequently, so stale data accumulates on disk and causes disk bloat. This is particularly evident in workloads with frequent merges (e.g., time-series data). A slow mutation rate aggravates the issue.
Example:
After the merge delta is added, suppose the parent page looks like this:
"low:": <ud>(key- 401, sn:2, insert:true)</ud> (len:44), |
"high:": maxItem (len:7), |
"chainLen:": 3, |
"numItems:": 0, |
"state:": 8006, |
"version:": 6, |
"flushed:": true, |
"evicted:": false, |
"compressed:": false |
|
0 merge: op compress[false]purge[false]empty[false]op[opPageMergeDelta] |
0 delta: op compress[false]purge[false]empty[false]op[opPageRemoveDelta] ptr[0x10eb4c000] |
1 flush: op compress[false]purge[false]empty[false]op[opRelocPageDelta] NumRecords 0 NumSegments 1 bloomFilter: <nil> flushDataSz: 43, flushHdrSz:80 |
2 base: |
1 flush: op compress[false]purge[false]empty[false]op[opRelocPageDelta] NumRecords 0 NumSegments 1 bloomFilter: <nil> flushDataSz: 53, flushHdrSz:124 |
2 base: |
After compaction, we'd expect staleDataSz to be 43+53=96 and staleHdrSz to be 80+124=204, i.e. the sum over both flush deltas in the chain.
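The expected totals can be reproduced with a minimal sketch in Go. This is not Plasma's code: the `delta` type, its fields, `staleSizes` and `examplePage` are hypothetical names invented for illustration, and assigning the 43/80 flush to the merged chain and the 53/124 flush to the parent's own chain is only one reading of the dump above. The point is that a walk over the chain has to descend into the merge delta as well as follow the page's own chain.

```go
package main

import "fmt"

// delta is a hypothetical, simplified stand-in for one entry in a page's
// delta chain. It is not Plasma's real type.
type delta struct {
	op          string // e.g. "opPageMergeDelta", "opRelocPageDelta"
	flushDataSz int
	flushHdrSz  int
	merged      *delta // for a merge delta: head of the merged-in chain
	next        *delta // next entry in this page's own chain
}

// staleSizes sums the flush sizes of every flush delta reachable from head,
// including flush deltas that sit underneath a merge delta.
func staleSizes(head *delta) (dataSz, hdrSz int) {
	for d := head; d != nil; d = d.next {
		if d.op == "opRelocPageDelta" {
			dataSz += d.flushDataSz
			hdrSz += d.flushHdrSz
		}
		if d.merged != nil {
			md, mh := staleSizes(d.merged)
			dataSz += md
			hdrSz += mh
		}
	}
	return
}

// examplePage rebuilds the chain from the dump (assumed layout, see lead-in).
func examplePage() *delta {
	mergedChain := &delta{
		op:   "opPageRemoveDelta",
		next: &delta{op: "opRelocPageDelta", flushDataSz: 43, flushHdrSz: 80},
	}
	ownFlush := &delta{op: "opRelocPageDelta", flushDataSz: 53, flushHdrSz: 124}
	return &delta{op: "opPageMergeDelta", merged: mergedChain, next: ownFlush}
}

func main() {
	dataSz, hdrSz := staleSizes(examplePage())
	fmt.Println(dataSz, hdrSz) // 96 204
}
```

A walk that stops after the first flush delta would return only 43 and 80, which matches the values reported in this ticket.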
After compaction, the page becomes:
Plasma:
"low:": <ud>(key- 401, sn:2, insert:true)</ud> (len:44), |
"high:": maxItem (len:7), |
"chainLen:": 0, |
"numItems:": 0, |
"state:": 7, |
"version:": 7, |
"flushed:": false, |
"evicted:": false, |
"compressed:": false |
|
0 base: |
But the staleDataSz returned is 43 and the staleHdrSz returned is 80.
This causes us to under-subtract the flush stats (here by 53 and 124 bytes respectively), which eventually leads to an over-counting of these stats in memory.
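To show how the shortfall accumulates, here is a small Go sketch that applies the numbers from this example repeatedly. The `flushStats` type, its `retire` method and the `drift` function are hypothetical and only model the bookkeeping described in this ticket. The sketch assumes every compaction behaves like the example page: 96/204 bytes are flushed, but only 43/80 are reported as stale.

```go
package main

import "fmt"

// flushStats is a hypothetical stand-in for the in-memory counters
// FlushDataSz and FlushHdrSz.
type flushStats struct {
	FlushDataSz int64
	FlushHdrSz  int64
}

// retire subtracts the stale sizes reported for a compacted page.
func (s *flushStats) retire(staleDataSz, staleHdrSz int64) {
	s.FlushDataSz -= staleDataSz
	s.FlushHdrSz -= staleHdrSz
}

// drift returns how far the counters sit above their correct values after
// n compactions that each look like the example page (an assumption).
func drift(n int) (dataDrift, hdrDrift int64) {
	var actual, expected flushStats
	for i := 0; i < n; i++ {
		// Both flush deltas of the example page were accounted on flush.
		actual.FlushDataSz += 43 + 53
		actual.FlushHdrSz += 80 + 124
		expected.FlushDataSz += 43 + 53
		expected.FlushHdrSz += 80 + 124

		actual.retire(43, 80)    // sizes actually returned
		expected.retire(96, 204) // sizes that should have been returned
	}
	return actual.FlushDataSz - expected.FlushDataSz,
		actual.FlushHdrSz - expected.FlushHdrSz
}

func main() {
	d, h := drift(1000)
	fmt.Println(d, h) // 53000 124000
}
```

Under these assumptions the excess grows linearly with the number of affected compactions and is never corrected in memory, which is consistent with the cleaners treating the log as less fragmented than it really is.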
The stats persisted on disk are correct. Because of this, recovery is able to correct the situation.
Workaround:
Recovery log blocks persist flushDataSz and flushHdrSz correctly. During recovery, the in-memory stats FlushDataSz and FlushHdrSz are recomputed from them, so restarting the indexer process serves as a temporary fix.