Fixed
Details
Assignee: Ashwin Govindarajulu
Reporter: Vesko Karaganev
Story Points: 0
Priority: Major
Created March 13, 2023 at 3:05 PM
Updated September 19, 2023 at 11:19 AM
Resolved April 14, 2023 at 2:42 PM
In the auto_pid mode, we use a PID controller to calculate the sleep time for the DefragmenterTask.
The sleep time can vary between defragmenter_auto_min_sleep (default: 0.6s) and
defragmenter_auto_max_sleep (default: 10s). We calculate the output of the PID controller and subtract that from the max sleep time.
The input to the PID controller is the current scoredFragmentation. The PID controller internally uses an "error value" which is the difference between the input and the set point (scoredFragmentation, default 7%). The error is essentially how far away we are from the target fragmentation of 7%.
The PID controller has 3 terms, of which we use 2 (the third, the derivative term, is set to 0):
The proportional term: the error multiplied by a constant factor P
The integral term: the accumulation of all errors, multiplied by a constant factor I
The PID controller output is recalculated every defragmenter_auto_pid_dt (default: 30 seconds).
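The calculation described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class and function names are hypothetical, the clamping to the min/max sleep bounds is inferred from the stated range, and scaling the accumulated error by the time step is an assumption about how the integral term is built.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Hypothetical PI controller: a PID controller with the derivative term at 0."""

    set_point: float       # target scored fragmentation, e.g. 0.07 for 7%
    kp: float              # constant factor P for the proportional term
    ki: float              # constant factor I for the integral term
    dt: float              # seconds between recalculations, e.g. 30.0
    integral: float = 0.0  # accumulated error so far

    def step(self, measured: float) -> float:
        """Recalculate the controller output from the current fragmentation."""
        error = measured - self.set_point
        # Assumption: the accumulated error is weighted by the time step.
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def sleep_time(output: float, min_sleep: float = 0.6, max_sleep: float = 10.0) -> float:
    """Subtract the controller output from the max sleep, kept within bounds."""
    return min(max_sleep, max(min_sleep, max_sleep - output))


# Example: one recalculation at 30% fragmentation with the integral factor
# set to 0 so that only the proportional term contributes.
pid = PIController(set_point=0.07, kp=0.3, ki=0.0, dt=30.0)
print(sleep_time(pid.step(0.30)))
```

With the integral factor at 0 the output depends only on the current error, which makes it easy to see how little the proportional term moves the sleep time on its own.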
P is currently set to 0.3. For 30% scored fragmentation, we'd get a proportional term of about 0.3 x 0.3 = 0.09s of sleep reduction (slightly less, 0.3 x 0.23 = 0.069s, once the 7% set point is subtracted from the input). So even at 30% fragmentation, the DefragmenterTask would still sleep for about 9.91s between runs, and any other adjustment to the sleep time would have to come from the integral term (accumulated over time).
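To put the proportional response in numbers, the sketch below tabulates the sleep time produced by the P term alone at a few fragmentation levels. It assumes the defaults quoted above (P = 0.3, set point 7%, sleep range 0.6s to 10s), ignores the integral term entirely, and uses a hypothetical helper name. Because it subtracts the set point from the input, its figures are slightly smaller than the rounded 0.09s estimate.

```python
P = 0.3           # proportional factor
SET_POINT = 0.07  # target scored fragmentation (7%)
MIN_SLEEP = 0.6   # seconds
MAX_SLEEP = 10.0  # seconds


def proportional_only_sleep(fragmentation: float) -> float:
    """Sleep time if only the proportional term contributed (hypothetical helper)."""
    error = fragmentation - SET_POINT
    reduction = P * error
    return min(MAX_SLEEP, max(MIN_SLEEP, MAX_SLEEP - reduction))


for frag in (0.07, 0.30, 0.50, 1.00):
    print(f"{frag:.0%} fragmentation -> sleep {proportional_only_sleep(frag):.3f}s")
```

Even at 100% fragmentation the proportional term reduces the sleep by only 0.3 x 0.93 = 0.279s, so with these values the immediate response to a sudden spike stays close to the maximum sleep time.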
This might not be the ideal behaviour in cases of bulk expiration/deletion workloads where we get a sudden increase in memory fragmentation and we might want a more immediate, proportional, response.
Issue: Workloads involving bulk data ingestion, or many TTLs (Time-To-Live values) expiring at the same time, caused a sudden increase in memory fragmentation.
Resolution: The defragmenter now runs more frequently to better cope with sudden increases in fragmentation.