Details
-
Bug
-
Resolution: Unresolved
-
Critical
-
6.0.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.5.1, 6.6.0, 6.6.1, 6.6.2, 6.5.2, 6.5.0, 6.6.3, 6.6.4, 6.6.5, 7.0.0, 7.0.1, 7.0.2, 7.0.3
-
None
-
Untriaged
-
1
-
No
Description
Problem
Over the years, a number of users have misconfigured compaction, which has caused outages.
Common cases are:
- Compaction schedule is set for too short a period
- Compaction threshold is too high
- Metadata purge is set too low (causes issues with backup, XDCR and indexing)
For 99% of cases the defaults are pretty good, so I don't know why people are changing these settings.
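To make the common cases above concrete, here is a minimal sketch of the kind of check that could flag them. The `CompactionSettings` type, the `compaction_warnings` function and all three limits are hypothetical and chosen only for illustration; they are not Couchbase defaults, limits or APIs.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative limits only (assumptions, not Couchbase defaults or hard limits).
MIN_WINDOW_MINUTES = 120
MAX_FRAGMENTATION_PERCENT = 60
MIN_PURGE_INTERVAL_DAYS = 1.0


@dataclass
class CompactionSettings:
    """Hypothetical container for the three settings named in this ticket."""
    window_minutes: Optional[int]   # None means compaction may run at any time
    fragmentation_percent: int      # fragmentation level that triggers compaction
    purge_interval_days: float      # metadata purge interval


def compaction_warnings(settings: CompactionSettings) -> List[str]:
    """Return one warning per setting that matches a common misconfiguration."""
    warnings = []
    if (settings.window_minutes is not None
            and settings.window_minutes < MIN_WINDOW_MINUTES):
        warnings.append("compaction schedule window may be too short to finish")
    if settings.fragmentation_percent > MAX_FRAGMENTATION_PERCENT:
        warnings.append("fragmentation threshold is high; disk usage may grow")
    if settings.purge_interval_days < MIN_PURGE_INTERVAL_DAYS:
        warnings.append("metadata purge interval is low; "
                        "backup, XDCR and indexing may be affected")
    return warnings
```

A UI or REST layer could call something like `compaction_warnings` before saving and show the messages next to the fields that triggered them.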
Notes
I do feel that scheduled compaction is a hack: it says the cluster is not sized suitably for the workload and has to catch up during off-peak times. I would also add that when the compaction schedule was added, the impact compaction had on the cluster was much bigger.
Suggestions
There are a few different options here:
- In the UI throw a warning and have explanations on what the settings do and how misconfiguration can cause outages
- Hide or move some of the compaction settings so they're not front and centre
  - Compaction schedule and Metadata purge are the two I would move.
- Treat the settings as recommendations rather than hard rules: the system would still run compaction if disk space gets too low
- For metadata purge, it would be nice if the system could handle this automatically and only purge when it really needs to.
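The suggestion to treat the settings as recommendations could look roughly like the sketch below. It is not how the server decides today; the `should_compact` function, its parameters and the 15% free-disk floor are all assumptions made up for illustration.

```python
def should_compact(fragmentation_percent: float,
                   threshold_percent: float,
                   in_schedule_window: bool,
                   free_disk_percent: float,
                   low_disk_percent: float = 15.0) -> bool:
    """Decide whether to compact, treating user settings as soft preferences.

    low_disk_percent is a hypothetical safety floor, not a real server setting.
    """
    # Safety override: if free disk is running out and there is anything to
    # reclaim, compact regardless of the configured schedule and threshold.
    if free_disk_percent < low_disk_percent and fragmentation_percent > 0:
        return True
    # Otherwise honour the configured window and fragmentation threshold.
    return in_schedule_window and fragmentation_percent >= threshold_percent
```

With this shape, a schedule that is too narrow or a threshold that is too high delays compaction but cannot stop it when the disk is nearly full.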