Details
Epic
Resolution: Fixed
Major
5.1.0
KV: Robust Rebalance
Description
Use Case:
Rebalance is a key feature of Couchbase that is used for online upgrades, node failover and high availability, and load balancing.
Rebalance involves significant data movement by many Couchbase services, especially KV and GSI. It is also used in High Data Density environments (>10TB of data per node).
Concerns and Corresponding Improvements:
- Robustness: Rebalance is sensitive to failures and requires manual intervention to restart. Reduce failures and restart rebalance automatically wherever feasible.
- Speed: Rebalance is currently slow, especially in High Data Density environments with >10TB of data per node.
- See also the Byte-Based Backfill investigations.
- Memory: The memory required by rebalance can be significant and increases with the amount of data per node. This needs to be fixed.
- Impact: The impact of rebalance on front-end operations needs to be contained (including maintenance of the working set).
- Visibility: Provide holistic rebalance status for entire cluster (across all Services)
- Cluster Health: Rebalance should prioritize replica maintenance over load balancing.
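
Illustration for the Robustness item: a minimal sketch of what "restart rebalance automatically" could look like, assuming a retry loop with exponential backoff. This is not Couchbase code. `RebalanceError`, `rebalance_with_retry`, and the `start_rebalance` callable are hypothetical names introduced only for this example, and the retry limits are arbitrary.

```python
import time
from typing import Callable


class RebalanceError(Exception):
    """Hypothetical error type standing in for a failed rebalance attempt."""


def rebalance_with_retry(
    start_rebalance: Callable[[], None],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call start_rebalance, retrying on RebalanceError.

    Waits base_delay_s, then 2x, 4x, ... between attempts.
    Returns the number of attempts used; re-raises the last error
    once max_attempts is exhausted so an operator can intervene.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            start_rebalance()
            return attempt
        except RebalanceError:
            if attempt == max_attempts:
                raise  # give up: manual intervention is still the fallback
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be checked without waiting. A real implementation would also need to decide which failures are safe to retry, which this sketch does not attempt.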