Description
vbuckets can go unreplicated for a long time if a replication connection between nodes goes down during rebalance.
Replications are set up by the janitor, which does not run during rebalance. If a replication (say node1->node2) goes down, the streams for the vbuckets it carries go down as well and, with the janitor not running, they are not set up again.
Rebalance fails only when a replication that we monitor at a given stage of rebalance is down. However, in certain cases, as with the linked CBSE, we may never monitor the downed replication, so rebalance may not fail for an extended period of time while those vbuckets stay unreplicated.
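
To make the gap concrete, here is a minimal sketch in Python of the kind of check that would surface such vbuckets. It is not the ns_server implementation. The names `expected_replications`, `missing_replications`, and the `actual_streams` structure are hypothetical, and the sketch assumes each vbucket chain is listed as `[active, replica, ...]` with every replica streaming directly from the active node.

```python
from collections import defaultdict


def expected_replications(vbucket_map):
    """Derive {(source, destination): {vbucket ids}} from the vbucket map.

    Assumption: vbucket_map[i] is the chain for vbucket i, written as
    [active, replica1, ...], and each replica streams from the active node.
    """
    expected = defaultdict(set)
    for vbucket, chain in enumerate(vbucket_map):
        active, replicas = chain[0], chain[1:]
        if active is None:
            continue  # no active copy, so nothing to replicate from
        for replica in replicas:
            if replica is not None:
                expected[(active, replica)].add(vbucket)
    return dict(expected)


def missing_replications(vbucket_map, actual_streams):
    """Return the expected streams that are absent from actual_streams.

    actual_streams is a hypothetical snapshot with the same shape as the
    result of expected_replications().
    """
    missing = {}
    for pair, vbuckets in expected_replications(vbucket_map).items():
        absent = vbuckets - actual_streams.get(pair, set())
        if absent:
            missing[pair] = absent
    return missing


# node1->node2 is down: vbuckets 0 and 1 have no replica stream.
vbucket_map = [["node1", "node2"], ["node1", "node2"], ["node2", "node1"]]
actual_streams = {("node2", "node1"): {2}}
print(missing_replications(vbucket_map, actual_streams))
```

In this example the check reports vbuckets 0 and 1 as missing on node1->node2, regardless of whether rebalance happens to be monitoring that particular replication at the time.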
Issue Links
- duplicates: MB-49806 memcached restart during rebalance can result in some replications being missing for a long time (was Rebalance exited with reason bad_replicas) (Open)