Details
Bug
Resolution: Fixed
Critical
7.1.0, 7.1.1
Untriaged
1
Yes
Analytics Sprint 3
Description
Reproduction scenario:
1. Start with a single-node cluster running the Data and Analytics services.
2. Rebalance in a second node with the Analytics service, and cancel the rebalance before it completes.
3. Fail over the node added in step 2.
4. Attempt to rebalance the cluster to remove the node failed over in step 3.
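For anyone scripting the reproduction, the steps above can be sketched as an ordered list of cluster manager REST calls. This is only a sketch under assumptions: the host names, credentials, and the choice to give the second node the Analytics service alone (`cbas`) are hypothetical, and the endpoint and parameter names should be checked against the REST documentation for the version under test. The script only builds and prints the calls; it does not send them, because the stop in step 2 has to be timed by hand so that it lands before the rebalance completes.

```python
from urllib.parse import urlencode

# Hypothetical host names and credentials, for illustration only.
NODE1 = "node1.example.com"
NODE2 = "node2.example.com"


def otp(host):
    """Cluster-internal node name, assuming the usual ns_1@<host> form."""
    return f"ns_1@{host}"


def repro_plan(node1=NODE1, node2=NODE2, user="Administrator", password="password"):
    """Return the ordered (method, path, form) calls for steps 2 to 4."""
    both = ",".join([otp(node1), otp(node2)])
    return [
        # Step 2: add the second node with the Analytics service (cbas) ...
        ("POST", "/controller/addNode",
         {"hostname": node2, "user": user, "password": password, "services": "cbas"}),
        # ... start the rebalance, then stop it before it completes.
        ("POST", "/controller/rebalance", {"knownNodes": both, "ejectedNodes": ""}),
        ("POST", "/controller/stopRebalance", {}),
        # Step 3: fail over the newly added node.
        ("POST", "/controller/failOver", {"otpNode": otp(node2)}),
        # Step 4: rebalance again, ejecting the failed-over node.
        ("POST", "/controller/rebalance",
         {"knownNodes": both, "ejectedNodes": otp(node2)}),
    ]


# Print each call as it would be sent to node 1 (default admin port assumed).
for method, path, form in repro_plan():
    print(method, f"http://{NODE1}:8091{path}", urlencode(form))
```

On an affected build, the final rebalance call is the one expected to fail.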
Result:
The rebalance fails with the error message below, and the Analytics service remains unusable:
timed out waiting for all nodes to join & cluster active
This is a regression from 7.0.0.
The only possible workaround is to manually update the partition topology in metakv, based on the cluster's bad state, using an undocumented API.