Couchbase Server / MB-53674

Cannot remove failed-over node after rebalance-in cancellation


Details

    • Untriaged
    • 1
    • Yes
    • Analytics Sprint 3

    Description

      Reproduction scenario:
      1. Start with a single-node cluster running the Data and Analytics services.
      2. Rebalance in a second node with the Analytics service, and cancel the rebalance before it completes.
      3. Fail over the node added in step 2.
      4. Attempt to rebalance the cluster to remove the node failed over in step 3.
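
      The scenario above can be sketched as a sequence of cluster-manager REST calls. This is a minimal illustration, not a verified reproduction script: the endpoint paths and form fields are the publicly documented Couchbase cluster REST calls as best recalled, while the host address, OTP node names, credentials, and the helper name build_repro_requests are hypothetical. The sketch only builds the requests. Sending them, and timing the cancellation so it lands before the rebalance completes, is left to the reader.

```python
def build_repro_requests(first_otp, second_host, second_otp, user, password):
    """Return the ordered (method, path, form fields) calls for the scenario.

    All arguments are placeholders supplied by the caller; nothing is sent.
    """
    known = f"{first_otp},{second_otp}"
    return [
        # Step 2: add the second node with only the Analytics service
        # ("cbas" is assumed to be the service name used by the REST API).
        ("POST", "/controller/addNode", {
            "hostname": second_host,
            "user": user,
            "password": password,
            "services": "cbas",
        }),
        # Step 2: start the rebalance-in ...
        ("POST", "/controller/rebalance", {
            "knownNodes": known,
            "ejectedNodes": "",
        }),
        # ... and cancel it before it completes.
        ("POST", "/controller/stopRebalance", {}),
        # Step 3: fail over the newly added node.
        ("POST", "/controller/failOver", {"otpNode": second_otp}),
        # Step 4: rebalance again, ejecting the failed-over node.
        ("POST", "/controller/rebalance", {
            "knownNodes": known,
            "ejectedNodes": second_otp,
        }),
    ]


if __name__ == "__main__":
    # Hypothetical node names and credentials, for illustration only.
    for method, path, form in build_repro_requests(
        "ns_1@10.0.0.1", "10.0.0.2", "ns_1@10.0.0.2", "Administrator", "password"
    ):
        print(method, path, form)
```

      Per the report, the final rebalance call is the one expected to fail with the timeout error.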

      Result:

      The rebalance fails with the error message below, and the Analytics service remains unusable:

      timed out waiting for all nodes to join & cluster active

      This is a regression from 7.0.0.

      The only known workaround is to manually update the partition topology in metakv, based on the cluster's bad state, using an undocumented API.

            People

              umang.agrawal Umang
              murtadha.hubail Murtadha Hubail
              Votes: 0
              Watchers: 5