Couchbase Server / MB-50690

Rebalance is reported as completed successfully even when analytics reports unsuccessful rebalance

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not a Bug
    • Affects Version/s: 7.1.0
    • Fix Version/s: 7.1.0
    • Component/s: analytics, ns_server
    • Environment: Enterprise Edition 7.1.0 build 2143

    Description

      Steps to reproduce:

      1. Create a 4-node cluster with 2 cbas nodes and 2 KV nodes.
      2. Set the cbas replica count to 3.
      3. Create cbas infrastructure such as datasets, dataverses and indexes.
      4. The actual number of replicas will be 1, as there are only 2 cbas nodes.
      5. Rebalance in 2 more cbas nodes.
      6. While the rebalance is in progress, stop Couchbase Server on one of the existing cbas nodes.
      7. The rebalance fails as expected. Verify that no data was lost on the cbas side and that the actual replica count is still 1.
      8. Start the Couchbase Server that was stopped in step 6.
      9. Rebalance again.

      Observation:

      The logs in the Web UI show that analytics reported a rebalance failure, while ns_server reported that the same rebalance completed successfully.

      Hot-reloaded memcached.json for config change of the following keys: [<<"scramsha_fallback_salt">>] (repeated 1 times, last seen 58.067673 secs ago)
      memcached_config_mgr 000 ns_1@172.23.104.217 9:34:06 PM 30 Jan, 2022

      Rebalance completed successfully.
      Rebalance Operation Id = f9728f5f1e58640e26f6d09702154175
      ns_orchestrator 000 ns_1@172.23.104.161 9:33:52 PM 30 Jan, 2022

      Analytics Service unable to successfully rebalance 929d7e5eeb4809f400ba94c836dcc0a4 due to 'HYR0003: Failure on node 08e19c5052c5c4d8f1b64ab038e93fb7'; see analytics_info.log for details
      analytics 000 ns_1@172.23.104.179 9:33:51 PM 30 Jan, 2022

      Bucket "EExRjRyxSYOul-4-870000" rebalance appears to be swap rebalance
      ns_vbucket_mover 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022

      Started rebalancing bucket EExRjRyxSYOul-4-870000
      ns_rebalancer 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022

      Starting rebalance, KeepNodes = ['ns_1@172.23.104.217','ns_1@172.23.104.163',
      'ns_1@172.23.104.201','ns_1@172.23.104.202',
      'ns_1@172.23.104.161','ns_1@172.23.104.179'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = f9728f5f1e58640e26f6d09702154175
      ns_orchestrator 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022
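A test harness could flag this situation by scanning the collected log messages for both outcomes at once. The following is only a sketch: the function name `masked_analytics_failure` is hypothetical, and it assumes the messages have already been extracted as plain strings (for example, copied from the Web UI log view). The two marker strings are taken from the log excerpt above.

```python
# Hypothetical helper, not part of any Couchbase tool or test framework.
# Marker strings are copied from the log excerpt in this issue.
NS_SERVER_SUCCESS = "Rebalance completed successfully."
ANALYTICS_FAILURE = "Analytics Service unable to successfully rebalance"


def masked_analytics_failure(messages):
    """Return True if ns_server logged success while analytics logged a failure."""
    saw_success = any(NS_SERVER_SUCCESS in m for m in messages)
    saw_failure = any(ANALYTICS_FAILURE in m for m in messages)
    return saw_success and saw_failure


if __name__ == "__main__":
    sample = [
        "Rebalance completed successfully.",
        "Analytics Service unable to successfully rebalance "
        "929d7e5eeb4809f400ba94c836dcc0a4 due to 'HYR0003: Failure on node "
        "08e19c5052c5c4d8f1b64ab038e93fb7'; see analytics_info.log for details",
    ]
    print(masked_analytics_failure(sample))
```

The check does not correlate the two messages by operation id or timestamp, so it is only meaningful when the input covers a single rebalance.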

          Activity

            michael.blow Michael Blow added a comment -

            This is working as designed. Due to complaints in MB-30766 that Analytics was failing a rebalance not involving any additions or removals of Analytics nodes, in 6.0.0 a change was made to not report rebalance failures to ns_server should the Analytics topology not be changing. In the rebalance in question, the KeepNodes is not changing, hence we do not report the rebalance as failed to ns_server.
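The rule described in this comment can be sketched as follows. This is an illustration of the stated behavior only, not the actual Analytics code: the function names, the "success"/"failure" strings, and the node names are hypothetical, and comparing node sets is an assumed way of deciding whether the Analytics topology is changing.

```python
# Illustrative sketch of the reporting rule described in the comment.
# All names here are hypothetical; this is not Couchbase source code.

def analytics_topology_changing(current_nodes, keep_nodes):
    """Assume the topology is changing when the set of Analytics nodes differs."""
    return set(current_nodes) != set(keep_nodes)


def status_reported_to_ns_server(rebalance_ok, current_nodes, keep_nodes):
    """Return the outcome that Analytics would report to ns_server."""
    if rebalance_ok:
        return "success"
    if analytics_topology_changing(current_nodes, keep_nodes):
        # Nodes are being added or removed: the failure is surfaced.
        return "failure"
    # Topology unchanged: the failure is logged by Analytics only,
    # and ns_server is not told that the rebalance failed.
    return "success"


if __name__ == "__main__":
    nodes = ["cbas-1", "cbas-2", "cbas-3", "cbas-4"]
    # Failed Analytics rebalance with an unchanged node set, as in this issue.
    print(status_reported_to_ns_server(False, nodes, nodes))
    # Failed Analytics rebalance while adding nodes.
    print(status_reported_to_ns_server(False, nodes[:2], nodes))
```

Under these assumptions, the second rebalance in this issue falls into the last branch, which matches the "Rebalance completed successfully" entry from ns_orchestrator alongside the analytics failure message.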

            umang.agrawal Umang added a comment -

            Closing the issue as the feature is working as intended.


            People

              umang.agrawal Umang
              umang.agrawal Umang