Couchbase Server / MB-44272

Rebalance exited with reason shun_failed


Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown

    Description

      Adding node 172.23.123.101 to the cluster at 172.23.123.103 fails because the preceding rebalance did not succeed (it exited with reason shun_failed). Subsequent attempts to add the node, made after the rebalance has ended, also fail.

      [2021-02-10 11:24:28,755] - [rest_client:1485] INFO - adding remote node @172.23.123.101:8091 to this cluster @172.23.123.103:8091
      [2021-02-10 11:24:28,773] - [rest_client:1016] ERROR - POST http://172.23.123.103:8091/controller/addNode body: hostname=http%3A%2F%2F172.23.123.101%3A8091&user=Administrator&password=password headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 400 reason: unknown b'["Node addition is disallowed while rebalance is in progress"]' auth: Administrator:password
       
       
      [2021-02-10 11:24:29,236] - [rest_client:3732] INFO - Latest logs from UI on 172.23.123.103:
      [2021-02-10 11:24:29,236] - [rest_client:3733] ERROR - {'node': 'ns_1@172.23.123.103', 'type': 'info', 'code': 5, 'module': 'ns_cluster', 'tstamp': 1612985068772, 'shortText': 'message', 'text': 'Failed to add node 172.23.123.101:8091 to cluster. Node addition is disallowed while rebalance is in progress', 'serverTime': '2021-02-10T11:24:28.772Z'}
      [2021-02-10 11:24:29,236] - [rest_client:3733] ERROR - {'node': 'ns_1@172.23.123.103', 'type': 'info', 'code': 0, 'module': 'mb_master', 'tstamp': 1612985035933, 'shortText': 'message', 'text': "I'm the only node, so I'm the master.", 'serverTime': '2021-02-10T11:23:55.933Z'}
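
To reproduce outside the test framework, the failing request can be rebuilt from the rest_client log line above. This is a minimal sketch using only the Python 3 standard library. The helper names `build_add_node_request` and `rebalance_blocks_add` are hypothetical and are not part of the test framework; the endpoint, form fields, and headers are copied from the logged request.

```python
import base64
import urllib.parse
import urllib.request


def build_add_node_request(cluster, new_node, user, password):
    """Build (but do not send) the POST /controller/addNode request.

    cluster and new_node are "host:port" strings, as in the log excerpt.
    """
    # Form body with the same fields as the logged request.
    body = urllib.parse.urlencode({
        "hostname": f"http://{new_node}",
        "user": user,
        "password": password,
    }).encode("ascii")
    # HTTP Basic auth token for the cluster administrator.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"http://{cluster}/controller/addNode",
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {token}",
            "Accept": "*/*",
        },
        method="POST",
    )


def rebalance_blocks_add(status, body):
    """Return True if a response matches the failure seen in this report."""
    return status == 400 and b"rebalance is in progress" in body
```

Passing the request to `urllib.request.urlopen` would send it; a 400 response raises `urllib.error.HTTPError`, whose `code` and `read()` can be passed to `rebalance_blocks_add` to tell this failure apart from other addNode errors.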
      

      ns_server.error.log:

      [ns_server:error,2021-02-10T11:15:00.485-08:00,ns_1@172.23.123.103:<0.2168.7>:ns_cluster:shun:608]Shun failed with {exit,
                           {timeout,
                               {gen_server,call,
                                   [{via,leader_registry,chronicle_master},
                                    {remove_peer,'ns_1@172.23.123.101'}]}},
                           [{gen_server,call,2,[{file,"gen_server.erl"},{line,215}]},
                            {ns_cluster,shun,1,
                                [{file,"src/ns_cluster.erl"},{line,603}]},
                            {lists,foreach,2,[{file,"lists.erl"},{line,1338}]},
                            {ns_rebalancer,rebalance_body,5,
                                [{file,"src/ns_rebalancer.erl"},{line,562}]},
                            {async,'-async_init/4-fun-1-',3,
                                [{file,"src/async.erl"},{line,197}]}]}
      [user:error,2021-02-10T11:15:00.487-08:00,ns_1@172.23.123.103:<0.5370.0>:ns_orchestrator:log_rebalance_completion:1406]Rebalance exited with reason shun_failed.
      Rebalance Operation Id = 29c37a9c3b598d8b6073655c5aa1eb3f
      [ns_server:error,2021-02-10T11:15:23.467-08:00,ns_1@172.23.123.103:<0.3990.7>:ns_rebalance_observer:generic_get_call:114]Unexpected exception {exit,
                               {noproc,
                                   {gen_server,call,
                                       [{via,leader_registry,ns_rebalance_observer},
                                        get_aggregated_progress,10000]}}}
      [ns_server:error,2021-02-10T11:15:23.467-08:00,ns_1@172.23.123.103:<0.3990.7>:rebalance:progress:153]Couldn't reach ns_rebalance_observer
      

      The logs for the first failure end with 1124; those for the second end with 1130.


            People

              artem Artem Stemkovski
              pavithra.mahamani Pavithra Mahamani (Inactive)