Couchbase Server / MB-7115

Rebalance failed repeatedly while rebalancing in 5 nodes and rebalancing out 3 nodes on a 5-node cluster, possibly because of "Unable to listen" on one of the nodes being rebalanced out.


Details

    Description

      Scenario:

      • 10 node cluster with build 1942
      • Rebalance out 5 nodes (completed successfully)
      • Cluster right now: 5 nodes
      • Add 5 nodes (with build 1944) and remove 3 nodes.
      • Hit rebalance.
      • Rebalance failed with reason:

      Rebalance exited with reason {badmatch,
      [{<0.26283.119>,
      badmatch,{error,emfile,
      [{ns_replicas_builder_utils, kill_a_bunch_of_tap_names,3},
      {misc,try_with_maybe_ignorant_after,2}, {gen_server,terminate,6}, {proc_lib,init_p_do_apply,3}]}}]}

      • Retried the rebalance, but it failed repeatedly:

      Rebalance exited with reason {{{badmatch,[{<18058.14511.0>,noproc}]},
      [{misc,sync_shutdown_many_i_am_trapping_exits, 1},{misc,try_with_maybe_ignorant_after,2},
      {gen_server,terminate,6},
      {proc_lib,init_p_do_apply,3}]},
      {gen_server,call,
      [<0.11023.120>,
      {shutdown_replicator, 'ns_1@ec2-54-251-5-97.ap-southeast-1.compute.amazonaws.com'},
      infinity]}}

      Will shortly upload logs from one of the nodes that was in the cluster at the time of the rebalance failures.
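
      The `emfile` in the first failure reason is the POSIX EMFILE error ("too many open files"), which means some process hit its file descriptor limit. A minimal sketch, assuming a Linux node with /proc mounted, of how one might compare a process's open descriptors against its soft limit. `fd_usage` is a hypothetical helper, not part of Couchbase, and the pid of the Erlang VM on the affected node would have to be looked up separately.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit) for a process, read from Linux /proc.

    pid defaults to the current process; pass the pid of the process under
    suspicion (hypothetically, the Erlang VM) to inspect it instead.
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: name (3 words), soft limit, hard limit, units.
                soft_limit = line.split()[3]  # a number or "unlimited"
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"open descriptors: {used}, soft limit: {limit}")
```

      If the count sits close to the limit on the node that reports `emfile`, that would be consistent with descriptor exhaustion, but it does not by itself explain the later `noproc` failures.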

      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

      Noticed this on one of the nodes being rebalanced out:
      Unable to listen on 'ns_1@ec2-122-248-217-156.ap-southeast-1.compute.amazonaws.com'.

      So I failed over that node and retried the rebalance, but it still failed.

      I then added that node back and left it out of the rebalance operation, and the rebalance succeeded.
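
      The workaround amounts to keeping the problematic node as a cluster member and simply not ejecting it. A minimal sketch of that selection, assuming the rebalance is requested with form fields named `knownNodes` and `ejectedNodes` of the kind Couchbase's REST rebalance endpoint takes (check the documentation for your build). The function name and the short node names in the usage are hypothetical.

```python
def rebalance_params(cluster_nodes, nodes_to_eject, skip_nodes):
    """Build rebalance form fields that leave skip_nodes untouched.

    cluster_nodes:  every node currently known to the cluster
    nodes_to_eject: nodes originally meant to be rebalanced out
    skip_nodes:     nodes to keep in place (e.g. the one that cannot listen)
    """
    skip = set(skip_nodes)
    # Drop the skipped nodes from the eject list; they stay cluster members.
    ejected = [node for node in nodes_to_eject if node not in skip]
    return {
        "knownNodes": ",".join(cluster_nodes),
        "ejectedNodes": ",".join(ejected),
    }


if __name__ == "__main__":
    # Hypothetical short node names, for illustration only.
    params = rebalance_params(
        cluster_nodes=["ns_1@a", "ns_1@b", "ns_1@c"],
        nodes_to_eject=["ns_1@b", "ns_1@c"],
        skip_nodes=["ns_1@c"],
    )
    print(params)
```

      The skipped node still appears in `knownNodes`, which matches what was done here: the node was added back but not involved in the rebalance.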


          People

            abhinav Abhi Dangeti