  1. Couchbase Server
  2. MB-28880

Stop rebalance promptly when a node fails during rebalance

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 5.5.0
    • 5.5.0
    • ns_server
    • Untriaged
    • Yes

    Description

      1. Create a 3-4 node Vulcan cluster with buckets and some data. 
      2. Add a node or remove a node from the cluster
      3. Start Rebalance of the cluster
      4. While the rebalance is in progress, inject a failure into one of the nodes, such as enabling or killing memcached.

      Most of the time, the rebalance is not stopped immediately but stays in a hung state for a while. Since the autofailover feature gives us robust failure detection, we should be able to stop the rebalance almost immediately. In Spock we did stop the rebalance within 2 minutes, but that change has been reverted, so we no longer stop the rebalance promptly.
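
      To put a number on the delay described above, the rebalance status can be polled right after the failure is injected. The following is a minimal sketch, not part of testrunner: the function names are hypothetical, and it assumes the cluster's REST endpoint /pools/default/rebalanceProgress returns a JSON object whose "status" field is "running" while a rebalance is active.

```python
import base64
import json
import time
import urllib.request
from typing import Callable, Optional


def rest_rebalance_status(host: str, user: str, password: str, port: int = 8091) -> str:
    """Fetch the rebalance status string from one node.

    Assumption: GET /pools/default/rebalanceProgress returns JSON with a
    "status" field ("running" during a rebalance, "none" otherwise).
    """
    url = f"http://{host}:{port}/pools/default/rebalanceProgress"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("status", "unknown")


def seconds_until_rebalance_stops(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll get_status until it stops reporting "running".

    Returns the elapsed seconds, or None if the rebalance is still
    running when timeout_s expires (the hung case from this report).
    """
    start = clock()
    while clock() - start < timeout_s:
        if get_status() != "running":
            return clock() - start
        sleep(poll_interval_s)
    return None
```

      Called immediately after the failure injection in step 4, for example as seconds_until_rebalance_stops(lambda: rest_rebalance_status(host, user, password)), a result of None or a value of several minutes would correspond to the hang reported here. The host and credentials are placeholders for a healthy node in the cluster.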

       

      Note that this is causing some of the autofailover tests to fail in Vulcan. 

       

      The issue can be reproduced by running the following testrunner test.

      Create an ini file (examples can be found under the b/resources folder in testrunner), then run the following against the cluster:

      ./testrunner -i <ini file> -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,timeout=5,num_node_failures=1,nodes_in=0,nodes_out=1,failover_action=stop_memcached,nodes_init=4,num_items=50000

       


          People

            ajit.yagaty Ajit Yagaty [X] (Inactive)
            bharath.gp Bharath G P