Description
- Create a 3-4 node Vulcan cluster with buckets and some data.
- Add a node to or remove a node from the cluster.
- Start a rebalance of the cluster.
- While the rebalance is in progress, inject a failure into one of the nodes, like enabling or killing memcached.
Most of the time, the rebalance is not stopped immediately but stays in a hung state for a while. Since we have robust failure detection as a result of the autofailover feature, we should be able to stop the rebalance almost immediately. In Spock we did stop the rebalance within 2 minutes, but that change has been reverted, so we no longer stop the rebalance immediately.
Note that this is causing some of the autofailover tests to fail in Vulcan.
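To put a number on "almost immediately" versus "hung for a while", the time between the failure injection and the rebalance actually stopping can be measured with a small polling loop. The following is only a sketch and is not part of testrunner: the function name and the `get_status` callable are hypothetical, and it assumes the status source reports the string "running" while a rebalance is in progress (for example, a thin wrapper around the cluster's rebalance progress REST endpoint).

```python
import time
from typing import Callable, Optional


def seconds_until_rebalance_stops(
    get_status: Callable[[], str],
    timeout: float = 120.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll get_status() until it no longer returns "running".

    Returns the elapsed seconds when the rebalance is seen as stopped,
    or None if it is still reported as running once `timeout` seconds
    have passed (i.e. the rebalance looks hung).

    get_status is a hypothetical hook supplied by the caller; the
    "running" value is an assumption about what it reports.
    """
    start = clock()
    while True:
        elapsed = clock() - start
        if get_status() != "running":
            return elapsed
        if elapsed >= timeout:
            return None
        sleep(poll_interval)
```

Call it right after injecting the failure. With the default 120 second timeout, a `None` result means the rebalance did not stop within the 2 minute window that Spock used to meet.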
The issue can be reproduced by running the following testrunner test.
Create an ini file; examples can be found under the b/resources folder in testrunner. Then run the following against the cluster.
./testrunner -i <ini file> -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,timeout=5,num_node_failures=1,nodes_in=0,nodes_out=1,failover_action=stop_memcached,nodes_init=4,num_items=50000
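For the ini file passed with -i, a minimal sketch of generating one for a 4-node cluster is shown below. The section and key names ([global], [servers], [membase], rest_username and so on) are an assumption about the usual testrunner layout, and the IPs and credentials are placeholders, so check the result against the examples under b/resources before using it. The `build_ini` helper is hypothetical.

```python
def build_ini(node_ips, ssh_user="root", ssh_password="couchbase",
              rest_user="Administrator", rest_password="password",
              port=8091):
    """Return the text of a testrunner-style ini file.

    The layout is assumed, not taken from the testrunner source; all
    default credentials here are placeholders.
    """
    lines = [
        "[global]",
        f"username:{ssh_user}",
        f"password:{ssh_password}",
        f"port:{port}",
        "",
        "[servers]",
    ]
    # One numbered entry per cluster node, starting at 1.
    for index, ip in enumerate(node_ips, start=1):
        lines.append(f"{index}:{ip}")
    lines += [
        "",
        "[membase]",
        f"rest_username:{rest_user}",
        f"rest_password:{rest_password}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder addresses for the nodes_init=4 cluster used by the test.
ini_text = build_ini(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
```

Write `ini_text` to a file and pass its path in place of `<ini file>` in the command above.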