Details

Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: 5.5.0
Affects Version/s: 5.5.0
Component/s: ns_server
Labels:
- rebalance

Triage: Untriaged
Is this a Regression?: Yes

Description

Create a 3-4 node Vulcan cluster with buckets and some data.
Add a node to the cluster or remove a node from it.
Start a rebalance of the cluster.
While the rebalance is in progress, inject a failure into one of the nodes, like enabling or killing memcached.

Most of the time, the rebalance is not stopped immediately but goes into a hung state for a while. Since we have robust failure detection as a result of the autofailover feature, we should be able to stop the rebalance almost immediately. In Spock we did stop the rebalance within 2 minutes, but that change has been reverted, so we no longer stop the rebalance promptly.
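The expected behaviour above can be expressed as a small timing check. The following is a minimal sketch, not testrunner code: `get_status` is a hypothetical callable that returns the current rebalance status however the cluster exposes it, the status string `"running"` is an assumption, and the 120-second default simply mirrors the 2-minute Spock behaviour mentioned above.

```python
import time
from typing import Callable, Optional


def seconds_until_rebalance_stops(
    get_status: Callable[[], str],
    deadline_s: float = 120.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll get_status() until it no longer reports "running".

    Returns the elapsed seconds when the rebalance stops, or None if it is
    still running once deadline_s has passed (the hung case in this report).
    get_status and the "running" status string are assumptions of this sketch.
    """
    start = clock()
    while True:
        elapsed = clock() - start
        if get_status() != "running":
            return elapsed
        if elapsed >= deadline_s:
            return None
        sleep(poll_interval_s)
```

Called right after the failure is injected, a `None` result would indicate the hang described here, while a small number would indicate the rebalance was stopped promptly. The `clock` and `sleep` parameters exist only so the check can be exercised without real waiting.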

Note that this is causing some of the autofailover tests to fail in Vulcan.

The issue can be reproduced by running the following testrunner test.

Create an ini file; examples can be found under the b/resources folder in testrunner. Then run the following against the cluster:

./testrunner -i <ini file> -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,timeout=5,num_node_failures=1,nodes_in=0,nodes_out=1,failover_action=stop_memcached,nodes_init=4,num_items=50000
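For repeated runs with different parameters, the command line above can be assembled programmatically. This is only an illustrative sketch: `build_testrunner_command` is a hypothetical helper, not part of testrunner, and `cluster.ini` is a placeholder file name. It relies only on the `-i` and `-t` options and the comma-separated `key=value` form shown in the command above.

```python
from typing import Dict, List


def build_testrunner_command(ini_path: str, test: str, params: Dict[str, object]) -> List[str]:
    """Build the argument list for the testrunner invocation shown above.

    Hypothetical helper: the test name and its parameters are joined into
    one comma-separated string passed to -t, in dict insertion order.
    """
    test_spec = ",".join([test] + [f"{key}={value}" for key, value in params.items()])
    return ["./testrunner", "-i", ini_path, "-t", test_spec]


cmd = build_testrunner_command(
    "cluster.ini",  # placeholder ini file name, not from the report
    "failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance",
    {
        "timeout": 5,
        "num_node_failures": 1,
        "nodes_in": 0,
        "nodes_out": 1,
        "failover_action": "stop_memcached",
        "nodes_init": 4,
        "num_items": 50000,
    },
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` from the testrunner checkout directory.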

People

Assignee: Ajit Yagaty [X] (Inactive)

Reporter: Bharath G P

Votes: 0

Watchers: 3

Dates

Created: 23/Mar/18 11:36 AM

Updated: 28/Mar/18 2:40 AM

Resolved: 25/Mar/18 9:04 PM

Stop rebalance soon if there are node failure during rebalance
