Details
-
Bug
-
Resolution: Fixed
-
Critical
-
3.0
-
Security Level: Public
-
None
-
1:10.3.5.115
2:10.3.5.116
3:10.3.5.117
4:10.3.5.118
5:10.6.2.185
6:10.6.2.186
7:10.5.3.5
-
Untriaged
-
Centos 64-bit
-
Yes
Description
Build 3.0.0-1139. Occurred on both CentOS 6x and Ubuntu 12.04. We did not see this issue in build 1105 (Beta Refresh).
Test Case: ./testrunner -i centos.ini -t swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance,replica=1,num-buckets=4,num-swap=2,swap-orchestrator=True,percentage_progress=30,skip_cleanup=True,GROUP=P0
1. Create a 3-node cluster
2. Create 4 default buckets with 1000k items
3. Rebalance in 2 nodes and rebalance out 2 nodes, with mutations running in parallel
4. During the rebalance in step 3, kill memcached on one of the nodes to force the rebalance to exit
5. After a while, restart the rebalance as in step 3
After step 4, we saw a "Bad replicators" message in the logs, which caused the rebalance in step 5 to exit.
MESSAGE FROM LOG
Bad replicators after rebalance:
Missing = [
Extras = []
Event Module Code Server Node Time
Rebalance exited with reason bad_replicas
(repeated 1 times) ns_orchestrator002 ns_1@10.3.5.115 22:00:27 - Tue Aug 12, 2014
Bad replicators after rebalance:
Missing = [{'ns_1@10.6.2.185','ns_1@10.3.5.118',6}
]
Extras = [] ns_rebalancer002 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
Bucket "bucket-3" rebalance appears to be swap rebalance ns_vbucket_mover000 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
Started rebalancing bucket bucket-3 ns_rebalancer000 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
Starting rebalance, KeepNodes = ['ns_1@10.3.5.118','ns_1@10.3.5.116',
'ns_1@10.6.2.185'], EjectNodes = ['ns_1@10.3.5.115',
'ns_1@10.3.5.117'], Failed over and being ejected nodes = []; no delta recovery nodes
ns_orchestrator004 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
Control connection to memcached on 'ns_1@10.3.5.118' disconnected: {badmatch,
{error,
closed}} (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
Bucket "bucket-3" loaded on node 'ns_1@10.3.5.118' in 0 seconds. (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
Bucket "bucket-2" loaded on node 'ns_1@10.3.5.118' in 0 seconds. (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
Shutting down bucket "bucket-2" on 'ns_1@10.3.5.115' for deletion ns_memcached000 ns_1@10.3.5.115 21:59:34 - Tue Aug 12, 2014
Shutting down bucket "bucket-2" on 'ns_1@10.3.5.117' for deletion ns_memcached000 ns_1@10.3.5.117 21:59:34 - Tue Aug 12, 2014
Rebalance exited with reason bad_replicas
ns_orchestrator002 ns_1@10.3.5.115 21:59:34 - Tue Aug 12, 2014
Bad replicators after rebalance:
Missing = [
,
{'ns_1@10.3.5.116','ns_1@10.3.5.118',24},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',25},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',30},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',35},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',36},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',37},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',44},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',45},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',46},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',47},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',48},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',50},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',51},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',52},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',53},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',54},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',55},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',113},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',114},
{'ns_1@10.3.5.116','ns_1@10.3.5.118',115},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',41},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',42},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',43},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',49},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',67},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',68},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',69},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',70},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',71},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',101},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',102},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',103},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',106},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',107},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',108},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',116},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',117},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',118},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',122},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',123},
{'ns_1@10.3.5.118','ns_1@10.3.5.116',124},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',38},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',39},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',40},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',62},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',63},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',64},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',65},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',66},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',72},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',73},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',80},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',81},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',82},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',89},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',90},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',91},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',98},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',99},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',100},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',104},
{'ns_1@10.3.5.118','ns_1@10.6.2.185',105},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',4},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',5},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',6},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',9},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',10},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',11},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',12},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',13},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',14},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',15},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',16},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',17},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',18},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',26},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',27},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',28},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',29},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',31},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',110},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',111},
{'ns_1@10.6.2.185','ns_1@10.3.5.118',112}]
Extras = []
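
The Missing list above is long enough that it is easier to read when grouped by node pair. Below is a minimal Python sketch that does this for pasted log text. It assumes each entry has the shape {'node','node',N} as shown in the log. Reading the two nodes as replication source and destination, and N as a vBucket id, is an assumption rather than something this ticket states. The function name group_missing_replicators is hypothetical and is not part of testrunner or ns_server.

```python
import re
from collections import defaultdict

# Matches one entry of the form {'ns_1@A','ns_1@B',N} as printed in the log.
TUPLE_RE = re.compile(r"\{'([^']+)','([^']+)',(\d+)\}")


def group_missing_replicators(log_text):
    """Group entries by (first node, second node).

    Returns a dict mapping each node pair to a sorted list of the
    integer ids seen for that pair (assumed to be vBucket ids).
    """
    grouped = defaultdict(set)
    for first, second, number in TUPLE_RE.findall(log_text):
        grouped[(first, second)].add(int(number))
    return {pair: sorted(ids) for pair, ids in grouped.items()}


if __name__ == "__main__":
    # A few entries copied from the log in this ticket.
    sample = """
    {'ns_1@10.3.5.116','ns_1@10.3.5.118',24},
    {'ns_1@10.3.5.116','ns_1@10.3.5.118',25},
    {'ns_1@10.6.2.185','ns_1@10.3.5.118',6}]
    """
    for (first, second), ids in sorted(group_missing_replicators(sample).items()):
        print(f"{first} -> {second}: {len(ids)} entries {ids}")
```

Running the same function over the full Missing list shows at a glance which node pairs are affected and how many entries each pair has.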