Details
- Bug
- Resolution: Duplicate
- Major
- Cheshire-Cat
- 7.0.0-2351
- Untriaged
- Centos 64-bit
- 1
- Unknown
Description
Build: 7.0.0-2351
Scenario:
- Four-node cluster, Couchbase bucket (replica=0, min_durability_level=persistToMajority)
- Increase the replica count incrementally, 0 --> 1 --> 2, running doc_loading after the rebalance that follows each replica update
- Reduce the replica count back to zero incrementally, 2 --> 1 --> 0, again running doc_loading after each rebalance
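For anyone reproducing this by hand, the replica walk in the scenario can be sketched roughly as below. This is a minimal Python 3 sketch and not the code the test runs: the helper names are hypothetical, the endpoint paths and the `replicaNumber`, `knownNodes`, and `ejectedNodes` parameters reflect my understanding of the Couchbase Server REST API and should be checked against the documentation for this build, and the sketch only builds the requests. Sending them, authentication, waiting for each rebalance to finish, and doc_loading are all left out.

```python
from urllib.parse import urlencode


def replica_transitions(low=0, high=2):
    """Return (old, new) replica-count pairs for low -> high -> low, one step at a time."""
    up = list(range(low, high + 1))   # e.g. [0, 1, 2]
    path = up + up[-2::-1]            # e.g. [0, 1, 2, 1, 0]
    return list(zip(path, path[1:]))


def bucket_edit_request(bucket, replicas):
    """Build (path, form body) for a bucket edit that sets the replica count.

    Assumption: the bucket edit endpoint accepts 'replicaNumber'; other
    parameters a given server version may require are not shown here.
    """
    return f"/pools/default/buckets/{bucket}", urlencode({"replicaNumber": replicas})


def rebalance_request(nodes):
    """Build (path, form body) for a rebalance that keeps every node and ejects none."""
    body = urlencode({"knownNodes": ",".join(nodes), "ejectedNodes": ""})
    return "/controller/rebalance", body


if __name__ == "__main__":
    nodes = [
        "ns_1@172.23.123.124",
        "ns_1@172.23.123.125",
        "ns_1@172.23.123.119",
        "ns_1@172.23.123.121",
    ]
    for old, new in replica_transitions():
        print(f"replica {old} --> {new}")
        print("  ", bucket_edit_request("default", new))
        print("  ", rebalance_request(nodes))
```

The failure in this ticket corresponds to the last pair the helper returns, (1, 0).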
Observation:
While reducing the replica count from 1 --> 0 in step 3, the rebalance failed with the error below:
'errorMessage': 'Rebalance failed. See logs for detailed reason. You can try again.', 'status': 'none' - rebalance failed
Latest logs from UI on 172.23.123.125:
'code': 0, 'module': 'ns_orchestrator', 'type': 'critical', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561059861L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:19.861Z', 'text': 'Rebalance exited with reason {mover_crashed,
{unexpected_exit,
{\'EXIT\',<0.11370.7>,
{{{{{child_interrupted,
{\'EXIT\',<0.21116.1>,socket_closed}},
[{dcp_replicator,spawn_and_wait,1,
[{file,"src/dcp_replicator.erl"},
{line,266}]},
{dcp_replicator,handle_call,3,
[{file,"src/dcp_replicator.erl"},
{line,121}]},
{gen_server,try_handle_call,4,
[{file,"gen_server.erl"},{line,636}]},
{gen_server,handle_msg,6,
[{file,"gen_server.erl"},{line,665}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]},
{gen_server,call,
[<0.21071.1>,
{setup_replication,
[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,
19,20,21,22,23,24,25,26,27,28,29,30,31,
32,33,34,35,36,37,38,39,40,41,42,43,44,
45,46,47,48,49,50,51,52,53,54,55,56,57,
58,59,60,61,62,63,64,65,66,67,68,69,70,
71,72,73,74,75,76,77,78,79,80,81,82,83,
84]},
infinity]}},
{gen_server,call,
[\'replication_manager-default\',
{change_vbucket_replication,682,undefined},
infinity]}},
{gen_server,call,
[{\'janitor_agent-default\',
\'ns_1@172.23.123.125\'},
{if_rebalance,<0.2074.7>,
{update_vbucket_state,849,active,
undefined,undefined,
[[\'ns_1@172.23.123.125\',
\'ns_1@172.23.123.121\'],
[\'ns_1@172.23.123.125\']]}},
infinity]}}}}}.
Rebalance Operation Id = 1a5386077bf37ecdd570807354619b6e'}
'code': 0, 'module': 'ns_vbucket_mover', 'type': 'critical', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561059856L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:19.856Z', 'text': 'Worker <0.11303.7> (for action {move,{849,
[\'ns_1@172.23.123.125\',
\'ns_1@172.23.123.121\'],
[\'ns_1@172.23.123.125\'],
[]}}) exited with reason {unexpected_exit,
{\'EXIT\',<0.11370.7>,
{{{{{child_interrupted,
{\'EXIT\',<0.21116.1>,socket_closed}},
[{dcp_replicator,spawn_and_wait,1,
[{file,"src/dcp_replicator.erl"},
{line,266}]},
{dcp_replicator,handle_call,3,
[{file,"src/dcp_replicator.erl"},
{line,121}]},
{gen_server,try_handle_call,4,
[{file,"gen_server.erl"},{line,636}]},
{gen_server,handle_msg,6,
[{file,"gen_server.erl"},{line,665}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]},
{gen_server,call,
[<0.21071.1>,
{setup_replication,
[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,
19,20,21,22,23,24,25,26,27,28,29,30,31,
32,33,34,35,36,37,38,39,40,41,42,43,44,
45,46,47,48,49,50,51,52,53,54,55,56,57,
58,59,60,61,62,63,64,65,66,67,68,69,70,
71,72,73,74,75,76,77,78,79,80,81,82,83,
84]},
infinity]}},
{gen_server,call,
[\'replication_manager-default\',
{change_vbucket_replication,682,undefined},
infinity]}},
{gen_server,call,
[{\'janitor_agent-default\',
\'ns_1@172.23.123.125\'},
{if_rebalance,<0.2074.7>,
{update_vbucket_state,849,active,
undefined,undefined,
[[\'ns_1@172.23.123.125\',
\'ns_1@172.23.123.121\'],
[\'ns_1@172.23.123.125\']]}},
infinity]}}}}'
'code': 0, 'module': 'ns_vbucket_mover', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561057634L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:17.634Z', 'text': 'Bucket "default" rebalance does not seem to be swap rebalance'
'code': 0, 'module': 'ns_rebalancer', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561057562L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:17.562Z', 'text': 'Started rebalancing bucket default'
'code': 0, 'module': 'ns_orchestrator', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561057439L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:17.439Z', 'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.123.124','ns_1@172.23.123.125',
'ns_1@172.23.123.119','ns_1@172.23.123.121'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 1a5386077bf37ecdd570807354619b6e"
'code': 0, 'module': 'menelaus_web_buckets', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561053371L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:13.371Z', 'text': 'Updated bucket "default" (of type couchbase) properties:
[{num_replicas,0},{ram_quota,1493172224},{storage_mode,couchstore}]'
'code': 0, 'module': 'ns_orchestrator', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561045087L, 'shortText': 'message', 'serverTime': '2020-06-19T03:04:05.087Z', 'text': 'Rebalance completed successfully.
Rebalance Operation Id = e4e7e898047642367b4bfecef7f52fc8'
'code': 0, 'module': 'ns_vbucket_mover', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561036461L, 'shortText': 'message', 'serverTime': '2020-06-19T03:03:56.461Z', 'text': 'Bucket "default" rebalance does not seem to be swap rebalance (repeated 1 times, last seen 0.752024 secs ago)'
'code': 0, 'module': 'ns_rebalancer', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561036461L, 'shortText': 'message', 'serverTime': '2020-06-19T03:03:56.461Z', 'text': 'Started rebalancing bucket default (repeated 1 times, last seen 0.855567 secs ago)'
'code': 0, 'module': 'menelaus_web_buckets', 'type': 'info', 'node': 'ns_1@172.23.123.125', 'tstamp': 1592561036461L, 'shortText': 'message', 'serverTime': '2020-06-19T03:03:56.461Z', 'text': 'Updated bucket "default" (of type couchbase) properties:
[{num_replicas,1},{ram_quota,1493172224},{storage_mode,couchstore}] (repeated 1 times, last seen 5.654143 secs ago)'
Rebalance Failed: 'errorMessage': 'Rebalance failed. See logs for detailed reason. You can try again.', 'status': 'none' - rebalance failed
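To see which vbucket move was in flight when the mover crashed, the `{move,{...}}` action in the ns_vbucket_mover message can be picked apart with a small script. The following is a rough Python 3 sketch, not part of the test: the helper names are hypothetical, the sample text is an abbreviated copy of the message above with the backslash quoting removed, and reading the two node lists as the old and new replication chains is an assumption (it is consistent with the replica count going from 1 to 0). The node pattern also assumes the `ns_1@<IPv4>` naming used in this cluster.

```python
import re

# Head of a move action: {move,{VBucket,[first node list],[second node list],...
MOVE_RE = re.compile(r"\{move,\{(\d+),\s*\[([^\]]*)\],\s*\[([^\]]*)\]")
# Node names as they appear in this ticket (assumption: ns_1@<IPv4 address>).
NODE_RE = re.compile(r"ns_1@[\d.]+")


def parse_move(text):
    """Return (vbucket, old_chain, new_chain) from a mover message, or None if absent."""
    m = MOVE_RE.search(text)
    if m is None:
        return None
    vbucket = int(m.group(1))
    old_chain = NODE_RE.findall(m.group(2))
    new_chain = NODE_RE.findall(m.group(3))
    return vbucket, old_chain, new_chain


def dropped_nodes(old_chain, new_chain):
    """Nodes present in the old chain but missing from the new one."""
    return [node for node in old_chain if node not in new_chain]


# Abbreviated excerpt of the ns_vbucket_mover message from this ticket.
SAMPLE = (
    "Worker <0.11303.7> (for action {move,{849,\n"
    "['ns_1@172.23.123.125',\n"
    "'ns_1@172.23.123.121'],\n"
    "['ns_1@172.23.123.125'],\n"
    "[]}}) exited with reason"
)

if __name__ == "__main__":
    vbucket, old_chain, new_chain = parse_move(SAMPLE)
    print("vbucket:", vbucket)
    print("old chain:", old_chain)
    print("new chain:", new_chain)
    print("dropped:", dropped_nodes(old_chain, new_chain))
```

Under that reading, the failed move was for vbucket 849 and removed the replica on ns_1@172.23.123.121, leaving ns_1@172.23.123.125 as the only copy.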
cbcollect_logs:
https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_fail/collectinfo-2020-06-19T200246-ns_1%40172.23.105.155.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_fail/collectinfo-2020-06-19T200246-ns_1%40172.23.105.159.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_fail/collectinfo-2020-06-19T200246-ns_1%40172.23.105.205.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_fail/collectinfo-2020-06-19T200246-ns_1%40172.23.105.206.zip
Test execution link: http://qa.sc.couchbase.com/job/oel6-4node-rebalance_in_jython/983/console