Details
-
Bug
-
Resolution: Won't Fix
-
Major
-
5.0.0
-
Untriaged
-
-
Unknown
Description
- Create a cluster with 3+ nodes and a bucket in it.
- Enable auto-failover and set the timeout to 5 seconds.
- Restart one of the machines in the cluster.
- Wait for auto-failover to occur and for the machine to finish restarting.
- Add the node back to the cluster using either delta or full recovery.
- Rebalance the cluster.
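For reference, the auto-failover, add-back, and rebalance steps above correspond roughly to the following Couchbase REST calls. This is a minimal sketch, not the test's actual code: `BASE_URL` and the node names passed in are placeholder assumptions, and the helpers only build the requests (method, URL, form body) without sending them. The machine restart and the wait for failover happen outside the REST API and are not shown.

```python
from urllib.parse import urlencode

# Assumption: address of any node's admin port; replace with a real cluster node.
BASE_URL = "http://127.0.0.1:8091"


def enable_autofailover(timeout_sec=5):
    """Build the request that turns auto-failover on with the given timeout."""
    body = urlencode({"enabled": "true", "timeout": timeout_sec})
    return ("POST", BASE_URL + "/settings/autoFailover", body)


def set_recovery_type(otp_node, recovery_type):
    """Build the request that marks a failed-over node for delta or full recovery."""
    if recovery_type not in ("delta", "full"):
        raise ValueError("recovery_type must be 'delta' or 'full'")
    body = urlencode({"otpNode": otp_node, "recoveryType": recovery_type})
    return ("POST", BASE_URL + "/controller/setRecoveryType", body)


def rebalance(known_nodes):
    """Build the rebalance request; no nodes are ejected in this scenario."""
    body = urlencode({"knownNodes": ",".join(known_nodes), "ejectedNodes": ""})
    return ("POST", BASE_URL + "/controller/rebalance", body)


if __name__ == "__main__":
    # Node names are illustrative placeholders, not a statement about which
    # machine was restarted in the failing run.
    nodes = ["ns_1@172.23.98.79", "ns_1@172.23.98.80", "ns_1@172.23.98.81"]
    for request in (
        enable_autofailover(5),
        set_recovery_type(nodes[0], "delta"),
        rebalance(nodes),
    ):
        print(request)
```

Each tuple can be handed to any HTTP client together with administrator credentials; the rebalance request is the one that fails in this report.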
The rebalance fails with the following error stack:
Rebalance exited with reason \{unexpected_exit,\n \{\'EXIT\',<0.5036.1>,\n \{bulk_set_vbucket_state_failed,\n [\{\'ns_1@172.23.98.81\',\n \{\'EXIT\',\n \{\{\{\{\{case_clause,\n \{error,\n \{\{\{badmatch,\n \{error,\n \{\{badmatch,\{error,ehostunreach}},\n [\{mc_replication,connect,1,\n [\{file,\n "src/mc_replication.erl"},\n \{line,30}]},\n \{mc_replication,connect,1,\n [\{file,\n "src/mc_replication.erl"},\n \{line,49}]},\n \{dcp_proxy,connect,5,\n [\{file,"src/dcp_proxy.erl"},\n \{line,218}]},\n \{dcp_proxy,maybe_connect,2,\n [\{file,"src/dcp_proxy.erl"},\n \{line,201}]},\n \{dcp_producer_conn,init,2,\n [\{file,\n "src/dcp_producer_conn.erl"},\n \{line,31}]},\n \{dcp_proxy,init,1,\n [\{file,"src/dcp_proxy.erl"},\n \{line,50}]},\n \{gen_server,init_it,6,\n [\{file,"gen_server.erl"},\n \{line,304}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\n \{line,239}]}]}}},\n [\{dcp_replicator,init,1,\n [\{file,"src/dcp_replicator.erl"},\n \{line,50}]},\n \{gen_server,init_it,6,\n [\{file,"gen_server.erl"},\n \{line,304}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\n \{line,239}]}]},\n \{child,undefined,\n \{\'ns_1@172.23.98.79\',true},\n \{dcp_replicator,start_link,\n [\'ns_1@172.23.98.79\',"default",\n true]},\n temporary,60000,worker,\n [dcp_replicator]}}}},\n [\{dcp_sup,start_replicator,2,\n [\{file,"src/dcp_sup.erl"},\{line,54}]},\n \{dcp_sup,\n \'-manage_replicators/3-lc$^3/1-3-\',2,\n [\{file,"src/dcp_sup.erl"},\{line,81}]},\n \{dcp_replication_manager,handle_call,\n 3,\n [\{file,\n "src/dcp_replication_manager.erl"},\n \{line,87}]},\n \{gen_server,handle_msg,5,\n [\{file,"gen_server.erl"},\{line,585}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\{line,239}]}]},\n \{gen_server,call,\n [\'dcp_replication_manager-default\',\n \{manage_replicators,\n [\'ns_1@172.23.98.79\',\n \'ns_1@172.23.98.80\'],\n true},\n infinity]}},\n \{gen_server,call,\n [\'replication_manager-default\',\n \{change_vbucket_replication,341,\n 
\'ns_1@172.23.98.79\'},\n infinity]}},\n \{gen_server,call,\n [\{\'janitor_agent-default\',\n \'ns_1@172.23.98.81\'},\n \{if_rebalance,<0.4886.1>,\n \{update_vbucket_state,341,replica,\n undefined,\'ns_1@172.23.98.79\'}},\n infinity]}}}}]}}}', u'shortText': u'message', u'serverTime': u'2017-03-23T07:38:21.581Z', u'module': u'ns_orchestrator', u'tstamp': 1490279901581, u'type': u'critical'}
[2017-03-23 07:38:27,498] - [rest_client:2800] ERROR - \{u'node': u'ns_1@172.23.98.80', u'code': 0, u'text': u'<0.4912.1> exited with \{unexpected_exit,\n \{\'EXIT\',<0.5036.1>,\n \{bulk_set_vbucket_state_failed,\n [\{\'ns_1@172.23.98.81\',\n \{\'EXIT\',\n \{\{\{\{\{case_clause,\n \{error,\n \{\{\{badmatch,\n \{error,\n \{\{badmatch,\{error,ehostunreach}},\n [\{mc_replication,connect,1,\n [\{file,"src/mc_replication.erl"},\n \{line,30}]},\n \{mc_replication,connect,1,\n [\{file,"src/mc_replication.erl"},\n \{line,49}]},\n \{dcp_proxy,connect,5,\n [\{file,"src/dcp_proxy.erl"},\n \{line,218}]},\n \{dcp_proxy,maybe_connect,2,\n [\{file,"src/dcp_proxy.erl"},\n \{line,201}]},\n \{dcp_producer_conn,init,2,\n [\{file,"src/dcp_producer_conn.erl"},\n \{line,31}]},\n \{dcp_proxy,init,1,\n [\{file,"src/dcp_proxy.erl"},\n \{line,50}]},\n \{gen_server,init_it,6,\n [\{file,"gen_server.erl"},\n \{line,304}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\n \{line,239}]}]}}},\n [\{dcp_replicator,init,1,\n [\{file,"src/dcp_replicator.erl"},\n \{line,50}]},\n \{gen_server,init_it,6,\n [\{file,"gen_server.erl"},\{line,304}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\{line,239}]}]},\n \{child,undefined,\n \{\'ns_1@172.23.98.79\',true},\n \{dcp_replicator,start_link,\n [\'ns_1@172.23.98.79\',"default",true]},\n temporary,60000,worker,\n [dcp_replicator]}}}},\n [\{dcp_sup,start_replicator,2,\n [\{file,"src/dcp_sup.erl"},\{line,54}]},\n \{dcp_sup,\n \'-manage_replicators/3-lc$^3/1-3-\',2,\n [\{file,"src/dcp_sup.erl"},\{line,81}]},\n \{dcp_replication_manager,handle_call,3,\n [\{file,"src/dcp_replication_manager.erl"},\n \{line,87}]},\n \{gen_server,handle_msg,5,\n [\{file,"gen_server.erl"},\{line,585}]},\n \{proc_lib,init_p_do_apply,3,\n [\{file,"proc_lib.erl"},\{line,239}]}]},\n \{gen_server,call,\n [\'dcp_replication_manager-default\',\n \{manage_replicators,\n [\'ns_1@172.23.98.79\',\'ns_1@172.23.98.80\'],\n true},\n infinity]}},\n \{gen_server,call,\n 
[\'replication_manager-default\',\n \{change_vbucket_replication,341,\n \'ns_1@172.23.98.79\'},\n infinity]}},\n \{gen_server,call,\n [\{\'janitor_agent-default\',\'ns_1@172.23.98.81\'},\n \{if_rebalance,<0.4886.1>,\n \{update_vbucket_state,341,replica,undefined,\n \'ns_1@172.23.98.79\'}},\n infinity]}}}}]}}}
The test can be found here: http://qa.sc.couchbase.com/job/cen006-nserv-autofailover-machine-restart/12/consoleFull (tests 17, 18, and 19 in the suite fail because of this issue).
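The error terms above are deeply nested, so the innermost reason is easy to miss: `{error,ehostunreach}` (the POSIX "no route to host" error) raised from `mc_replication:connect/1`. The helper below is a rough sketch for pulling the key facts out of such a log line. The function name and the regular expressions are illustrative assumptions tuned to the escaping seen in this ticket (`\{` and `\'`); it is not a general Erlang term parser.

```python
import re


def summarize_rebalance_error(text):
    """Extract the failed step, leaf error reasons, and node names from a log line."""
    # Undo the escaping seen in the ticket: "\{" -> "{" and "\'" -> "'".
    cleaned = text.replace("\\{", "{").replace("\\'", "'")

    # Leaf reasons look like {error,some_atom}; nested {error,{...}} terms are skipped.
    reasons = re.findall(r"\{error,\s*([a-z_]+)\}", cleaned)

    # Node names appear as quoted atoms such as 'ns_1@172.23.98.81'.
    nodes = re.findall(r"'(ns_1@[\d.]+)'", cleaned)

    # The first "<something>_failed" tag names the step that aborted the rebalance.
    failed = re.search(r"\{(\w+_failed),", cleaned)

    return {
        "failed_step": failed.group(1) if failed else None,
        "reasons": list(dict.fromkeys(reasons)),  # de-duplicate, keep order
        "nodes": sorted(set(nodes)),
    }


if __name__ == "__main__":
    # Shortened excerpt in the same escaped form as the log lines in this ticket.
    sample = (
        r"\{bulk_set_vbucket_state_failed,\n [\{\'ns_1@172.23.98.81\',\n "
        r"\{\{badmatch,\{error,ehostunreach}},\n "
        r"\{dcp_replicator,start_link,\n [\'ns_1@172.23.98.79\',true]}"
    )
    print(summarize_rebalance_error(sample))
```

Run against the full log lines, this should report `bulk_set_vbucket_state_failed` as the failed step and `ehostunreach` as the only leaf reason, which points at network reachability between the nodes at the time of the rebalance.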