Details
Bug
Resolution: Duplicate
Major
6.5.0
6.5.0-4926
Untriaged
Centos 64-bit
Unknown
Description
Script to Repro
./testrunner -i /tmp/testexec.12412.ini -p get-cbcollect-info=False,get-cbcollect-info=True -t rebalance.rebalance_progress.RebalanceProgressTests.test_progress_add_back_after_failover,nodes_init=4,nodes_out=1,GROUP=P1,blob_generator=false
Steps to Repro
1) Create a 4-node cluster
[2019-12-05 16:12:12,135] - [rest_client:1503] INFO - rebalance params : {'password': 'password', 'ejectedNodes': '', 'user': 'Administrator', 'knownNodes': u'ns_1@172.23.105.237,ns_1@172.23.105.236,ns_1@172.23.97.66,ns_1@172.23.97.65'}
2) Load data
3) Create views
[2019-12-05 16:12:39,668] - [rest_client:552] INFO - index query url: http://172.23.105.236:8092/default/_design/default_view/_view/default_view0?stale=ok
[2019-12-05 16:12:39,783] - [task:2344] INFO - view : default_view0 was created successfully in ddoc: default_view
[2019-12-05 16:12:39,790] - [rest_client:552] INFO - index query url: http://172.23.105.236:8092/default/_design/default_view/_view/default_view1?stale=ok
[2019-12-05 16:12:39,799] - [task:2344] INFO - view : default_view1 was created successfully in ddoc: default_view
[2019-12-05 16:12:39,806] - [rest_client:552] INFO - index query url: http://172.23.105.236:8092/default/_design/default_view/_view/default_view2?stale=ok
[2019-12-05 16:12:39,815] - [task:2344] INFO - view : default_view2 was created successfully in ddoc: default_view
4) Start failover
[2019-12-05 16:12:44,664] - [rest_client:1448] INFO - fail_over node ns_1@172.23.97.66 successful
[2019-12-05 16:12:44,664] - [task:3508] INFO - 0 seconds sleep after failover, for nodes to go pending....
5) Do recovery
[2019-12-05 16:12:44,695] - [rest_client:1481] INFO - add_back_node ns_1@172.23.97.66 successful
6) Start rebalance
[2019-12-05 16:12:45,704] - [rest_client:1503] INFO - rebalance params : {'password': 'password', 'ejectedNodes': '', 'user': 'Administrator', 'knownNodes': u'ns_1@172.23.105.237,ns_1@172.23.105.236,ns_1@172.23.97.66,ns_1@172.23.97.65'}
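For anyone reproducing steps 4 to 6 by hand rather than through testrunner, the sequence can be sketched as three REST calls against the cluster. This is only a sketch: the endpoint paths below are assumptions taken from the public Couchbase cluster REST API, not from the test's own rest_client, and `repro_requests` is a hypothetical helper. It only builds the request descriptions and sends nothing; credentials are left out.

```python
# Sketch only: builds the failover / add-back / rebalance requests as plain data.
# The endpoint paths are assumptions based on the public Couchbase REST API;
# the test's rest_client may issue different calls.

def repro_requests(known_nodes, failed_node):
    """Return (method, path, form_params) tuples for steps 4-6 of the repro."""
    if failed_node not in known_nodes:
        raise ValueError("failed_node must be one of known_nodes")
    return [
        # Step 4: fail over the node.
        ("POST", "/controller/failOver", {"otpNode": failed_node}),
        # Step 5: add the failed-over node back.
        ("POST", "/controller/reAddNode", {"otpNode": failed_node}),
        # Step 6: rebalance with all four nodes known and none ejected,
        # matching the 'rebalance params' log line in the report.
        ("POST", "/controller/rebalance",
         {"knownNodes": ",".join(known_nodes), "ejectedNodes": ""}),
    ]


nodes = [
    "ns_1@172.23.105.237",
    "ns_1@172.23.105.236",
    "ns_1@172.23.97.66",
    "ns_1@172.23.97.65",
]

for method, path, params in repro_requests(nodes, "ns_1@172.23.97.66"):
    print(method, path, params)
```

Each tuple can be sent with any HTTP client to the cluster's admin port, using the Administrator credentials from the test ini.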
While monitoring rebalance progress, the rebalance fails as shown below.
[2019-12-05 16:12:55,786] - [rest_client:3330] INFO - Latest logs from UI on 172.23.105.236:
[2019-12-05 16:12:55,786] - [rest_client:3331] ERROR - {u'node': u'ns_1@172.23.105.236', u'code': 0, u'text': u'Rebalance exited with reason {mover_crashed,\n {unexpected_exit,\n {\'EXIT\',<0.6111.53>,\n {{error,\n {badrpc,\n {\'EXIT\',\n {{{{badmatch,{error,dcp_conn_closed}},\n [{couch_set_view_group,\n process_monitor_partition_update,4,\n [{file,\n "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couch_set_view/src/couch_set_view_group.erl"},\n {line,3725}]},\n {couch_set_view_group,handle_call,3,\n [{file,\n "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couch_set_view/src/couch_set_view_group.erl"},\n {line,934}]},\n {gen_server,try_handle_call,4,\n [{file,"gen_server.erl"},{line,636}]},\n {gen_server,handle_msg,6,\n [{file,"gen_server.erl"},{line,665}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,247}]}]},\n {gen_server,call,\n [<12906.523.0>,\n {monitor_partition_update,1022,\n #Ref<12906.2127266913.807927810.106982>,\n <12906.570.0>},\n infinity]}},\n {gen_server,call,\n [\'capi_set_view_manager-default\',\n {wait_index_updated,1020},\n infinity]}}}}},\n {gen_server,call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.97.66\'},\n {if_rebalance,<0.5992.53>,\n {wait_index_updated,1022}},\n infinity]}}}}}.\nRebalance Operation Id = 81d9bb9bcc412a00769c9b7caf5f3683', u'shortText': u'message', u'serverTime': u'2019-12-05T16:12:50.209Z', u'module': u'ns_orchestrator', u'tstamp': 1575591170209, u'type': u'critical'}
cbcollect_info attached.
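
When comparing this failure against other tickets (the issue is resolved as a duplicate), it can help to reduce the rebalance error text to a few fields. The following is a minimal sketch, not part of testrunner: `summarize_rebalance_failure` is a hypothetical helper, and `sample` is an abridged excerpt of the error text reported above.

```python
import re


def summarize_rebalance_failure(text):
    """Extract the exit reason, error atoms, awaited vBuckets and operation id."""
    reason = re.search(r"Rebalance exited with reason \{(\w+)", text)
    # Only matches leaf errors such as {error,dcp_conn_closed}; nested
    # {error,{badrpc,...}} wrappers are skipped because '{' is not a word char.
    errors = re.findall(r"\{error,\s*(\w+)\}", text)
    vbuckets = re.findall(r"\{wait_index_updated,\s*(\d+)\}", text)
    op_id = re.search(r"Rebalance Operation Id = (\w+)", text)
    return {
        "reason": reason.group(1) if reason else None,
        "errors": sorted(set(errors)),
        "wait_index_updated": sorted({int(v) for v in vbuckets}),
        "operation_id": op_id.group(1) if op_id else None,
    }


# Abridged excerpt of the error text from this report.
sample = (
    "Rebalance exited with reason {mover_crashed,\n {unexpected_exit,\n"
    " {{{{badmatch,{error,dcp_conn_closed}},\n"
    " {wait_index_updated,1020},\n"
    " {if_rebalance,<0.5992.53>,\n {wait_index_updated,1022}},\n"
    "Rebalance Operation Id = 81d9bb9bcc412a00769c9b7caf5f3683"
)

print(summarize_rebalance_failure(sample))
```

For this report the summary points at `dcp_conn_closed` in the view engine while the mover waited on index updates for vBuckets 1020 and 1022.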