Details
-
Bug
-
Resolution: Won't Do
-
Major
-
4.1.2
-
Untriaged
-
Unknown
Description
./testrunner -i INI_FILE get-cbcollect-info=True,get-logs=False,stop-on-failure=False,get-coredumps=True,demand_encryption=1,fail_on_errors=1,GROUP=ALL -t xdcr.rebalanceXDCR.Rebalance.async_rebalance_in,items=100000,rdirection=bidirection,ctopology=chain,update=C1-C2,delete=C1-C2,expires=240,rebalance=C1-C2,num_rebalance=1,GROUP=P1
This test failed with following error during rebalance:
[2016-07-12 06:42:11,743] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:42:21,763] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:42:31,784] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:42:41,805] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:42:51,825] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:43:01,847] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:43:11,869] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
|
[2016-07-12 06:43:21,889] - [rest_client:1401] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
|
[2016-07-12 06:43:22,070] - [rest_client:2605] INFO - Latest logs from UI on 172.23.105.20:
|
[2016-07-12 06:43:22,070] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 2, u'text': u'Rebalance exited with reason {unexpected_exit,\n {\'EXIT\',<0.16720.23>,\n {{{{{child_interrupted,\n {\'EXIT\',<0.5598.23>,normal}},\n [{dcp_replicator,spawn_and_wait,1,\n [{file,"src/dcp_replicator.erl"},\n {line,231}]},\n {dcp_replicator,handle_call,3,\n [{file,"src/dcp_replicator.erl"},\n {line,115}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,585}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'dcp_replicator-default-ns_1@172.23.105.67\',\n {takeover,808},\n infinity]}},\n {gen_server,call,\n [\'replication_manager-default\',\n {change_vbucket_replication,465,undefined},\n infinity]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.105.20\'},\n {if_rebalance,<0.9984.23>,\n {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n infinity]}}}}\n', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.403Z', u'module': u'ns_orchestrator', u'tstamp': 1468330014403, u'type': u'info'}
|
[2016-07-12 06:43:22,070] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'<0.16626.23> exited with {unexpected_exit,\n {\'EXIT\',<0.16720.23>,\n {{{{{child_interrupted,{\'EXIT\',<0.5598.23>,normal}},\n [{dcp_replicator,spawn_and_wait,1,\n [{file,"src/dcp_replicator.erl"},{line,231}]},\n {dcp_replicator,handle_call,3,\n [{file,"src/dcp_replicator.erl"},{line,115}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,585}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'dcp_replicator-default-ns_1@172.23.105.67\',\n {takeover,808},\n infinity]}},\n {gen_server,call,\n [\'replication_manager-default\',\n {change_vbucket_replication,465,undefined},\n infinity]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\'ns_1@172.23.105.20\'},\n {if_rebalance,<0.9984.23>,\n {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n infinity]}}}}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.399Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468330014399, u'type': u'critical'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Control connection to memcached on \'ns_1@172.23.105.20\' disconnected: {{badmatch,\n {error,\n timeout}},\n [{mc_client_binary,\n cmd_vocal_recv,\n 5,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 156}]},\n {mc_client_binary,\n select_bucket,\n 2,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 351}]},\n {ns_memcached,\n ensure_bucket,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 1299}]},\n {ns_memcached,\n handle_info,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 748}]},\n {gen_server,\n handle_msg,\n 5,\n [{file,\n "gen_server.erl"},\n {line,\n 604}]},\n {ns_memcached,\n init,\n 1,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 177}]},\n {gen_server,\n init_it,\n 6,\n [{file,\n "gen_server.erl"},\n {line,\n 304}]},\n {proc_lib,\n init_p_do_apply,\n 3,\n [{file,\n "proc_lib.erl"},\n {line,\n 239}]}]}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:24:54.067Z', u'module': u'ns_memcached', u'tstamp': 1468329894067, u'type': u'info'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:42.137Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468329762137, u'type': u'info'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@172.23.107.53\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:41.072Z', u'module': u'ns_memcached', u'tstamp': 1468329761072, u'type': u'info'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.921Z', u'module': u'ns_rebalancer', u'tstamp': 1468329760921, u'type': u'info'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.105.67','ns_1@172.23.105.20',\n 'ns_1@172.23.107.53'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.887Z', u'module': u'ns_orchestrator', u'tstamp': 1468329760887, u'type': u'info'}
|
[2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 3, u'text': u'Node ns_1@172.23.107.53 joined cluster', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.871Z', u'module': u'ns_cluster', u'tstamp': 1468329760871, u'type': u'info'}
|
[2016-07-12 06:43:22,072] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 1, u'text': u'Couchbase Server has started on web port 8091 on node \'ns_1@172.23.107.53\'. Version: "4.1.2-6066-enterprise".', u'shortText': u'web start ok', u'serverTime': u'2016-07-12T06:22:40.821Z', u'module': u'menelaus_sup', u'tstamp': 1468329760821, u'type': u'info'}
|
[2016-07-12 06:43:22,072] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.67', u'code': 4, u'text': u"Node 'ns_1@172.23.105.67' saw that node 'ns_1@172.23.107.53' came up. Tags: []", u'shortText': u'node up', u'serverTime': u'2016-07-12T06:22:36.518Z', u'module': u'ns_node_disco', u'tstamp': 1468329756518, u'type': u'info'}
|
ERROR
|
[('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('./testrunner.py', 282, 'run', '**self._Thread__kwargs)'), ('/usr/lib64/python2.7/unittest/runner.py', 151, 'run', 'test(result)'), ('/usr/lib64/python2.7/unittest/case.py', 433, '__call__', 'return self.run(*args, **kwds)'), ('/usr/lib64/python2.7/unittest/case.py', 369, 'run', 'testMethod()'), ('pytests/xdcr/rebalanceXDCR.py', 35, 'async_rebalance_in', 'self.dest_cluster.rebalance_in()'), ('pytests/xdcr/xdcrnewbasetests.py', 1839, 'rebalance_in', 'task.result()'), ('lib/tasks/future.py', 160, 'result', 'return self.__get_result()'), ('lib/tasks/future.py', 111, '__get_result', 'print traceback.extract_stack()')]
|
[2016-07-12 06:43:22,075] - [xdcrnewbasetests:2398] INFO - Collecting logs @ 172.23.105.3
|
grabbing cbcollect from 172.23.105.3
|
[2016-07-12 06:43:22,076] - [remote_util:172] INFO - connecting to 172.23.105.3 with username : root password : couchbase ssh_key:
|
[('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('lib/tasks/taskmanager.py', 31, 'run', 'task.step(self)'), ('lib/tasks/task.py', 74, 'step', 'self.check(task_manager)'), ('lib/tasks/task.py', 480, 'check', 'self.set_exception(ex)'), ('lib/tasks/future.py', 264, 'set_exception', 'print traceback.extract_stack()')]
|
Tue Jul 12 06:43:22 2016
|
[2016-07-12 06:43:22,224] - [remote_util:206] INFO - Connected to 172.23.105.3
|
Collecting logs from 172.23.105.3
|
|
[2016-07-12 06:43:22,291] - [rest_client:2605] INFO - Latest logs from UI on 172.23.105.20:
|
[2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 2, u'text': u'Rebalance exited with reason {unexpected_exit,\n {\'EXIT\',<0.16720.23>,\n {{{{{child_interrupted,\n {\'EXIT\',<0.5598.23>,normal}},\n [{dcp_replicator,spawn_and_wait,1,\n [{file,"src/dcp_replicator.erl"},\n {line,231}]},\n {dcp_replicator,handle_call,3,\n [{file,"src/dcp_replicator.erl"},\n {line,115}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,585}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'dcp_replicator-default-ns_1@172.23.105.67\',\n {takeover,808},\n infinity]}},\n {gen_server,call,\n [\'replication_manager-default\',\n {change_vbucket_replication,465,undefined},\n infinity]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.105.20\'},\n {if_rebalance,<0.9984.23>,\n {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n infinity]}}}}\n', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.403Z', u'module': u'ns_orchestrator', u'tstamp': 1468330014403, u'type': u'info'}
|
[2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'<0.16626.23> exited with {unexpected_exit,\n {\'EXIT\',<0.16720.23>,\n {{{{{child_interrupted,{\'EXIT\',<0.5598.23>,normal}},\n [{dcp_replicator,spawn_and_wait,1,\n [{file,"src/dcp_replicator.erl"},{line,231}]},\n {dcp_replicator,handle_call,3,\n [{file,"src/dcp_replicator.erl"},{line,115}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,585}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'dcp_replicator-default-ns_1@172.23.105.67\',\n {takeover,808},\n infinity]}},\n {gen_server,call,\n [\'replication_manager-default\',\n {change_vbucket_replication,465,undefined},\n infinity]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\'ns_1@172.23.105.20\'},\n {if_rebalance,<0.9984.23>,\n {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n infinity]}}}}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.399Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468330014399, u'type': u'critical'}
|
[2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Control connection to memcached on \'ns_1@172.23.105.20\' disconnected: {{badmatch,\n {error,\n timeout}},\n [{mc_client_binary,\n cmd_vocal_recv,\n 5,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 156}]},\n {mc_client_binary,\n select_bucket,\n 2,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 351}]},\n {ns_memcached,\n ensure_bucket,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 1299}]},\n {ns_memcached,\n handle_info,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 748}]},\n {gen_server,\n handle_msg,\n 5,\n [{file,\n "gen_server.erl"},\n {line,\n 604}]},\n {ns_memcached,\n init,\n 1,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 177}]},\n {gen_server,\n init_it,\n 6,\n [{file,\n "gen_server.erl"},\n {line,\n 304}]},\n {proc_lib,\n init_p_do_apply,\n 3,\n [{file,\n "proc_lib.erl"},\n {line,\n 239}]}]}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:24:54.067Z', u'module': u'ns_memcached', u'tstamp': 1468329894067, u'type': u'info'}
|
[2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:42.137Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468329762137, u'type': u'info'}
|
[2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@172.23.107.53\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:41.072Z', u'module': u'ns_memcached', u'tstamp': 1468329761072, u'type': u'info'}
|
[2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.921Z', u'module': u'ns_rebalancer', u'tstamp': 1468329760921, u'type': u'info'}
|
[2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.105.67','ns_1@172.23.105.20',\n 'ns_1@172.23.107.53'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.887Z', u'module': u'ns_orchestrator', u'tstamp': 1468329760887, u'type': u'info'}
|
[2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 3, u'text': u'Node ns_1@172.23.107.53 joined cluster', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.871Z', u'module': u'ns_cluster', u'tstamp': 1468329760871, u'type': u'info'}
|
[2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 1, u'text': u'Couchbase Server has started on web port 8091 on node \'ns_1@172.23.107.53\'. Version: "4.1.2-6066-enterprise".', u'shortText': u'web start ok', u'serverTime': u'2016-07-12T06:22:40.821Z', u'module': u'menelaus_sup', u'tstamp': 1468329760821, u'type': u'info'}
|
[2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.67', u'code': 4, u'text': u"Node 'ns_1@172.23.105.67' saw that node 'ns_1@172.23.107.53' came up. Tags: []", u'shortText': u'node up', u'serverTime': u'2016-07-12T06:22:36.518Z', u'module': u'ns_node_disco', u'tstamp': 1468329756518, u'type': u'info'}
|
[('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('lib/tasks/taskmanager.py', 31, 'run', 'task.step(self)'), ('lib/tasks/task.py', 74, 'step', 'self.check(task_manager)'), ('lib/tasks/task.py', 501, 'check', '"seems like rebalance hangs. please check logs!"))'), ('lib/tasks/future.py', 264, 'set_exception', 'print traceback.extract_stack()')]
|
attaching logs and console output