Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-20155

[XDCR] rebalance exited with error

    XMLWordPrintable

Details

    • Untriaged
    • Unknown

    Description

      ./testrunner -i INI_FILE get-cbcollect-info=True,get-logs=False,stop-on-failure=False,get-coredumps=True,demand_encryption=1,fail_on_errors=1,GROUP=ALL -t xdcr.rebalanceXDCR.Rebalance.async_rebalance_in,items=100000,rdirection=bidirection,ctopology=chain,update=C1-C2,delete=C1-C2,expires=240,rebalance=C1-C2,num_rebalance=1,GROUP=P1

      This test failed with following error during rebalance:

      [2016-07-12 06:42:11,743] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:42:21,763] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:42:31,784] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:42:41,805] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:42:51,825] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:43:01,847] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:43:11,869] - [rest_client:1418] INFO - rebalance percentage : 21.28 %
      [2016-07-12 06:43:21,889] - [rest_client:1401] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
      [2016-07-12 06:43:22,070] - [rest_client:2605] INFO - Latest logs from UI on 172.23.105.20:
      [2016-07-12 06:43:22,070] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 2, u'text': u'Rebalance exited with reason {unexpected_exit,\n                              {\'EXIT\',<0.16720.23>,\n                               {{{{{child_interrupted,\n                                    {\'EXIT\',<0.5598.23>,normal}},\n                                   [{dcp_replicator,spawn_and_wait,1,\n                                     [{file,"src/dcp_replicator.erl"},\n                                      {line,231}]},\n                                    {dcp_replicator,handle_call,3,\n                                     [{file,"src/dcp_replicator.erl"},\n                                      {line,115}]},\n                                    {gen_server,handle_msg,5,\n                                     [{file,"gen_server.erl"},{line,585}]},\n                                    {proc_lib,init_p_do_apply,3,\n                                     [{file,"proc_lib.erl"},{line,239}]}]},\n                                  {gen_server,call,\n                                   [\'dcp_replicator-default-ns_1@172.23.105.67\',\n                                    {takeover,808},\n                                    infinity]}},\n                                 {gen_server,call,\n                                  [\'replication_manager-default\',\n                                   {change_vbucket_replication,465,undefined},\n                                   infinity]}},\n                                {gen_server,call,\n                                 [{\'janitor_agent-default\',\n                                   \'ns_1@172.23.105.20\'},\n                                  {if_rebalance,<0.9984.23>,\n                                   {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n                                  infinity]}}}}\n', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.403Z', u'module': u'ns_orchestrator', u'tstamp': 1468330014403, u'type': u'info'}
      [2016-07-12 06:43:22,070] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'<0.16626.23> exited with {unexpected_exit,\n                          {\'EXIT\',<0.16720.23>,\n                           {{{{{child_interrupted,{\'EXIT\',<0.5598.23>,normal}},\n                               [{dcp_replicator,spawn_and_wait,1,\n                                 [{file,"src/dcp_replicator.erl"},{line,231}]},\n                                {dcp_replicator,handle_call,3,\n                                 [{file,"src/dcp_replicator.erl"},{line,115}]},\n                                {gen_server,handle_msg,5,\n                                 [{file,"gen_server.erl"},{line,585}]},\n                                {proc_lib,init_p_do_apply,3,\n                                 [{file,"proc_lib.erl"},{line,239}]}]},\n                              {gen_server,call,\n                               [\'dcp_replicator-default-ns_1@172.23.105.67\',\n                                {takeover,808},\n                                infinity]}},\n                             {gen_server,call,\n                              [\'replication_manager-default\',\n                               {change_vbucket_replication,465,undefined},\n                               infinity]}},\n                            {gen_server,call,\n                             [{\'janitor_agent-default\',\'ns_1@172.23.105.20\'},\n                              {if_rebalance,<0.9984.23>,\n                               {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n                              infinity]}}}}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.399Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468330014399, u'type': u'critical'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Control connection to memcached on \'ns_1@172.23.105.20\' disconnected: {{badmatch,\n                                                                        {error,\n                                                                         timeout}},\n                                                                       [{mc_client_binary,\n                                                                         cmd_vocal_recv,\n                                                                         5,\n                                                                         [{file,\n                                                                           "src/mc_client_binary.erl"},\n                                                                          {line,\n                                                                           156}]},\n                                                                        {mc_client_binary,\n                                                                         select_bucket,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/mc_client_binary.erl"},\n                                                                          {line,\n                                                                           351}]},\n                                                                        {ns_memcached,\n                                                                         ensure_bucket,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           1299}]},\n                                                                        {ns_memcached,\n                                                                         handle_info,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           748}]},\n                                                                        {gen_server,\n                                                                         handle_msg,\n                                                                         5,\n                                                                         [{file,\n                                                                           "gen_server.erl"},\n                                                                          {line,\n                                                                           604}]},\n                                                                        {ns_memcached,\n                                                                         init,\n                                                                         1,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           177}]},\n                                                                        {gen_server,\n                                                                         init_it,\n                                                                         6,\n                                                                         [{file,\n                                                                           "gen_server.erl"},\n                                                                          {line,\n                                                                           304}]},\n                                                                        {proc_lib,\n                                                                         init_p_do_apply,\n                                                                         3,\n                                                                         [{file,\n                                                                           "proc_lib.erl"},\n                                                                          {line,\n                                                                           239}]}]}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:24:54.067Z', u'module': u'ns_memcached', u'tstamp': 1468329894067, u'type': u'info'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:42.137Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468329762137, u'type': u'info'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@172.23.107.53\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:41.072Z', u'module': u'ns_memcached', u'tstamp': 1468329761072, u'type': u'info'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.921Z', u'module': u'ns_rebalancer', u'tstamp': 1468329760921, u'type': u'info'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.105.67','ns_1@172.23.105.20',\n                                 'ns_1@172.23.107.53'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.887Z', u'module': u'ns_orchestrator', u'tstamp': 1468329760887, u'type': u'info'}
      [2016-07-12 06:43:22,071] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 3, u'text': u'Node ns_1@172.23.107.53 joined cluster', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.871Z', u'module': u'ns_cluster', u'tstamp': 1468329760871, u'type': u'info'}
      [2016-07-12 06:43:22,072] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 1, u'text': u'Couchbase Server has started on web port 8091 on node \'ns_1@172.23.107.53\'. Version: "4.1.2-6066-enterprise".', u'shortText': u'web start ok', u'serverTime': u'2016-07-12T06:22:40.821Z', u'module': u'menelaus_sup', u'tstamp': 1468329760821, u'type': u'info'}
      [2016-07-12 06:43:22,072] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.67', u'code': 4, u'text': u"Node 'ns_1@172.23.105.67' saw that node 'ns_1@172.23.107.53' came up. Tags: []", u'shortText': u'node up', u'serverTime': u'2016-07-12T06:22:36.518Z', u'module': u'ns_node_disco', u'tstamp': 1468329756518, u'type': u'info'}
      ERROR
      [('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('./testrunner.py', 282, 'run', '**self._Thread__kwargs)'), ('/usr/lib64/python2.7/unittest/runner.py', 151, 'run', 'test(result)'), ('/usr/lib64/python2.7/unittest/case.py', 433, '__call__', 'return self.run(*args, **kwds)'), ('/usr/lib64/python2.7/unittest/case.py', 369, 'run', 'testMethod()'), ('pytests/xdcr/rebalanceXDCR.py', 35, 'async_rebalance_in', 'self.dest_cluster.rebalance_in()'), ('pytests/xdcr/xdcrnewbasetests.py', 1839, 'rebalance_in', 'task.result()'), ('lib/tasks/future.py', 160, 'result', 'return self.__get_result()'), ('lib/tasks/future.py', 111, '__get_result', 'print traceback.extract_stack()')]
      [2016-07-12 06:43:22,075] - [xdcrnewbasetests:2398] INFO - Collecting logs @ 172.23.105.3
      grabbing cbcollect from 172.23.105.3
      [2016-07-12 06:43:22,076] - [remote_util:172] INFO - connecting to 172.23.105.3 with username : root password : couchbase ssh_key: 
      [('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('lib/tasks/taskmanager.py', 31, 'run', 'task.step(self)'), ('lib/tasks/task.py', 74, 'step', 'self.check(task_manager)'), ('lib/tasks/task.py', 480, 'check', 'self.set_exception(ex)'), ('lib/tasks/future.py', 264, 'set_exception', 'print traceback.extract_stack()')]
      Tue Jul 12 06:43:22 2016
      [2016-07-12 06:43:22,224] - [remote_util:206] INFO - Connected to 172.23.105.3
      Collecting logs from 172.23.105.3
       
      [2016-07-12 06:43:22,291] - [rest_client:2605] INFO - Latest logs from UI on 172.23.105.20:
      [2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 2, u'text': u'Rebalance exited with reason {unexpected_exit,\n                              {\'EXIT\',<0.16720.23>,\n                               {{{{{child_interrupted,\n                                    {\'EXIT\',<0.5598.23>,normal}},\n                                   [{dcp_replicator,spawn_and_wait,1,\n                                     [{file,"src/dcp_replicator.erl"},\n                                      {line,231}]},\n                                    {dcp_replicator,handle_call,3,\n                                     [{file,"src/dcp_replicator.erl"},\n                                      {line,115}]},\n                                    {gen_server,handle_msg,5,\n                                     [{file,"gen_server.erl"},{line,585}]},\n                                    {proc_lib,init_p_do_apply,3,\n                                     [{file,"proc_lib.erl"},{line,239}]}]},\n                                  {gen_server,call,\n                                   [\'dcp_replicator-default-ns_1@172.23.105.67\',\n                                    {takeover,808},\n                                    infinity]}},\n                                 {gen_server,call,\n                                  [\'replication_manager-default\',\n                                   {change_vbucket_replication,465,undefined},\n                                   infinity]}},\n                                {gen_server,call,\n                                 [{\'janitor_agent-default\',\n                                   \'ns_1@172.23.105.20\'},\n                                  {if_rebalance,<0.9984.23>,\n                                   {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n                                  infinity]}}}}\n', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.403Z', u'module': u'ns_orchestrator', u'tstamp': 1468330014403, u'type': u'info'}
      [2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'<0.16626.23> exited with {unexpected_exit,\n                          {\'EXIT\',<0.16720.23>,\n                           {{{{{child_interrupted,{\'EXIT\',<0.5598.23>,normal}},\n                               [{dcp_replicator,spawn_and_wait,1,\n                                 [{file,"src/dcp_replicator.erl"},{line,231}]},\n                                {dcp_replicator,handle_call,3,\n                                 [{file,"src/dcp_replicator.erl"},{line,115}]},\n                                {gen_server,handle_msg,5,\n                                 [{file,"gen_server.erl"},{line,585}]},\n                                {proc_lib,init_p_do_apply,3,\n                                 [{file,"proc_lib.erl"},{line,239}]}]},\n                              {gen_server,call,\n                               [\'dcp_replicator-default-ns_1@172.23.105.67\',\n                                {takeover,808},\n                                infinity]}},\n                             {gen_server,call,\n                              [\'replication_manager-default\',\n                               {change_vbucket_replication,465,undefined},\n                               infinity]}},\n                            {gen_server,call,\n                             [{\'janitor_agent-default\',\'ns_1@172.23.105.20\'},\n                              {if_rebalance,<0.9984.23>,\n                               {dcp_takeover,\'ns_1@172.23.105.67\',808}},\n                              infinity]}}}}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:26:54.399Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468330014399, u'type': u'critical'}
      [2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Control connection to memcached on \'ns_1@172.23.105.20\' disconnected: {{badmatch,\n                                                                        {error,\n                                                                         timeout}},\n                                                                       [{mc_client_binary,\n                                                                         cmd_vocal_recv,\n                                                                         5,\n                                                                         [{file,\n                                                                           "src/mc_client_binary.erl"},\n                                                                          {line,\n                                                                           156}]},\n                                                                        {mc_client_binary,\n                                                                         select_bucket,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/mc_client_binary.erl"},\n                                                                          {line,\n                                                                           351}]},\n                                                                        {ns_memcached,\n                                                                         ensure_bucket,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           1299}]},\n                                                                        {ns_memcached,\n                                                                         handle_info,\n                                                                         2,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           748}]},\n                                                                        {gen_server,\n                                                                         handle_msg,\n                                                                         5,\n                                                                         [{file,\n                                                                           "gen_server.erl"},\n                                                                          {line,\n                                                                           604}]},\n                                                                        {ns_memcached,\n                                                                         init,\n                                                                         1,\n                                                                         [{file,\n                                                                           "src/ns_memcached.erl"},\n                                                                          {line,\n                                                                           177}]},\n                                                                        {gen_server,\n                                                                         init_it,\n                                                                         6,\n                                                                         [{file,\n                                                                           "gen_server.erl"},\n                                                                          {line,\n                                                                           304}]},\n                                                                        {proc_lib,\n                                                                         init_p_do_apply,\n                                                                         3,\n                                                                         [{file,\n                                                                           "proc_lib.erl"},\n                                                                          {line,\n                                                                           239}]}]}', u'shortText': u'message', u'serverTime': u'2016-07-12T06:24:54.067Z', u'module': u'ns_memcached', u'tstamp': 1468329894067, u'type': u'info'}
      [2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:42.137Z', u'module': u'ns_vbucket_mover', u'tstamp': 1468329762137, u'type': u'info'}
      [2016-07-12 06:43:22,291] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@172.23.107.53\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:41.072Z', u'module': u'ns_memcached', u'tstamp': 1468329761072, u'type': u'info'}
      [2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.921Z', u'module': u'ns_rebalancer', u'tstamp': 1468329760921, u'type': u'info'}
      [2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.20', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.105.67','ns_1@172.23.105.20',\n                                 'ns_1@172.23.107.53'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.887Z', u'module': u'ns_orchestrator', u'tstamp': 1468329760887, u'type': u'info'}
      [2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 3, u'text': u'Node ns_1@172.23.107.53 joined cluster', u'shortText': u'message', u'serverTime': u'2016-07-12T06:22:40.871Z', u'module': u'ns_cluster', u'tstamp': 1468329760871, u'type': u'info'}
      [2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.107.53', u'code': 1, u'text': u'Couchbase Server has started on web port 8091 on node \'ns_1@172.23.107.53\'. Version: "4.1.2-6066-enterprise".', u'shortText': u'web start ok', u'serverTime': u'2016-07-12T06:22:40.821Z', u'module': u'menelaus_sup', u'tstamp': 1468329760821, u'type': u'info'}
      [2016-07-12 06:43:22,292] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.105.67', u'code': 4, u'text': u"Node 'ns_1@172.23.105.67' saw that node 'ns_1@172.23.107.53' came up. Tags: []", u'shortText': u'node up', u'serverTime': u'2016-07-12T06:22:36.518Z', u'module': u'ns_node_disco', u'tstamp': 1468329756518, u'type': u'info'}
      [('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'), ('lib/tasks/taskmanager.py', 31, 'run', 'task.step(self)'), ('lib/tasks/task.py', 74, 'step', 'self.check(task_manager)'), ('lib/tasks/task.py', 501, 'check', '"seems like rebalance hangs. please check logs!"))'), ('lib/tasks/future.py', 264, 'set_exception', 'print traceback.extract_stack()')]
      

      attaching logs and console output

      Attachments

        1. 172.23.105.102-20160712-0708-diag.zip
          21.26 MB
        2. 172.23.105.20-20160712-0647-diag.zip
          29.85 MB
        3. 172.23.105.3-20160712-0643-diag.zip
          26.72 MB
        4. 172.23.105.4-20160712-0645-diag.zip
          22.13 MB
        5. 172.23.105.67-20160712-0705-diag.zip
          24.19 MB
        6. 172.23.107.53-20160712-0707-diag.zip
          10.02 MB
        7. error.rtf
          69 kB
        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            arunkumar Arunkumar Senthilnathan (Inactive)
            arunkumar Arunkumar Senthilnathan (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty