Couchbase Server / MB-10757

Error message shown when rebalance fails should be improved to describe the actual issue more closely.

Details

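    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: feature-backlog
    • Affects Version/s: 3.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels: None
    • Environment: Build 3.0 - 540
    • Triage: Untriaged
    • Operating System: Centos 32-bit
    • Is this a Regression?: Unknown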

    Description

      [Jenkins execution]

      http://qa.hq.northscale.net/job/ubuntu_x64--37_02--biXDCR-P1/10/consoleFull

      [Notes]
      The test failed during setup itself. No changes were made on the node.

      [Diagnostic logs collected immediately after the test]

      https://s3.amazonaws.com/bugdb/jira/MB-10757/29c16556/test_8.zip

      [All Logs]
      10.3.121.56 : https://s3.amazonaws.com/bugdb/jira/MB-10757/b663fc93/10.3.121.56-432014-228-diag.zip
      10.3.121.57 : https://s3.amazonaws.com/bugdb/jira/MB-10757/78851953/10.3.121.57-432014-2210-diag.zip
      10.3.4.244 : https://s3.amazonaws.com/bugdb/jira/MB-10757/a6649c42/10.3.4.244-432014-229-diag.zip
      10.3.121.59 : https://s3.amazonaws.com/bugdb/jira/MB-10757/95ff76ce/10.3.121.59-432014-2212-diag.zip
      10.3.121.60 : https://s3.amazonaws.com/bugdb/jira/MB-10757/9ab9fd4f/10.3.121.60-432014-2211-diag.zip
      10.3.121.61 : https://s3.amazonaws.com/bugdb/jira/MB-10757/7ba300cf/10.3.121.61-432014-2213-diag.zip

      [Test logs] -> The test was executed with 128 vbuckets.

      ./testrunner -i /tmp/ubuntu-64-2.0-biXDCR-all.ini get-logs=True -t xdcr.biXDCR.bidirectional.load_with_async_ops_and_joint_sets_with_warmup_master,items=10000,ctopology=chain,rdirection=bidirection,standard_buckets=1,doc-ops=create-delete,doc-ops-dest=create-update-delete,upd=30,del=30,replication_type=xmem,GROUP=P0;xmem

      Test Input params:

      {'doc-ops': 'create-delete', 'GROUP': 'P0;xmem', 'case_number': 8, 'items': '10000', 'upd': '30', 'standard_buckets': '1', 'conf_file': 'conf/py-xdcr-bidirectional.conf', 'num_nodes': 6, 'cluster_name': 'ubuntu-64-2.0-biXDCR-all', 'ctopology': 'chain', 'rdirection': 'bidirection', 'del': '30', 'ini': '/tmp/ubuntu-64-2.0-biXDCR-all.ini', 'doc-ops-dest': 'create-update-delete', 'replication_type': 'xmem', 'get-logs': 'True', 'spec': 'py-xdcr-bidirectional'}
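
      To make the relation between the `-t` argument and the input params above easier to follow, here is a minimal sketch of how such a comma-separated spec could be split into a test name and a dict. `parse_test_spec` is a hypothetical helper written for illustration, not testrunner's actual parser; it covers only the keys given on the command line (keys such as `case_number` or `conf_file` come from elsewhere) and it leaves every value as a string, as in the dump above.

```python
def parse_test_spec(spec):
    """Split 'module.Class.test,key=value,...' into (test_name, params).

    Hypothetical helper for illustration; not part of testrunner.
    """
    # Everything before the first comma is the dotted test name.
    test_name, _, rest = spec.partition(",")
    params = {}
    for pair in rest.split(","):
        if not pair:
            continue  # tolerate an empty tail or doubled commas
        # Split on the first '=' only, so values such as 'P0;xmem' stay intact.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_name, params


spec = (
    "xdcr.biXDCR.bidirectional."
    "load_with_async_ops_and_joint_sets_with_warmup_master,"
    "items=10000,ctopology=chain,rdirection=bidirection,standard_buckets=1,"
    "doc-ops=create-delete,doc-ops-dest=create-update-delete,upd=30,del=30,"
    "replication_type=xmem,GROUP=P0;xmem"
)
name, params = parse_test_spec(spec)
print(name)
print(params)
```

      Values are not converted, so `params["items"]` is the string `"10000"`, which matches the quoted values in the input params.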

      [2014-04-03 10:07:57,574] - [xdcrbasetests:216] INFO - Initializing input parameters started...
      [2014-04-03 10:07:57,575] - [xdcrbasetests:1031] INFO - Setting xdcrFailureRestartInterval to 1 ..
      [2014-04-03 10:07:57,588] - [rest_client:1602] INFO - Update internal setting xdcrFailureRestartInterval=1
      [2014-04-03 10:07:57,606] - [rest_client:1602] INFO - Update internal setting xdcrFailureRestartInterval=1
      [2014-04-03 10:07:57,608] - [xdcrbasetests:323] INFO - Initializing input parameters completed.
      [2014-04-03 10:07:57,611] - [xdcrbasetests:92] INFO - ============== XDCRbasetests setup was started for test #8 load_with_async_ops_and_joint_sets_with_warmup_master==============
      [2014-04-03 10:07:57,654] - [xdcrbasetests:434] INFO - cleanup cluster1: [ip:10.3.121.56 port:8091 ssh_username:root, ip:10.3.121.57 port:8091 ssh_username:root, ip:10.3.4.244 port:8091 ssh_username:root]
      [2014-04-03 10:07:57,682] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.56
      [2014-04-03 10:07:57,712] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.56:8091
      [2014-04-03 10:07:57,720] - [cluster_helper:80] INFO - ns_server @ 10.3.121.56:8091 is running
      [2014-04-03 10:07:57,732] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.57
      [2014-04-03 10:07:57,766] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.57:8091
      [2014-04-03 10:07:57,774] - [cluster_helper:80] INFO - ns_server @ 10.3.121.57:8091 is running
      [2014-04-03 10:07:57,801] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.4.244
      [2014-04-03 10:07:57,828] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.4.244:8091
      [2014-04-03 10:07:57,841] - [cluster_helper:80] INFO - ns_server @ 10.3.4.244:8091 is running
      [2014-04-03 10:07:57,842] - [xdcrbasetests:434] INFO - cleanup cluster2: [ip:10.3.121.59 port:8091 ssh_username:root, ip:10.3.121.60 port:8091 ssh_username:root, ip:10.3.121.61 port:8091 ssh_username:root]
      [2014-04-03 10:07:57,876] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.59
      [2014-04-03 10:07:57,916] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.59:8091
      [2014-04-03 10:07:57,928] - [cluster_helper:80] INFO - ns_server @ 10.3.121.59:8091 is running
      [2014-04-03 10:07:57,954] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.60
      [2014-04-03 10:07:57,987] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.60:8091
      [2014-04-03 10:07:57,997] - [cluster_helper:80] INFO - ns_server @ 10.3.121.60:8091 is running
      [2014-04-03 10:07:58,015] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.61
      [2014-04-03 10:07:58,046] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.61:8091
      [2014-04-03 10:07:58,057] - [cluster_helper:80] INFO - ns_server @ 10.3.121.61:8091 is running
      [2014-04-03 10:07:58,626] - [rest_client:1670] INFO - Consistent-views during rebalance was set as indexAwareRebalanceDisabled=true
      [2014-04-03 10:07:58,627] - [rest_client:719] INFO - settings/web params on 10.3.121.56:8091:username=Administrator&password=password&port=8091
      [2014-04-03 10:07:58,650] - [rest_client:739] INFO - pools/default params : memoryQuota=2114&username=Administrator&password=password
      [2014-04-03 10:07:58,671] - [rest_client:1670] INFO - Consistent-views during rebalance was set as indexAwareRebalanceDisabled=true
      [2014-04-03 10:07:58,672] - [rest_client:719] INFO - settings/web params on 10.3.121.57:8091:username=Administrator&password=password&port=8091
      [2014-04-03 10:07:58,694] - [rest_client:739] INFO - pools/default params : memoryQuota=2114&username=Administrator&password=password
      [2014-04-03 10:07:58,715] - [rest_client:1670] INFO - Consistent-views during rebalance was set as indexAwareRebalanceDisabled=true
      [2014-04-03 10:07:58,716] - [rest_client:719] INFO - settings/web params on 10.3.4.244:8091:username=Administrator&password=password&port=8091
      [2014-04-03 10:07:58,736] - [rest_client:739] INFO - pools/default params : memoryQuota=2114&username=Administrator&password=password
      [2014-04-03 10:07:59,745] - [task:279] INFO - adding node 10.3.121.57:8091 to cluster
      [2014-04-03 10:07:59,745] - [rest_client:888] INFO - adding remote node @10.3.121.57:8091 to this cluster @10.3.121.56:8091
      [2014-04-03 10:08:29,796] - [rest_client:702] ERROR - http://10.3.121.56:8091/controller/addNode error 400 reason: unknown ["Join completion call failed. Timeout connecting to \"10.3.121.57\" on port \"8091\". This could be due to an incorrect host/port combination or a firewall in place between the servers."]
      [2014-04-03 10:08:29,796] - [rest_client:918] ERROR - add_node error : ["Join completion call failed. Timeout connecting to \"10.3.121.57\" on port \"8091\". This could be due to an incorrect host/port combination or a firewall in place between the servers."]
      [('/usr/lib/python2.7/threading.py', 524, '__bootstrap', 'self.__bootstrap_inner()'), ('/usr/lib/python2.7/threading.py', 551, '__bootstrap_inner', 'self.run()'), ('lib/tasks/taskmanager.py', 31, 'run', 'task.step(self)'), ('lib/tasks/task.py', 56, 'step', 'self.execute(task_manager)'), ('lib/tasks/task.py', 274, 'execute', 'self.set_exception(e)'), ('lib/tasks/future.py', 264, 'set_exception', 'print traceback.extract_stack()')]
      Thu Apr 3 10:08:29 2014
      [('./testrunner', 330, '<module>', 'result = unittest.TextTestRunner(verbosity=2).run(suite)'), ('/usr/lib/python2.7/unittest/runner.py', 151, 'run', 'test(result)'), ('/usr/lib/python2.7/unittest/suite.py', 70, '_call', 'return self.run(*args, **kwds)'), ('/usr/lib/python2.7/unittest/suite.py', 108, 'run', 'test(result)'), ('/usr/lib/python2.7/unittest/case.py', 391, 'call', 'return self.run(*args, **kwds)'), ('/usr/lib/python2.7/unittest/case.py', 318, 'run', 'self.setUp()'), ('pytests/xdcr/biXDCR.py', 14, 'setUp', 'super(bidirectional, self).setUp()'), ('pytests/xdcr/xdcrbasetests.py', 96, 'setUp', 'self._init_clusters(self._disabled_consistent_view)'), ('pytests/xdcr/xdcrbasetests.py', 410, '_init_clusters', 'self._setup_cluster(self._clusters_dic[key], disabled_consistent_view)'), ('pytests/xdcr/xdcrbasetests.py', 463, '_setup_cluster', 'self.cluster.async_rebalance(nodes, nodes[1:], []).result()'), ('lib/tasks/future.py', 160, 'result', 'return self.get_result()'), ('lib/tasks/future.py', 111, '_get_result', 'print traceback.extract_stack()')]
      [2014-04-03 10:08:29,797] - [xdcrbasetests:124] ERROR -
      [2014-04-03 10:08:29,797] - [xdcrbasetests:125] ERROR - Error while setting up clusters: (<class 'membase.api.exception.AddNodeException'>, AddNodeException(), <traceback object at 0x1d48f38>)
      [2014-04-03 10:08:29,798] - [xdcrbasetests:134] INFO - ============== XDCRbasetests stats for test #8 load_with_async_ops_and_joint_sets_with_warmup_master ==============
      Cluster instance shutdown with force
      [2014-04-03 10:08:29,798] - [xdcrbasetests:452] INFO - Error while cleaning broken setup.
      ERROR

      ======================================================================
      ERROR: load_with_async_ops_and_joint_sets_with_warmup_master (xdcr.biXDCR.bidirectional)
      ----------------------------------------------------------------------
      Traceback (most recent call last):
      File "pytests/xdcr/biXDCR.py", line 14, in setUp
      super(bidirectional, self).setUp()
      File "pytests/xdcr/xdcrbasetests.py", line 96, in setUp
      self._init_clusters(self._disabled_consistent_view)
      File "pytests/xdcr/xdcrbasetests.py", line 410, in _init_clusters
      self._setup_cluster(self._clusters_dic[key], disabled_consistent_view)
      File "pytests/xdcr/xdcrbasetests.py", line 463, in _setup_cluster
      self.cluster.async_rebalance(nodes, nodes[1:], []).result()
      File "lib/tasks/future.py", line 160, in result
      return self.__get_result()
      File "lib/tasks/future.py", line 112, in __get_result
      raise self._exception
      AddNodeException: Error adding node: 10.3.121.57 to the cluster:10.3.121.56 - ["Join completion call failed. Timeout connecting to \"10.3.121.57\" on port \"8091\". This could be due to an incorrect host/port combination or a firewall in place between the servers."]

      ----------------------------------------------------------------------
      Ran 1 test in 32.225s
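
      The message above lumps "incorrect host/port combination" and "firewall" together. As a rough illustration of the distinction a more specific message could draw, the sketch below probes a TCP port and reports whether the connection was accepted, actively refused, or timed out. `probe` is a hypothetical helper using only the Python standard library; it is not how ns_server performs the join completion call, and a timeout here only suggests (does not prove) that packets are being dropped, for example by a firewall.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP connection attempt to host:port.

    Returns "open", "refused", "timeout", or "error: <detail>".
    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout: packets may be dropped on the way.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or unreachable network.
        return "error: %s" % exc
```

      For instance, running `probe("10.3.121.57", 8091)` from 10.3.121.56 would separate "nothing is listening on 8091" (refused) from "no response at all" (timeout), which is the kind of detail the add-node error could surface.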

          People

            Assignee: Aliaksey Artamonau (Inactive)
            Reporter: Sangharsh Agarwal
            Votes: 0
            Watchers: 5
