  1. Couchbase Server
  2. MB-10759

controller/rebalance failed when invoked with parameters: password=password&ejectedNodes=ns_1%4010.3.4.177%2Cns_1%4010.3.3.208&user=Administrator&knownNodes=ns_1%4010.3.121.62%2Cns_1%4010.3.4.177%2Cns_1%4010.3.3.208

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • 3.0
    • 3.0
    • None
    • Security Level: Public
    • None
    • Build 3.0-540
    • Untriaged
    • Centos 32-bit
    • Unknown

    Description

      [Jenkins]
      http://qa.hq.northscale.net/job/centos_x64--31_02--uniXDCR_SSL-P1/19/consoleFull

      [Test Logs]
      ./testrunner -i /tmp/ubuntu-64-2.0-uniXDCR.ini GROUP=CHAIN,items=50000,demand_encryption=1 -t xdcr.uniXDCR.unidirectional.load_with_async_ops,items=100000,rdirection=unidirection,ctopology=chain,doc-ops=delete-delete,GROUP=CHAIN;P1

      Test Input params:

      {'doc-ops': 'delete-delete', 'GROUP': 'CHAIN', 'demand_encryption': '1', 'items': '50000', 'conf_file': 'conf/py-xdcr-unidirectional.conf', 'num_nodes': 8, 'cluster_name': 'ubuntu-64-2.0-uniXDCR', 'ctopology': 'chain', 'rdirection': 'unidirection', 'ini': '/tmp/ubuntu-64-2.0-uniXDCR.ini', 'case_number': 2, 'spec': 'py-xdcr-unidirectional'}

      [2014-04-03 13:10:13,147] - [xdcrbasetests:216] INFO - Initializing input parameters started...
      [2014-04-03 13:10:13,148] - [xdcrbasetests:1031] INFO - Setting xdcrFailureRestartInterval to 1 ..
      [2014-04-03 13:10:13,159] - [rest_client:1602] INFO - Update internal setting xdcrFailureRestartInterval=1
      [2014-04-03 13:10:13,160] - [xdcrbasetests:323] INFO - Initializing input parameters completed.
      [2014-04-03 13:10:13,162] - [xdcrbasetests:92] INFO - ============== XDCRbasetests setup was started for test #2 load_with_async_ops==============
      [2014-04-03 13:10:13,194] - [xdcrbasetests:434] INFO - cleanup cluster1: [ip:10.3.121.65 port:8091 ssh_username:root, ip:10.3.3.207 port:8091 ssh_username:root, ip:10.3.3.209 port:8091 ssh_username:root, ip:10.3.3.210 port:8091 ssh_username:root]
      [2014-04-03 13:10:13,239] - [bucket_helper:137] INFO - deleting existing buckets [u'default', u'sasl_bucket_1'] on 10.3.121.65
      [2014-04-03 13:10:13,240] - [bucket_helper:139] INFO - remove bucket default ...
      [2014-04-03 13:10:13,722] - [bucket_helper:153] INFO - deleted bucket : default from 10.3.121.65
      [2014-04-03 13:10:13,723] - [bucket_helper:226] INFO - waiting for bucket deletion to complete....
      [2014-04-03 13:10:13,739] - [rest_client:109] INFO - existing buckets : [u'sasl_bucket_1']
      [2014-04-03 13:10:13,740] - [bucket_helper:139] INFO - remove bucket sasl_bucket_1 ...
      [2014-04-03 13:10:14,901] - [bucket_helper:153] INFO - deleted bucket : sasl_bucket_1 from 10.3.121.65
      [2014-04-03 13:10:14,902] - [bucket_helper:226] INFO - waiting for bucket deletion to complete....
      [2014-04-03 13:10:14,908] - [rest_client:109] INFO - existing buckets : []
      [2014-04-03 13:10:14,939] - [cluster_helper:253] INFO - rebalancing all nodes in order to remove nodes
      [2014-04-03 13:10:14,943] - [rest_client:986] INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.3.3.210%2Cns_1%4010.3.3.209%2Cns_1%4010.3.3.207&user=Administrator&knownNodes=ns_1%4010.3.121.65%2Cns_1%4010.3.3.210%2Cns_1%4010.3.3.209%2Cns_1%4010.3.3.207
      [2014-04-03 13:10:14,950] - [rest_client:990] INFO - rebalance operation started
      [2014-04-03 13:10:14,954] - [rest_client:1091] INFO - rebalance percentage : 0 %
      [2014-04-03 13:10:24,971] - [rest_client:1047] INFO - rebalance progress took 10.0207719803 seconds
      [2014-04-03 13:10:24,972] - [rest_client:1048] INFO - sleep for 10 seconds after rebalance...
      [2014-04-03 13:10:35,005] - [cluster_helper:288] INFO - removed all the nodes from cluster associated with ip:10.3.121.65 port:8091 ssh_username:root ? [(u'ns_1@10.3.3.210', 8091), (u'ns_1@10.3.3.209', 8091), (u'ns_1@10.3.3.207', 8091)]
      [2014-04-03 13:10:35,012] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.121.65:8091
      [2014-04-03 13:10:35,018] - [cluster_helper:80] INFO - ns_server @ 10.3.121.65:8091 is running
      [2014-04-03 13:10:35,027] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.3.207
      [2014-04-03 13:10:35,052] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.3.207:8091
      [2014-04-03 13:10:35,058] - [cluster_helper:80] INFO - ns_server @ 10.3.3.207:8091 is running
      [2014-04-03 13:10:35,070] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.3.209
      [2014-04-03 13:10:35,091] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.3.209:8091
      [2014-04-03 13:10:35,096] - [cluster_helper:80] INFO - ns_server @ 10.3.3.209:8091 is running
      [2014-04-03 13:10:35,105] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.3.210
      [2014-04-03 13:10:35,127] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.3.210:8091
      [2014-04-03 13:10:35,132] - [cluster_helper:80] INFO - ns_server @ 10.3.3.210:8091 is running
      [2014-04-03 13:10:35,132] - [xdcrbasetests:434] INFO - cleanup cluster2: [ip:10.3.4.177 port:8091 ssh_username:root, ip:10.3.3.208 port:8091 ssh_username:root, ip:10.3.121.62 port:8091 ssh_username:root, ip:10.3.2.204 port:8091 ssh_username:root]
      [2014-04-03 13:10:35,149] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.4.177
      [2014-04-03 13:10:35,169] - [cluster_helper:253] INFO - rebalancing all nodes in order to remove nodes
      [2014-04-03 13:10:35,172] - [rest_client:986] INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.3.3.208&user=Administrator&knownNodes=ns_1%4010.3.4.177%2Cns_1%4010.3.3.208
      [2014-04-03 13:10:35,181] - [rest_client:990] INFO - rebalance operation started
      [2014-04-03 13:10:43,351] - [rest_client:1047] INFO - rebalance progress took 8.16891503334 seconds
      [2014-04-03 13:10:43,352] - [rest_client:1048] INFO - sleep for 8.16891503334 seconds after rebalance...
      [2014-04-03 13:10:51,540] - [cluster_helper:288] INFO - removed all the nodes from cluster associated with ip:10.3.4.177 port:8091 ssh_username:root ? [(u'ns_1@10.3.3.208', 8091)]
      [2014-04-03 13:10:51,546] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.4.177:8091
      [2014-04-03 13:10:51,551] - [cluster_helper:80] INFO - ns_server @ 10.3.4.177:8091 is running
      [2014-04-03 13:10:51,566] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.3.208
      [2014-04-03 13:10:51,585] - [cluster_helper:78] INFO - waiting for ns_server @ 10.3.3.208:8091
      [2014-04-03 13:10:51,590] - [cluster_helper:80] INFO - ns_server @ 10.3.3.208:8091 is running
      [2014-04-03 13:10:51,601] - [bucket_helper:137] INFO - deleting existing buckets [] on 10.3.121.62
      [2014-04-03 13:10:51,623] - [cluster_helper:253] INFO - rebalancing all nodes in order to remove nodes
      [2014-04-03 13:10:51,632] - [rest_client:986] INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.3.4.177%2Cns_1%4010.3.3.208&user=Administrator&knownNodes=ns_1%4010.3.121.62%2Cns_1%4010.3.4.177%2Cns_1%4010.3.3.208
      [2014-04-03 13:10:51,638] - [rest_client:702] ERROR - http://10.3.121.62:8091/controller/rebalance error 400 reason: unknown

      {"mismatch":1}

      [2014-04-03 13:10:51,639] - [rest_client:992] ERROR - rebalance operation failed:

      {"mismatch":1}

      [2014-04-03 13:10:51,639] - [xdcrbasetests:124] ERROR -
      [2014-04-03 13:10:51,639] - [xdcrbasetests:125] ERROR - Error while setting up clusters: (<class 'membase.api.exception.InvalidArgumentException'>, InvalidArgumentException(), <traceback object at 0x3283128>)
      [2014-04-03 13:10:51,639] - [xdcrbasetests:134] INFO - ============== XDCRbasetests stats for test #2 load_with_async_ops ==============
      Cluster instance shutdown with force
      [2014-04-03 13:10:51,640] - [xdcrbasetests:452] INFO - Error while cleaning broken setup.
      ERROR

      ======================================================================
      ERROR: load_with_async_ops (xdcr.uniXDCR.unidirectional)
      ----------------------------------------------------------------------
      Traceback (most recent call last):
      File "pytests/xdcr/uniXDCR.py", line 15, in setUp
      super(unidirectional, self).setUp()
      File "pytests/xdcr/xdcrbasetests.py", line 94, in setUp
      self._cleanup_previous_setup()
      File "pytests/xdcr/xdcrbasetests.py", line 213, in _cleanup_previous_setup
      self._do_cleanup()
      File "pytests/xdcr/xdcrbasetests.py", line 445, in _do_cleanup
      ClusterOperationHelper.cleanup_cluster([node], self)
      File "lib/membase/helper/cluster_helper.py", line 258, in cleanup_cluster
      wait_for_rebalance=wait_for_rebalance)
      File "lib/membase/api/rest_client.py", line 87, in remove_nodes
      self.rest.rebalance(knownNodes, ejectedNodes)
      File "lib/membase/api/rest_client.py", line 995, in rebalance
      parameters=params)
      InvalidArgumentException: controller/rebalance failed when invoked with parameters: password=password&ejectedNodes=ns_1%4010.3.4.177%2Cns_1%4010.3.3.208&user=Administrator&knownNodes=ns_1%4010.3.121.62%2Cns_1%4010.3.4.177%2Cns_1%4010.3.3.208

      ----------------------------------------------------------------------
      Ran 1 test in 38.496s
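
      [Note on the failing request]
      The 400 response with {"mismatch":1} is usually what ns_server returns when the knownNodes list in the request does not match the node list held by the node that receives it. That reading is an assumption; the logs above do not confirm it. The sketch below decodes the exact parameter string from the failing call and compares it with a node list supplied by the caller. The helper names and the `actual` list are hypothetical and are not part of testrunner; in practice `actual` could be the otpNode values reported by the target node.

```python
from urllib.parse import parse_qs


def parse_rebalance_params(params):
    """Decode a URL-encoded rebalance parameter string into node lists."""
    fields = parse_qs(params)
    known = fields.get("knownNodes", [""])[0].split(",")
    ejected = fields.get("ejectedNodes", [""])[0].split(",")
    # Drop empty entries that appear when a field is missing or blank.
    return [n for n in known if n], [n for n in ejected if n]


def diff_known_nodes(known, actual):
    """Return (stale, missing): nodes only in the request, nodes only on the server."""
    known_set, actual_set = set(known), set(actual)
    return sorted(known_set - actual_set), sorted(actual_set - known_set)


# Parameter string copied from the failing call against 10.3.121.62.
params = (
    "password=password"
    "&ejectedNodes=ns_1%4010.3.4.177%2Cns_1%4010.3.3.208"
    "&user=Administrator"
    "&knownNodes=ns_1%4010.3.121.62%2Cns_1%4010.3.4.177%2Cns_1%4010.3.3.208"
)
known, ejected = parse_rebalance_params(params)

# Hypothetical server-side view: assume the node already stands alone.
actual = ["ns_1@10.3.121.62"]
stale, missing = diff_known_nodes(known, actual)
print("known:", known)
print("ejected:", ejected)
print("stale:", stale, "missing:", missing)
```

      If either returned list is non-empty, the request and the server disagree about cluster membership, which would be consistent with the mismatch error. Re-reading the node list immediately before calling controller/rebalance is one way to narrow the window for such a disagreement.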

      [Logs]
      10.3.121.62 : https://s3.amazonaws.com/bugdb/jira/MB-10759/f23be5af/10.3.121.62-442014-027-diag.zip
      10.3.121.65 : https://s3.amazonaws.com/bugdb/jira/MB-10759/57bb9c32/10.3.121.65-442014-023-diag.zip
      10.3.2.204 : https://s3.amazonaws.com/bugdb/jira/MB-10759/8662b9ca/10.3.2.204-442014-029-diag.zip
      10.3.3.207 : https://s3.amazonaws.com/bugdb/jira/MB-10759/a033c7ab/10.3.3.207-442014-025-diag.zip
      10.3.3.208 : https://s3.amazonaws.com/bugdb/jira/MB-10759/5db88058/10.3.3.208-442014-028-diag.zip
      10.3.3.209 : https://s3.amazonaws.com/bugdb/jira/MB-10759/397066d0/10.3.3.209-442014-024-diag.zip
      10.3.3.210 : https://s3.amazonaws.com/bugdb/jira/MB-10759/3ae8cf06/10.3.3.210-442014-027-diag.zip
      10.3.4.177 : https://s3.amazonaws.com/bugdb/jira/MB-10759/87dfe55d/10.3.4.177-442014-026-diag.zip

          People

            sangharsh Sangharsh Agarwal
            sangharsh Sangharsh Agarwal