  1. Couchbase Server
  2. MB-35856

Rebalance fails when resuming an intentionally failed rebalance.

Details

    • Untriaged
    • Unknown

    Description

      Steps to reproduce:
      1. Initialize a 4-node cluster and wait for the rebalance to complete:

      +---------------+----------------------------+--------------+
      | Nodes         | Services                   | Status       |
      +---------------+----------------------------+--------------+
      | 172.23.104.73 | [u'index', u'kv', u'n1ql'] | Cluster node |
      | 172.23.104.76 | None                       | <--- IN ---  |
      | 172.23.104.80 | None                       | <--- IN ---  |
      | 172.23.104.98 | None                       | <--- IN ---  |
      +---------------+----------------------------+--------------+
      

      2. Create 3 buckets with 2 replicas each, as follows:

      http://172.23.104.73:8091/pools/default/buckets with param: replicaIndex=0&maxTTL=0&flushEnabled=1&compressionMode=active&bucketType=membase&name=bucket-0&replicaNumber=2&ramQuotaMB=476&threadsNumber=3&evictionPolicy=valueOnly
      2019-09-08 12:41:14,828 | infra | INFO    | pool-7-thread-10 | [BucketOperations_Rest:create_bucket:311] http://172.23.104.73:8091/pools/default/buckets with param: replicaIndex=0&maxTTL=0&flushEnabled=1&compressionMode=active&bucketType=membase&name=bucket-1&replicaNumber=2&ramQuotaMB=476&threadsNumber=3&evictionPolicy=valueOnly
      2019-09-08 12:41:14,832 | infra | INFO    | pool-7-thread-3 | [BucketOperations_Rest:create_bucket:311] http://172.23.104.73:8091/pools/default/buckets with param: replicaIndex=0&maxTTL=0&flushEnabled=1&compressionMode=active&bucketType=membase&name=bucket-2&replicaNumber=2&ramQuotaMB=476&threadsNumber=3&evictionPolicy=valueOnly
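
The parameter strings logged above can be reproduced with a short sketch. It only builds the form-encoded bodies and does not send any request; the helper names `bucket_params` and `bucket_request_bodies` are hypothetical and are not taken from the test framework.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the log lines of step 2.
BASE_URL = "http://172.23.104.73:8091/pools/default/buckets"


def bucket_params(name):
    """Return the form parameters logged in step 2 for one bucket (hypothetical helper)."""
    return {
        "replicaIndex": 0,
        "maxTTL": 0,
        "flushEnabled": 1,
        "compressionMode": "active",
        "bucketType": "membase",
        "name": name,
        "replicaNumber": 2,
        "ramQuotaMB": 476,
        "threadsNumber": 3,
        "evictionPolicy": "valueOnly",
    }


def bucket_request_bodies(count=3):
    """Build one form-encoded body per bucket: bucket-0 .. bucket-(count-1)."""
    return [urlencode(bucket_params("bucket-%d" % i)) for i in range(count)]


if __name__ == "__main__":
    for body in bucket_request_bodies():
        print(BASE_URL, "with param:", body)
```

Keeping the parameters in one dictionary makes it easy to see that the three buckets differ only in `name`.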
      

      3. Load 100k items into each bucket with durability=PERSIST_TO_MAJORITY:

      +----------+---------+----------+-----+--------+------------+-----------+-----------+
      | Bucket   | Type    | Replicas | TTL | Items  | RAM Quota  | RAM Used  | Disk Used |
      +----------+---------+----------+-----+--------+------------+-----------+-----------+
      | bucket-0 | membase | 2        | 0   | 100000 | 1996488704 | 139323552 | 211247656 |
      | bucket-1 | membase | 2        | 0   | 100000 | 1996488704 | 138172992 | 232780333 |
      | bucket-2 | membase | 2        | 0   | 100000 | 1996488704 | 139113936 | 150520509 |
      +----------+---------+----------+-----+--------+------------+-----------+-----------+
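
The RAM Quota column is consistent with the per-node quota from step 2 (ramQuotaMB=476) multiplied by the 4 nodes of the initial cluster. That reading is an assumption inferred from the numbers, not something the report states. A quick check, with a hypothetical helper name:

```python
MIB = 1024 * 1024  # bytes per MiB


def cluster_ram_quota_bytes(per_node_quota_mb, node_count):
    """Total bucket quota in bytes, assuming the per-node quota is summed over all nodes."""
    return per_node_quota_mb * MIB * node_count


if __name__ == "__main__":
    # 476 MiB per node on a 4-node cluster
    print(cluster_ram_quota_bytes(476, 4))
```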
      

      4. Swap-rebalance the orchestrator node: swap 172.23.104.73 out for 172.23.105.163.
      5. While the rebalance from step 4 is in progress, load another 100k docs into each bucket separately.
      6. When the rebalance reaches around 60%, kill memcached on 172.23.104.73.
      7. After the failure, all nodes are still in the cluster: [(u'172.23.104.80', 8091), (u'172.23.104.73', 8091), (u'172.23.105.163', 8091), (u'172.23.104.98', 8091), (u'172.23.104.76', 8091)]
      8. Resume the rebalance; it fails again immediately.
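
For anyone reproducing steps 4 and 8 by hand, the swap can be sketched as the form body of a rebalance request. This is a sketch under assumptions, not the test's actual code: it assumes the rebalance is started (and resumed) by POSTing `knownNodes` and `ejectedNodes` to `/controller/rebalance`, and that nodes use the default `ns_1@<ip>` otpNode naming. The helper names are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def otp_node(ip):
    """Assumption: nodes use the default otpNode name of the form ns_1@<ip>."""
    return "ns_1@" + ip


def swap_rebalance_body(cluster_ips, eject_ips):
    """Build the form body for a rebalance that ejects eject_ips (hypothetical helper)."""
    return urlencode({
        "knownNodes": ",".join(otp_node(ip) for ip in cluster_ips),
        "ejectedNodes": ",".join(otp_node(ip) for ip in eject_ips),
    })


if __name__ == "__main__":
    # The five nodes listed in step 7; 172.23.104.73 is the node being swapped out.
    nodes = [
        "172.23.104.80",
        "172.23.104.73",
        "172.23.105.163",
        "172.23.104.98",
        "172.23.104.76",
    ]
    print(swap_rebalance_body(nodes, ["172.23.104.73"]))
```

Under these assumptions, resuming in step 8 would mean submitting the same body again, since the node list from step 7 is unchanged.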

      QE Note: test_5 in consoleText logs

      -t rebalance_new.swaprebalancetests.SwapRebalanceFailedTests.test_failed_swap_rebalance,nodes_init=4,replicas=2,standard_buckets=3,num-swap=1,percentage_progress=60,GROUP=P1;durability
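
The `-t` argument packs the test path and its parameters into one comma-separated string. A minimal sketch of how it splits apart follows; `parse_test_spec` is a hypothetical helper, not part of the test runner, and all values are kept as strings.

```python
def parse_test_spec(spec):
    """Split '<module.Class.test>,key=value,...' into (test_path, params)."""
    test_path, _, raw_params = spec.partition(",")
    params = {}
    for item in raw_params.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value  # values stay strings, e.g. "60" or "P1;durability"
    return test_path, params


if __name__ == "__main__":
    spec = (
        "rebalance_new.swaprebalancetests.SwapRebalanceFailedTests."
        "test_failed_swap_rebalance,nodes_init=4,replicas=2,standard_buckets=3,"
        "num-swap=1,percentage_progress=60,GROUP=P1;durability"
    )
    path, params = parse_test_spec(spec)
    print(path)
    print(params)
```

The parsed values line up with the steps above: `nodes_init=4` (step 1), `replicas=2` and `standard_buckets=3` (step 2), `num-swap=1` (step 4) and `percentage_progress=60` (step 6).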
      

            People

              ritesh.agarwal Ritesh Agarwal