Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-46209

FTS rebalance-in appears to be hung

    XMLWordPrintable

Details

    Description

      Steps to Reproduce:
      1. Create a init cluster with 8 kv nodes and 10 fts nodes (total of 18 node init cluster)

      +----------------+----------+-----------------------+---------------+--------------+
      | Nodes          | Services | Version               | CPU           | Status       |
      +----------------+----------+-----------------------+---------------+--------------+
      | 172.23.105.175 | kv       | 7.0.0-5127-enterprise | 1.11375297209 | Cluster node |
      | 172.23.106.233 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.236 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.238 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.250 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.251 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.121.74  | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.121.78  | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.107.43  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.58  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.44  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.45  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.54  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.47  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.78  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.84  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.85  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.88  | ['fts']  |                       |               | <--- IN ---  |
      +----------------+----------+-----------------------+---------------+--------------+

      2. Create 15 Couchbase buckets, each bucket has 66 scopes, each scope has 1 collections. Each collection has 100 items

      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
      | Bucket                        | Type      | Replicas | Durability | TTL | Items | RAM Quota  | RAM Used  | Disk Used |
      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
      | 0SOXi1m-59-524000             | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140282736 | 94626452  |
      | 3UIaV1M59btKMZAuKqa-59-531000 | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140497120 | 96884102  |
      | 3_wu-59-516000                | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140269200 | 92234044  |
      | 4bbj_BagLC0c-59-435000        | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140523024 | 98157575  |
      | 8RqcdNTBrSlthPcW-59-404000    | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140438736 | 94876811  |
      | Euje6l8r-59-472000            | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140507056 | 99141383  |
      | LVPOp_6A-59-491000            | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140478208 | 99772034  |
      | N6-59-462000                  | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140304688 | 91587435  |
      | OIKAR18m1TE4vnkl-59-450000    | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140315248 | 95236594  |
      | SKx8L-59-421000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140357600 | 97739345  |
      | g2YWv-59-500000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140354816 | 92845165  |
      | m36Ir0JPz2yTE5XN--59-384000   | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140372880 | 95261388  |
      | msUJk-59-538000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140431504 | 99952196  |
      | r-Jwfz_CSeo5MYe%B-59-481000   | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 137281232 | 98446038  |
      | x3is_DkH8-59-508000           | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140351760 | 93582118  |
      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
      

      3. Create 100 fts indexes each with partitons=1 on 100 collections using rest api

      2021-05-09 22:53:36,375 | test  | INFO    | MainThread | [Collections:create_fts_indexes:335] Creating 100 fts indexes 

      the last Rest api calls return at 

      2021-05-09 22:56:45,744 | test | INFO | MainThread | [Collections:test_volume_taf:643] 

      4. Do some crud operations on documents
      5. Rebalance-in a fts node (172.23.107.107) (so almost 4 minutes from the last rest call to create fts index)

      2021-05-09 22:57:04,427 | test  | INFO    | pool-2-thread-2 | [table_view:display:72] Rebalance Overview

      This rebalance got hung at 42% for a long time .

      Hardware and RAM quota
      RAM quota for FTS: 22 GiB
      Each node CPUs: 8
      Each node RAM: ~24GiB

      CPU, RAM usage, FTS indexes building progress (is at 100%) - all look healthy 
      So not sure of why the rebalance got hung
      Screenshots and logs attached

       

      Attachments

        1. indexes.png
          indexes.png
          433 kB
        2. Screen Shot 2021-05-10 at 2.37.04 PM.png
          Screen Shot 2021-05-10 at 2.37.04 PM.png
          29 kB
        3. servers.png
          servers.png
          444 kB
        4. task.json
          11 kB
        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            sumedh.basarkod Sumedh Basarkod (Inactive)
            sumedh.basarkod Sumedh Basarkod (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty