  1. Couchbase Server
  2. MB-46209

FTS rebalance-in appears to be hung

Details

    Description

      Steps to Reproduce:
      1. Create an initial cluster with 8 KV nodes and 10 FTS nodes (18 nodes in total).

      +----------------+----------+-----------------------+---------------+--------------+
      | Nodes          | Services | Version               | CPU           | Status       |
      +----------------+----------+-----------------------+---------------+--------------+
      | 172.23.105.175 | kv       | 7.0.0-5127-enterprise | 1.11375297209 | Cluster node |
      | 172.23.106.233 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.236 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.238 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.250 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.106.251 | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.121.74  | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.121.78  | ['kv']   |                       |               | <--- IN ---  |
      | 172.23.107.43  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.58  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.44  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.45  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.54  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.47  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.78  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.84  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.85  | ['fts']  |                       |               | <--- IN ---  |
      | 172.23.107.88  | ['fts']  |                       |               | <--- IN ---  |
      +----------------+----------+-----------------------+---------------+--------------+

      2. Create 15 Couchbase buckets. Each bucket has 66 scopes, each scope has 1 collection, and each collection holds 100 items (6,600 items per bucket).

      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
      | Bucket                        | Type      | Replicas | Durability | TTL | Items | RAM Quota  | RAM Used  | Disk Used |
      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
      | 0SOXi1m-59-524000             | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140282736 | 94626452  |
      | 3UIaV1M59btKMZAuKqa-59-531000 | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140497120 | 96884102  |
      | 3_wu-59-516000                | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140269200 | 92234044  |
      | 4bbj_BagLC0c-59-435000        | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140523024 | 98157575  |
      | 8RqcdNTBrSlthPcW-59-404000    | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140438736 | 94876811  |
      | Euje6l8r-59-472000            | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140507056 | 99141383  |
      | LVPOp_6A-59-491000            | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140478208 | 99772034  |
      | N6-59-462000                  | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140304688 | 91587435  |
      | OIKAR18m1TE4vnkl-59-450000    | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140315248 | 95236594  |
      | SKx8L-59-421000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140357600 | 97739345  |
      | g2YWv-59-500000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140354816 | 92845165  |
      | m36Ir0JPz2yTE5XN--59-384000   | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140372880 | 95261388  |
      | msUJk-59-538000               | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140431504 | 99952196  |
      | r-Jwfz_CSeo5MYe%B-59-481000   | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 137281232 | 98446038  |
      | x3is_DkH8-59-508000           | couchbase | 1        | none       | 0   | 6600  | 2516582400 | 140351760 | 93582118  |
      +-------------------------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
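
      As a quick consistency check on step 2, the per-bucket item count in the table follows from the stated layout. The sketch below only multiplies the numbers given in the description; the variable names are illustrative.

```python
# Layout from step 2 (values taken from the description above).
buckets = 15
scopes_per_bucket = 66
collections_per_scope = 1
items_per_collection = 100

collections_per_bucket = scopes_per_bucket * collections_per_scope
items_per_bucket = collections_per_bucket * items_per_collection
total_collections = buckets * collections_per_bucket
total_items = buckets * items_per_bucket

print(items_per_bucket)    # 6600, matching the Items column
print(total_collections)   # 990
print(total_items)         # 99000
```

      With 990 collections in the cluster, the 100 FTS indexes created in step 3 cover only a subset of them.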
      

      3. Create 100 FTS indexes, each with partitions=1, on 100 collections using the REST API.

      2021-05-09 22:53:36,375 | test  | INFO    | MainThread | [Collections:create_fts_indexes:335] Creating 100 fts indexes 

      The last REST API call returned at:

      2021-05-09 22:56:45,744 | test | INFO | MainThread | [Collections:test_volume_taf:643] 
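
      For readers reproducing step 3 outside the test framework, the following is a minimal sketch of what one single-partition, collection-scoped index definition might look like. The ticket does not show the actual payload used by the test, so everything here is an assumption: the helper name `build_fts_index_definition`, the placeholder index, bucket, scope, and collection names, and the field layout, which is believed to follow the Couchbase 7.0 FTS index definition format (`PUT /api/index/{name}` on the FTS port) and should be checked against the Search REST API documentation. The sketch only builds the request and does not send it.

```python
import json
from urllib.parse import quote


def build_fts_index_definition(index_name, bucket, scope, collection, partitions=1):
    """Return (path, body) for creating one FTS index on a single collection.

    Assumption: field names mirror the 7.0 index definition format;
    none of them are taken from this ticket.
    """
    body = {
        "type": "fulltext-index",
        "name": index_name,
        "sourceType": "gocbcore",
        "sourceName": bucket,
        # partitions=1 as stated in step 3
        "planParams": {"indexPartitions": partitions},
        "params": {
            "doc_config": {
                "mode": "scope.collection.type_field",
                "type_field": "type",
            },
            "mapping": {
                # Index only the named collection, not the default mapping.
                "default_mapping": {"enabled": False},
                "types": {
                    f"{scope}.{collection}": {"enabled": True, "dynamic": True}
                },
            },
        },
    }
    return f"/api/index/{quote(index_name, safe='')}", body


# Placeholder names, purely illustrative.
path, body = build_fts_index_definition("idx_0", "bucket_0", "scope_0", "coll_0")
print("PUT", path)
print(json.dumps(body, indent=2))
```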

      4. Perform some CRUD operations on documents.
      5. Rebalance in an FTS node (172.23.107.107). Per the log timestamps, this is about 3.5 minutes after index creation started and roughly 20 seconds after the last index-creation REST call returned.

      2021-05-09 22:57:04,427 | test  | INFO    | pool-2-thread-2 | [table_view:display:72] Rebalance Overview

      The rebalance hung at 42% for a long time.
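
      The timing between index creation and the rebalance can be worked out from the three log timestamps quoted in the steps. This sketch assumes the "Rebalance Overview" log line marks the start of the rebalance; the variable names are illustrative.

```python
from datetime import datetime

# Timestamp format used by the test log lines quoted in the steps.
FMT = "%Y-%m-%d %H:%M:%S,%f"

index_creation_started = datetime.strptime("2021-05-09 22:53:36,375", FMT)
last_index_call_returned = datetime.strptime("2021-05-09 22:56:45,744", FMT)
# Assumption: the "Rebalance Overview" line marks the rebalance start.
rebalance_started = datetime.strptime("2021-05-09 22:57:04,427", FMT)

creation_duration = (last_index_call_returned - index_creation_started).total_seconds()
gap_after_last_call = (rebalance_started - last_index_call_returned).total_seconds()
gap_after_first_call = (rebalance_started - index_creation_started).total_seconds()

print(f"index creation took:           {creation_duration:.1f} s")    # 189.4 s
print(f"rebalance after last call:     {gap_after_last_call:.1f} s")  # 18.7 s
print(f"rebalance after creation start: {gap_after_first_call:.1f} s")  # 208.1 s
```

      By these timestamps, the rebalance began under 20 seconds after the last index-creation call returned, so the indexes may have had little time to settle before the new node was added.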

      Hardware and RAM quota
      RAM quota for FTS: 22 GiB
      CPUs per node: 8
      RAM per node: ~24 GiB

      CPU usage, RAM usage, and FTS index build progress (at 100%) all look healthy,
      so it is not clear why the rebalance hung.
      Screenshots and logs are attached.
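
      "Hung at 42% for a long time" can be made checkable by recording rebalance progress over time and measuring how long the latest value has stayed unchanged. The sketch below is a hypothetical helper, not part of the test framework: `stalled_for`, `is_hung`, the 600-second threshold, and the sample data are all invented for illustration, and the samples are synthetic rather than taken from the attached logs.

```python
from typing import List, Tuple


def stalled_for(samples: List[Tuple[float, float]]) -> float:
    """Return how many seconds the most recent progress value has been unchanged.

    samples: (timestamp_seconds, progress_percent) pairs in chronological order.
    """
    if not samples:
        return 0.0
    last_time, last_progress = samples[-1]
    stall_start = last_time
    # Walk backwards while progress still equals the latest value.
    for t, p in reversed(samples):
        if p != last_progress:
            break
        stall_start = t
    return last_time - stall_start


def is_hung(samples: List[Tuple[float, float]], threshold_seconds: float = 600.0) -> bool:
    """Treat the rebalance as hung if progress has not moved for threshold_seconds."""
    return stalled_for(samples) >= threshold_seconds


# Synthetic samples: progress reaches 42% at t=120 s and never moves again.
samples = [(0, 10.0), (60, 30.0), (120, 42.0), (420, 42.0), (900, 42.0)]
print(stalled_for(samples))  # 780
print(is_hung(samples))      # True
```

      Reporting the measured stall duration alongside the percentage gives a more precise symptom than "a long time".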

       

      Attachments

        1. servers.png (444 kB)
        2. indexes.png (433 kB)
        3. Screen Shot 2021-05-10 at 2.37.04 PM.png (29 kB)
        4. task.json (11 kB)
          People

            sumedh.basarkod Sumedh Basarkod (Inactive)