Couchbase Server / MB-51682

[BP to 7.0.4] - [System test upgrade] - Index build stuck during rebalance due to large number of pending items


Details

    • Untriaged
    • 1
    • Unknown

    Description

      In the system upgrade tests, the index rebalance is stuck. The rebalance appears to be stuck because of a huge number of pending documents.

      E.g., the instance idx4_8C6JX is stuck in the Moving state:

       

      "name": "idx4_8C6JX",
         "bucket": "bucket6",
         "scope": "scope_0",
         "collection": "coll_2",
         "secExprs": [
          "`price`",
          "`city`",
          "`name`"
         ],
         "indexType": "plasma",
         "status": "Moving",
         "definition": "CREATE INDEX `idx4_8C6JX` ON `bucket6`.`scope_0`.`coll_2`(`price`,`city`,`name`) PARTITION BY hash((meta().`id`)) WITH { \"nodes\":[ \"172.23.120.77:18091\",\"172.23.123.26:18091\",\"172.23.123.33:18091\",\"172.23.97.105:18091\",\"172.23.97.148:18091\",\"172.23.97.149:18091\" ], \"num_replica\":2, \"num_partition\":5 }",
         "hosts": [
          "172.23.120.77:18091",
          "172.23.123.33:18091",
          "172.23.97.105:18091",
          "172.23.97.148:18091"
         ],
         "completion": 100,
         "progress": 100,
         "scheduled": false,
      

      Taking node 120.77 as an example, the indexer has actually moved the index instance to the ACTIVE state, but because of the huge number of pending documents, the rebalance of this index is not considered done:

      2022-02-24T01:23:37.682-08:00 [Info] Rebalancer::waitForIndexBuild Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102
      2022-02-24T01:23:40.698-08:00 [Info] Rebalancer::waitForIndexBuild Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102
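
      To track whether the pending count is moving between successive waitForIndexBuild log lines, the relevant fields can be pulled out of indexer.log with a small parser. The following is a minimal sketch, not part of the indexer: the names waitStatus, waitLineRe, and parseWaitLine are hypothetical, and the regular expression assumes the line layout seen in the two log lines above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// waitStatus holds the fields of interest from one
// Rebalancer::waitForIndexBuild log line (hypothetical type).
type waitStatus struct {
	Index   string
	State   string
	Pending float64 // logged in float notation, e.g. 2.7228046e+07
	EstTime int
}

// waitLineRe assumes the field order seen in the log lines quoted in this ticket.
var waitLineRe = regexp.MustCompile(
	`Rebalancer::waitForIndexBuild Index: (\S+) State: (\S+) Pending: (\S+) EstTime: (\d+)`)

// parseWaitLine returns the parsed fields and true when the line matches,
// or a zero value and false otherwise.
func parseWaitLine(line string) (waitStatus, bool) {
	m := waitLineRe.FindStringSubmatch(line)
	if m == nil {
		return waitStatus{}, false
	}
	pending, err := strconv.ParseFloat(m[3], 64)
	if err != nil {
		return waitStatus{}, false
	}
	est, err := strconv.Atoi(m[4])
	if err != nil {
		return waitStatus{}, false
	}
	return waitStatus{Index: m[1], State: m[2], Pending: pending, EstTime: est}, true
}

func main() {
	line := `2022-02-24T01:23:37.682-08:00 [Info] Rebalancer::waitForIndexBuild Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102`
	if s, ok := parseWaitLine(line); ok {
		// Print the pending count as a whole number for easier comparison.
		fmt.Printf("%s state=%s pending=%.0f est=%d\n", s.Index, s.State, s.Pending, s.EstTime)
	}
}
```

      Running the parser over consecutive lines for the same index and comparing the Pending values shows whether the backlog is draining; in the two lines above the value is unchanged at 2.7228046e+07.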
      

      Logs:
      Supportal
      http://supportal.couchbase.com/snapshot/a791810e6f6b73c2457c225d9e24a115::7

      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.134.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.137.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.138.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.58.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.73.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.74.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.75.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.77.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.81.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.86.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.121.118.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.121.77.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.25.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.26.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.32.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.33.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.122.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.14.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.243.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.48.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.105.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.110.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.112.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.148.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.149.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.150.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.74.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.254.zip

      Attachments

        1. before.pdf
          129 kB
        2. file_after_scan.pdf
          138 kB
        3. new.pdf
          101 kB
        4. projector_mem.pprof
          38 kB
        5. script.go
          0.3 kB


            People

              hemant.rajput Hemant Rajput
              varun.velamuri Varun Velamuri