  1. Couchbase Server
  2. MB-51196

[System test upgrade] - Index build stuck during rebalance due to large number of pending items

Details

    • Untriaged
    • 1
    • Unknown
    • KV 2022-Feb

    Description

      In the system upgrade tests, index rebalance is stuck. The rebalance appears to be stuck because of a huge number of pending documents.

      For example, the instance idx4_8C6JX is stuck in the Moving state:

       

      "name": "idx4_8C6JX",
         "bucket": "bucket6",
         "scope": "scope_0",
         "collection": "coll_2",
         "secExprs": [
          "`price`",
          "`city`",
          "`name`"
         ],
         "indexType": "plasma",
         "status": "Moving",
         "definition": "CREATE INDEX `idx4_8C6JX` ON `bucket6`.`scope_0`.`coll_2`(`price`,`city`,`name`) PARTITION BY hash((meta().`id`)) WITH { \"nodes\":[ \"172.23.120.77:18091\",\"172.23.123.26:18091\",\"172.23.123.33:18091\",\"172.23.97.105:18091\",\"172.23.97.148:18091\",\"172.23.97.149:18091\" ], \"num_replica\":2, \"num_partition\":5 }",
         "hosts": [
          "172.23.120.77:18091",
          "172.23.123.33:18091",
          "172.23.97.105:18091",
          "172.23.97.148:18091"
         ],
         "completion": 100,
         "progress": 100,
         "scheduled": false,
      

      Taking node 172.23.120.77 as an example, the indexer has actually moved the index instance to the ACTIVE state, but because of the huge number of pending documents, the rebalance of this index is not considered done:

      2022-02-24T01:23:37.682-08:00 [Info] Rebalancer::waitForIndexBuild Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102
      2022-02-24T01:23:40.698-08:00 [Info] Rebalancer::waitForIndexBuild Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102
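
      The pattern in these log lines (state already ACTIVE, pending count not dropping between samples) can be checked mechanically when scanning indexer logs. The following is a minimal Python sketch, not part of the indexer. The field layout is taken from the two lines quoted above. The function names and the "stalled" heuristic (at least two ACTIVE samples whose pending count has not decreased) are assumptions made for illustration; the indexer's actual completion criterion is not stated in this ticket.

```python
import re

# Matches the Rebalancer::waitForIndexBuild lines quoted in this ticket.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[Info\] Rebalancer::waitForIndexBuild "
    r"Index: (?P<index>\S+) State: (?P<state>\S+) "
    r"Pending: (?P<pending>\S+) EstTime: (?P<est>\S+)"
)

# The two log lines from the description, used as sample input.
SAMPLE = [
    "2022-02-24T01:23:37.682-08:00 [Info] Rebalancer::waitForIndexBuild "
    "Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE "
    "Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102",
    "2022-02-24T01:23:40.698-08:00 [Info] Rebalancer::waitForIndexBuild "
    "Index: bucket6:scope_0:coll_2:idx4_8C6JX State: INDEX_STATE_ACTIVE "
    "Pending: 2.7228046e+07 EstTime: 79 Partitions: [5] Destination: 127.0.0.1:9102",
]


def parse_wait_line(line):
    """Parse one waitForIndexBuild line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": m.group("ts"),
        "index": m.group("index"),
        "state": m.group("state"),
        "pending": float(m.group("pending")),  # logged in scientific notation
        "est_time": float(m.group("est")),
    }


def stalled_indexes(lines):
    """Return names of ACTIVE indexes whose pending count has not decreased.

    Hypothetical heuristic: needs at least two ACTIVE samples per index.
    """
    samples = {}
    for line in lines:
        rec = parse_wait_line(line)
        if rec is None or rec["state"] != "INDEX_STATE_ACTIVE":
            continue
        samples.setdefault(rec["index"], []).append(rec["pending"])
    return sorted(
        name
        for name, pending in samples.items()
        if len(pending) >= 2 and pending[-1] > 0 and pending[-1] >= pending[0]
    )


if __name__ == "__main__":
    for name in stalled_indexes(SAMPLE):
        print("ACTIVE but pending not draining:", name)
```

      Run against the two quoted lines, the sketch reports bucket6:scope_0:coll_2:idx4_8C6JX, because both samples show the same pending count of about 27.2 million items.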
      

      Logs:
      Supportal
      http://supportal.couchbase.com/snapshot/a791810e6f6b73c2457c225d9e24a115::7

      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.134.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.137.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.106.138.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.58.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.73.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.74.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.75.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.77.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.81.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.120.86.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.121.118.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.121.77.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.25.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.26.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.32.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.123.33.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.122.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.14.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.243.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.48.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.105.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.110.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.112.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.148.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.149.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.150.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.97.74.zip
      https://cb-engineering.s3.amazonaws.com/MB-51159/collectinfo-2022-02-24T092200-ns_1%40172.23.96.254.zip


          Activity

            varun.velamuri Varun Velamuri added a comment - Resolving this issue as the fix is merged to master. This MB has to be added to the release notes, and the fix has to be backported to all active releases.

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-2442 contains indexing commit 18db0d2 with commit message: MB-51196 Prevent connection object leak by deleting closed connections

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.2.0-1009 contains indexing commit 18db0d2 with commit message: MB-51196 Prevent connection object leak by deleting closed connections
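
            The commit message describes the fix as deleting closed connections so that connection objects do not leak. The actual change (indexing commit 18db0d2) is not shown in this ticket, so the following Python sketch only illustrates the general pattern. The class and method names (Conn, ConnRegistry) are hypothetical and do not correspond to the indexer's code.

```python
class Conn:
    """Hypothetical stand-in for a connection object."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.closed = False

    def close(self):
        self.closed = True


class ConnRegistry:
    """Tracks open connections by id (illustrative only)."""

    def __init__(self):
        self._conns = {}

    def add(self, conn):
        self._conns[conn.conn_id] = conn

    def close(self, conn_id):
        # Remove the entry as part of closing. If the entry were left in the
        # map, the closed object would stay referenced and accumulate.
        conn = self._conns.pop(conn_id, None)
        if conn is not None:
            conn.close()
        return conn

    def __len__(self):
        return len(self._conns)


if __name__ == "__main__":
    registry = ConnRegistry()
    for i in range(3):
        registry.add(Conn(i))
    registry.close(0)
    registry.close(1)
    print("tracked connections:", len(registry))
```

            The design point is that closing and untracking happen in one step, so the registry size follows the number of live connections rather than the number ever opened.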

            Balakumaran.Gopal Balakumaran Gopal added a comment - This bug can be closed once we complete the next cycle of system test upgrade.

            Balakumaran.Gopal Balakumaran Gopal added a comment - After completing the upgrade from 7.0.3-7031 to 7.1.0-2491, we ran rebalances for 7 days with a special focus on indexing rebalance. All rebalances completed successfully. Marking this bug closed.

            People

              Balakumaran.Gopal Balakumaran Gopal
              varun.velamuri Varun Velamuri