Couchbase Server / MB-58308

[BP 7.2.1] [System Test on Cloud] Rebalance stuck due to Missing defnStats


Details

    • Untriaged
    • 0
    • Unknown

    Description

      A test was run with the following configuration:

      3 KV + 2 GSI + 2 Query nodes. Each GSI node had a 48-core, 96 GB configuration.
      The intention was to trigger a rebalance with close to 1 TB of index data.

      Just before the rebalance, we saw these numbers:

      on node 004
      "total_data_size": 1310867520557,
      "total_disk_size": 369187015793,
       
      on node 005
      "total_data_size": 1309988584389,
      "total_disk_size": 369943068628,
      

      The total_data_size figures come to about 1.3 TB per node. The rebalance itself took a really long time (which is expected given the amount of data), but it then appeared to be stuck at ~97% (checked by running the /rebalanceProgress call) for a long time. Unfortunately, the cluster was torn down because the test itself ran for over 3 days (cc Ritam Sharma: we need some attention on this ASAP, https://couchbasecloud.atlassian.net/browse/AV-59477).
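      As a quick check of the size figures quoted above, the byte counts can be converted to decimal terabytes and gigabytes. This is only an illustrative sketch: the summarize helper is a hypothetical name, and the data-to-disk ratio is shown purely as arithmetic on the two reported fields, not as a statement about how the indexer stores data.

```python
# Byte counts copied verbatim from the stats reported for each GSI node.
stats = {
    "node 004": {"total_data_size": 1310867520557, "total_disk_size": 369187015793},
    "node 005": {"total_data_size": 1309988584389, "total_disk_size": 369943068628},
}


def summarize(node_stats):
    """Return (data size in TB, disk size in GB, data/disk ratio), rounded.

    Uses decimal units (1 TB = 1e12 bytes, 1 GB = 1e9 bytes).
    """
    data_tb = node_stats["total_data_size"] / 1e12
    disk_gb = node_stats["total_disk_size"] / 1e9
    ratio = node_stats["total_data_size"] / node_stats["total_disk_size"]
    return round(data_tb, 2), round(disk_gb, 1), round(ratio, 2)


for node, node_stats in stats.items():
    print(node, summarize(node_stats))
```

      Both nodes work out to roughly 1.31 TB of data against roughly 370 GB on disk.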
      I could not collect the log bundles before the teardown, but I did collect one ~12 hours before the cleanup. This is a hunch, but I don't think the cluster had become healthy, because the Capella UI still showed it as "scaling" (presumably the teardown tools don't take care of the UI component). This may not turn out to be a bug, but it would be good to get some analysis done to make sure nothing is really wrong with the component, since this is the first time we've run this large a volume on cloud.
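      For future runs, the "stuck at ~97%" observation could be flagged automatically rather than noticed by hand. The sketch below is a hypothetical helper, not part of any Couchbase tooling: it assumes the caller has already gathered (timestamp, progress) samples from repeated /rebalanceProgress calls, with progress expressed as a fraction between 0 and 1, and the window and threshold values are arbitrary assumptions.

```python
def is_stalled(samples, window_s=3600.0, min_delta=0.001):
    """Return True if progress advanced by less than min_delta over window_s.

    samples: list of (timestamp_seconds, progress) tuples ordered by time,
             where progress is a fraction in [0, 1] (assumed format).
    """
    if not samples:
        return False
    latest_t, latest_p = samples[-1]
    if latest_p >= 1.0:
        return False  # finished, so not stalled
    # Baseline: the most recent sample that is at least window_s old.
    older = [(t, p) for t, p in samples if t <= latest_t - window_s]
    if not older:
        return False  # not enough history to judge yet
    _, baseline_p = older[-1]
    return (latest_p - baseline_p) < min_delta


# Illustrative samples (made up): progress reaches 0.97 and stops moving.
example = [(0, 0.50), (1800, 0.97), (3600, 0.97), (5400, 0.97)]
print(is_stalled(example))
```

      Requiring a full window of history before reporting a stall avoids false alarms right after polling starts.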

      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-d-node-001.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-d-node-002.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-d-node-003.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-d-node-008.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-i-node-004.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-i-node-005.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-i-node-009.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-q-node-006.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/VolTestAug3/collectinfo-2023-08-05T153149-ns_1%40svc-q-node-007.vtp2vfojktbbff7q.sandbox.nonprod-project-avengers.com.zip
      

      Supportal link ->

      https://supportal.couchbase.com/customer/voltestaug3
      


            People

              dhananjay.kshirsagar Dhananjay Kshirsagar