Couchbase Server / MB-59385

Swap rebalance stuck post shard corruption and recovery

Details

    • Untriaged
    • 0
    • Unknown

    Description

      Test steps:

      1. Create a 4-node cluster (1 KV + 3 GSI/Query).
      2. Create buckets, scopes, and collections, then create indexes on these buckets.
      3. Induce corruption by injecting an error into one of the shards (the shard chosen in this case is shard7188432582549650222).
      4. Trigger a swap rebalance that brings in 2 Query nodes and removes 2 GSI/Query nodes.
      5. The rebalance fails because of the corruption in the shard, as expected.
      6. Restart the indexer and re-trigger the rebalance.
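
      For anyone reproducing the last step by hand, the request body for re-triggering the rebalance can be sketched as below. This is only a sketch under assumptions: the report does not say how the rebalance was triggered, so the use of the cluster manager's `POST /controller/rebalance` call with its `knownNodes` and `ejectedNodes` form fields is an assumption, the helper name `rebalance_form` is hypothetical, and the nodes to eject must be supplied by the caller because the report does not name them.

```python
from typing import Iterable
from urllib.parse import urlencode


def rebalance_form(known_nodes: Iterable[str], ejected_nodes: Iterable[str]) -> str:
    """Build a form-encoded body with knownNodes and ejectedNodes.

    Hypothetical helper: it only builds the body and sends nothing.
    knownNodes lists every node in the cluster, including the ones being
    removed; ejectedNodes lists only the nodes being removed.
    """
    known = list(known_nodes)
    ejected = list(ejected_nodes)

    # An ejected node that is not a known node is almost certainly a typo.
    missing = [node for node in ejected if node not in known]
    if missing:
        raise ValueError(f"ejected nodes not in known nodes: {missing}")

    return urlencode({
        "knownNodes": ",".join(known),
        "ejectedNodes": ",".join(ejected),
    })
```

      The body would then be posted to the cluster manager with whatever HTTP client and credentials the test framework already uses.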

      The re-triggered rebalance appears to be stuck. Progress snapshot:

      {
          "status": "running",
          "ns_1@10.113.223.101": {
              "progress": 1
          },
          "ns_1@10.113.223.102": {
              "progress": 0.3411190476190476
          },
          "ns_1@10.113.223.103": {
              "progress": 0.3411190476190476
          },
          "ns_1@10.113.223.104": {
              "progress": 0.3411190476190476
          },
          "ns_1@10.113.223.105": {
              "progress": 0.6822380952380952
          },
          "ns_1@10.113.223.106": {
              "progress": 0.6822380952380952
          }
      }
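
      One way to turn "seems to be stuck" into a checkable condition is to compare several consecutive progress snapshots of the shape shown above. The following is a minimal sketch, assuming the snapshots are collected at a fixed interval (for example from the cluster manager's `/pools/default/rebalanceProgress` endpoint, which is an assumption about where the snapshot above came from). The function names and the `min_samples` threshold are hypothetical.

```python
from typing import Dict, List


def node_progress(snapshot: dict) -> Dict[str, float]:
    """Extract the per-node progress values from one snapshot."""
    return {
        node: info["progress"]
        for node, info in snapshot.items()
        if isinstance(info, dict) and "progress" in info
    }


def is_stalled(snapshots: List[dict], min_samples: int = 3) -> bool:
    """Return True if the last `min_samples` snapshots all report
    status "running" with identical per-node progress."""
    if len(snapshots) < min_samples:
        return False  # not enough evidence yet

    recent = snapshots[-min_samples:]
    if any(s.get("status") != "running" for s in recent):
        return False  # rebalance finished or was never running

    baseline = node_progress(recent[0])
    return all(node_progress(s) == baseline for s in recent[1:])
```

      With a sampling interval of a few minutes, three identical snapshots would support the "stuck" observation more strongly than a single reading does.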
      

      s3://cb-customers-secure/rebalancestuckpostplasmacorruption/2023-10-31/rebalance_stuck.zip

          People

            pavan.pb Pavan PB