Couchbase Server / MB-52499

[Magma] Rebalance exited with reason old_buckets_shutdown_wait_failed

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown

    Description

      Seeing this rebalance error in functional tests.
      Test steps:
      1) Source cluster at 1% DGM
      2) Set up XDCR replication (default bucket -> default bucket) to the target cluster
      3) Rebalance a node in to the source cluster
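
For anyone reproducing steps 2 and 3 by hand, the sketch below lists the REST requests that would roughly correspond to them. It only builds the request descriptions and sends nothing. The target host, the remote cluster reference name, the credentials, and the otpNode name of the incoming node are placeholders or assumptions, not values taken from the test; the functional test itself drives these steps through its own framework, which may behave differently.

```python
# Sketch only: describes the REST calls for the XDCR setup and rebalance-in
# steps. Nothing is sent over the network.

def build_requests(source, target, new_node, bucket="default"):
    """Return (method, url, form_fields) tuples for steps 2 and 3."""
    base = f"http://{source}:8091"
    cluster_ref = "target"  # hypothetical remote cluster reference name
    creds = {"username": "<admin>", "password": "<password>"}  # placeholders
    return [
        # Step 2a: register the target cluster as a remote cluster reference.
        ("POST", f"{base}/pools/default/remoteClusters",
         {"name": cluster_ref, "hostname": f"{target}:8091", **creds}),
        # Step 2b: create the default -> default replication.
        ("POST", f"{base}/controller/createReplication",
         {"fromBucket": bucket, "toCluster": cluster_ref,
          "toBucket": bucket, "replicationType": "continuous"}),
        # Step 3a: add the new node to the source cluster as a KV node.
        ("POST", f"{base}/controller/addNode",
         {"hostname": new_node, "user": creds["username"],
          "password": creds["password"], "services": "kv"}),
        # Step 3b: rebalance with both nodes known and none ejected.
        # The otpNode name of the new node is assumed to follow ns_1@<ip>.
        ("POST", f"{base}/controller/rebalance",
         {"knownNodes": f"ns_1@{source},ns_1@{new_node}", "ejectedNodes": ""}),
    ]


# TARGET_HOST is a placeholder; the report does not give the target address.
requests_to_send = build_requests("172.23.106.204", "TARGET_HOST", "172.23.107.1")
for method, url, form in requests_to_send:
    print(method, url, sorted(form))
```

Keeping the requests as plain data makes it easy to review them before handing them to an HTTP client of your choice.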

      Source master node config:

      [2022-06-06 22:45:10,288] - [task:159] INFO -  {'uptime': '7308', 'memoryTotal': 12294172672, 'memoryFree': 10550018048, 'mcdMemoryReserved': 9379, 'mcdMemoryAllocated': 9379, 'status': 'healthy', 'hostname': '172.23.106.204:8091', 'clusterCompatibility': 458753, 'clusterMembership': 'active', 'recoveryType': 'none', 'version': '7.1.1-3113-enterprise', 'os': 'x86_64-pc-linux-gnu', 'ports': [], 'availableStorage': [], 'storage': [<membase.api.rest_client.NodeDataStorage object at 0x7fef10201e10>], 'memoryQuota': 6283, 'memcached': 11210, 'id': 'ns_1@172.23.106.204', 'ip': '172.23.106.204', 'internal_ip': '', 'rest_username': '', 'rest_password': '', 'port': '8091', 'services': ['kv'], 'storageTotalRam': 11724, 'curr_items': 0}
      

      Bucket default size 1024 MB
      Current resident ratio: 0.1017293997965412, desired: 1 bucket default
      Doc size=1000000 bytes, Number of docs=1506
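
As a rough sanity check on the load figures above, the sketch below compares the raw document payload with the bucket size. It assumes the 1024 MB bucket size is expressed in MiB and ignores metadata, compression, and replicas, so it only indicates the order of magnitude and says nothing about how the test computes its resident ratio.

```python
# Figures copied from the test output above.
DOC_SIZE_BYTES = 1_000_000
NUM_DOCS = 1506
BUCKET_QUOTA_MB = 1024  # assumed to mean MiB

def loaded_data_mib(doc_size_bytes, num_docs):
    """Raw document payload in MiB, ignoring metadata and compression."""
    return doc_size_bytes * num_docs / (1024 * 1024)

total_mib = loaded_data_mib(DOC_SIZE_BYTES, NUM_DOCS)
print(f"raw payload: {total_mib:.1f} MiB against a {BUCKET_QUOTA_MB} MB bucket")
print("payload exceeds bucket size:", total_mib > BUCKET_QUOTA_MB)
```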

      Starting rebalance-in nodes:[ip:172.23.107.1 port:8091 ssh_username:root] at C1 cluster 172.23.106.204

      While tearing down clusters after the test, rebalance fails:

      [rebalance:error,2022-06-07T02:34:16.293-07:00,ns_1@172.23.106.204:<0.22042.31>:ns_rebalancer:do_wait_buckets_shutdown:194]Failed to wait deletion of some buckets on some nodes: [{'ns_1@172.23.106.204',
                                                               {'EXIT',
                                                                {old_buckets_shutdown_wait_failed,
                                                                 ["default"]}}}]
       
      [user:error,2022-06-07T02:34:16.294-07:00,ns_1@172.23.106.204:<0.19170.1>:ns_orchestrator:log_rebalance_completion:1428]Rebalance exited with reason {buckets_shutdown_wait_failed,
                                    [{'ns_1@172.23.106.204',
                                      {'EXIT',
                                       {old_buckets_shutdown_wait_failed,
                                        ["default"]}}}]}.
      

      Attachments

        1. test_2.zip (64.07 MB)
        2. Screenshot 2022-06-13 at 10.58.43 AM.png (45 kB)
        3. mem used.png (166 kB)
        4. domain memory usage.png (181 kB)

            People

              pavithra.mahamani Pavithra Mahamani (Inactive)