  1. Couchbase Server
  2. MB-45877

[windows] Rebalance failures with error buckets_shutdown_wait_failed

Details

    • Untriaged
    • 1
    • No

    Description

      7.0.0-4960

      6 GB RAM and 6 core boxes

      Test:
      ./testrunner -i /tmp/win10-gsi.ini -p get-cbcollect-info=True -t clitest.collectinfotest.CollectinfoTests.collectinfo_test,sasl_buckets=1,standard_buckets=1,GROUP=P0

      [2021-04-22 19:32:07,487] - [rest_client:1873] ERROR - {'status': 'none', 'errorMessage': 'Rebalance failed. See logs for detailed reason. You can try again.'} - rebalance failed
      [2021-04-22 19:32:10,762] - [rest_client:3804] INFO - Latest logs from UI on 172.23.106.249:
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'critical', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1619145120056, 'shortText': 'message', 'text': 'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@172.23.106.249\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["standard_bucket0"]}}}]}.\nRebalance Operation Id = 9bc9227c38dc7da499ddf0916205a12e', 'serverTime': '2021-04-22T19:32:00.056Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'critical', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1619145120054, 'shortText': 'message', 'text': 'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@172.23.106.249\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["standard_bucket0"]}}}]\n', 'serverTime': '2021-04-22T19:32:00.054Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1619145060052, 'shortText': 'message', 'text': "Starting rebalance, KeepNodes = ['ns_1@172.23.106.249'], EjectNodes = ['ns_1@172.23.136.127',\n 'ns_1@172.23.136.129',\n 'ns_1@172.23.136.252',\n 'ns_1@172.23.136.253'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 9bc9227c38dc7da499ddf0916205a12e", 'serverTime': '2021-04-22T19:31:00.052Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'warning', 'code': 102, 'module': 'menelaus_web', 'tstamp': 1619145060049, 'shortText': 'client-side error report', 'text': 'Client-side error-report for user "Administrator" on node \'ns_1@172.23.106.249\':\nUser-Agent:Python-httplib2/0.13.1 (gzip)\nStarting rebalance from test, ejected nodes [\'ns_1@172.23.136.127\', \'ns_1@172.23.136.129\', \'ns_1@172.23.136.252\', \'ns_1@172.23.136.253\']', 'serverTime': '2021-04-22T19:31:00.049Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1619145024855, 'shortText': 'message', 'text': 'Shutting down bucket "standard_bucket0" on \'ns_1@172.23.106.249\' for deletion', 'serverTime': '2021-04-22T19:30:24.855Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 11, 'module': 'menelaus_web', 'tstamp': 1619145018262, 'shortText': 'message', 'text': 'Deleted bucket "default"\n', 'serverTime': '2021-04-22T19:30:18.262Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'auto_failover', 'tstamp': 1619144999705, 'shortText': 'message', 'text': 'Enabled auto-failover with timeout 120 and max count 1 (repeated 1 times, last seen 13.904235 secs ago)', 'serverTime': '2021-04-22T19:29:59.705Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1619144989195, 'shortText': 'message', 'text': 'Shutting down bucket "default" on \'ns_1@172.23.106.249\' for deletion', 'serverTime': '2021-04-22T19:29:49.195Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'auto_failover', 'tstamp': 1619144982686, 'shortText': 'message', 'text': 'Enabled auto-failover with timeout 120 and max count 1', 'serverTime': '2021-04-22T19:29:42.686Z'}
      [2021-04-22 19:32:10,763] - [rest_client:3805] ERROR - {'node': 'ns_1@172.23.106.249', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1619144940096, 'shortText': 'message', 'text': 'Shutting down bucket "bucket0" on \'ns_1@172.23.106.249\' for deletion', 'serverTime': '2021-04-22T19:29:00.096Z'}

      Cluster instance shutdown with force
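
The UI log entries above are listed newest first, so the order of events is easy to misread. As a rough aid, the sketch below takes the 'tstamp' values (epoch milliseconds, copied from the entries above) and computes the gaps between the key events. The names `EVENTS` and `elapsed_ms` are hypothetical helpers for illustration only; they are not part of testrunner or Couchbase Server.

```python
# Epoch-millisecond 'tstamp' values copied verbatim from the UI log entries.
# The dictionary keys are informal labels chosen for this sketch.
EVENTS = {
    "shutdown bucket0": 1619144940096,
    "shutdown default": 1619144989195,
    "deleted default": 1619145018262,
    "shutdown standard_bucket0": 1619145024855,
    "rebalance start": 1619145060052,
    "wait failed (ns_rebalancer)": 1619145120054,
    "rebalance exited (ns_orchestrator)": 1619145120056,
}


def elapsed_ms(start: str, end: str) -> int:
    """Return the time between two labelled events, in milliseconds."""
    return EVENTS[end] - EVENTS[start]


if __name__ == "__main__":
    # Print the events oldest first, with the offset from the first event.
    first = min(EVENTS.values())
    for label, tstamp in sorted(EVENTS.items(), key=lambda kv: kv[1]):
        print(f"+{(tstamp - first) / 1000:8.3f}s  {label}")
```

By these timestamps the rebalance exited 60004 ms after it started, and 95199 ms after the shutdown of "standard_bucket0" was logged. No "Deleted bucket" entry for "standard_bucket0" appears in the excerpt above, whereas one does appear for "default".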

      Attaching logs

      Attachments

        1. 172.23.106.249-20210422-1932-diag.zip
          12.34 MB
        2. 172.23.136.127-20210422-1932-diag.zip
          1.75 MB
        3. 172.23.136.129-20210422-1932-diag.zip
          1.86 MB
        4. 172.23.136.250-20210422-1932-diag.zip
          1.32 MB
        5. 172.23.136.252-20210422-1932-diag.zip
          1.99 MB
        6. 172.23.136.253-20210422-1932-diag.zip
          2.05 MB
        7. screenshot-1.png
          48 kB
        8. screenshot-2.png
          32 kB
        9. screenshot-3.png
          25 kB
        10. screenshot-4.png
          33 kB
        11. screenshot-5.png
          128 kB
        12. sys_cpu_utilization_rate.png
          30 kB

            People

              arunkumar Arunkumar Senthilnathan (Inactive)