Couchbase Server / MB-60667

Rebalance failed with "old_indexes_cleanup_failed"

Details

    • Untriaged
    • 0
    • Yes

    Description

      Rebalance failed with the following error:

       

      2024-02-01 19:59:43 | INFO | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] Latest logs from UI on 172.23.123.207:
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'critical', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1706846373819, 'shortText': 'message', 'text': 'Rebalance exited with reason {{badmatch,\n                               {old_indexes_cleanup_failed,\n                                [{\'ns_1@172.23.123.206\',{error,eexist}}]}},\n                              [{ns_rebalancer,rebalance_body,7,\n                                [{file,"src/ns_rebalancer.erl"},{line,470}]},\n                               {async,\'-async_init/4-fun-1-\',3,\n                                [{file,"src/async.erl"},{line,199}]}]}.\nRebalance Operation Id = 059c690a5efe9ce1929858e303f61b32', 'serverTime': '2024-02-01T19:59:33.819Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'critical', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1706846373789, 'shortText': 'message', 'text': "Failed to cleanup indexes: [{'ns_1@172.23.123.206',{error,eexist}}]", 'serverTime': '2024-02-01T19:59:33.789Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1706846373773, 'shortText': 'message', 'text': "Starting rebalance, KeepNodes = ['ns_1@172.23.123.157','ns_1@172.23.123.206',\n                                 'ns_1@172.23.123.207'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 059c690a5efe9ce1929858e303f61b32", 'serverTime': '2024-02-01T19:59:33.773Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'auto_failover', 'tstamp': 1706846373651, 'shortText': 'message', 'text': 'Enabled auto-failover with timeout 120 and max count 1', 'serverTime': '2024-02-01T19:59:33.651Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'mb_master', 'tstamp': 1706846373647, 'shortText': 'message', 'text': "Haven't heard from a higher priority node or a master, so I'm taking over.", 'serverTime': '2024-02-01T19:59:33.647Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'memcached_config_mgr', 'tstamp': 1706846363832, 'shortText': 'message', 'text': 'Hot-reloaded memcached.json for config change of the following keys: [<<"scramsha_fallback_salt">>]', 'serverTime': '2024-02-01T19:59:23.832Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 3, 'module': 'ns_cluster', 'tstamp': 1706846363646, 'shortText': 'message', 'text': 'Node ns_1@172.23.123.157 joined cluster', 'serverTime': '2024-02-01T19:59:23.646Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'warning', 'code': 0, 'module': 'mb_master', 'tstamp': 1706846363634, 'shortText': 'message', 'text': "Current master is strongly lower priority and I'll try to takeover", 'serverTime': '2024-02-01T19:59:23.634Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 1, 'module': 'menelaus_web_sup', 'tstamp': 1706846363609, 'shortText': 'web start ok', 'text': 'Couchbase Server has started on web port 8091 on node \'ns_1@172.23.123.157\'. Version: "7.6.0-2090-enterprise".', 'serverTime': '2024-02-01T19:59:23.609Z'}
      2024-02-01 19:59:43 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.206', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1706846360739, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.123.206' saw that node 'ns_1@172.23.123.157' came up. Tags: []", 'serverTime': '2024-02-01T19:59:20.739Z'}
      [<FrameSummary file /usr/local/lib/python3.7/threading.py, line 890 in _bootstrap>, <FrameSummary file /usr/local/lib/python3.7/threading.py, line 926 in _bootstrap_inner>, <FrameSummary file lib/tasks/taskmanager.py, line 34 in run>, <FrameSummary file lib/tasks/task.py, line 113 in step>, <FrameSummary file lib/tasks/task.py, line 910 in check>, <FrameSummary file lib/tasks/future.py, line 265 in set_exception>] 
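
      For triage, the failing node and error atom can be pulled out of the 'text' field of the UI log entries above. The following is a minimal sketch, not part of the test framework: the helper name `failed_cleanup_nodes` and the regular expression are assumptions made for illustration, and the sample string is copied from the ns_rebalancer entry in the log. Note that `eexist` is Erlang's atom for the POSIX "file already exists" error; the sketch only extracts it and says nothing about the root cause.

```python
import re

# Hypothetical helper (an assumption, not framework code): matches Erlang
# terms of the form {'node@host',{error,reason}} inside a UI log message.
NODE_ERROR = re.compile(r"\{'([^']+)',\{error,(\w+)\}\}")


def failed_cleanup_nodes(text):
    """Return a list of (node, reason) tuples found in the log text."""
    return NODE_ERROR.findall(text)


# Sample text copied from the ns_rebalancer entry in the log above.
sample = "Failed to cleanup indexes: [{'ns_1@172.23.123.206',{error,eexist}}]"
print(failed_cleanup_nodes(sample))
```

      The same pattern also matches the node list embedded in the ns_orchestrator "Rebalance exited with reason" message, since it uses the same term format.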

       

       

          People

            hemant.rajput Hemant Rajput