Description
Issue observed on 7.6.0-2090
Steps to reproduce
- Have an 8-node cluster with services kv:n1ql-kv-kv-index:n1ql-index:n1ql-fts-fts-fts
- FTS and indexing service RAM should be 10k MB
- Create 4 buckets and load 1k to 2k documents into each of them
- Create 50 GSI indexes in each bucket with defer_build=true and num_replica=1
- Build the above GSI indexes
- Create 10 FTS indexes in each bucket with num_replica=1, num_partitions=1, index_type=scorch, text_analyzer=keyword
- Create 100 GSI indexes in each bucket with defer_build=true and num_replica=1, and then drop them
- Create 12 FTS indexes in each bucket with num_replica=1, num_partitions=1, index_type=scorch, text_analyzer=keyword, and then drop them
- Now rebalance out the first FTS node in the cluster
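The index-creation steps above can be sketched as follows. This is only an illustration, not the test framework's code: the helper names, the index name prefix, and the indexed field `name` are hypothetical, and the FTS definition keys (`planParams`, `params.store.indexType`, `params.mapping.default_analyzer`, `sourceType`) are assumed to match the FTS index definition JSON for this build and should be verified against the server.

```python
def gsi_statements(bucket, count, prefix="idx", field="name"):
    """Return (CREATE INDEX statements, one BUILD INDEX statement) for a bucket.

    Index names and the indexed field are hypothetical placeholders.
    """
    names = [f"{prefix}_{i}" for i in range(count)]
    creates = [
        f"CREATE INDEX `{n}` ON `{bucket}`(`{field}`) "
        'WITH {"defer_build": true, "num_replica": 1}'
        for n in names
    ]
    name_list = ", ".join(f"`{n}`" for n in names)
    build = f"BUILD INDEX ON `{bucket}`({name_list})"
    return creates, build


def fts_index_definition(bucket, name):
    """Return an FTS index definition with the settings from the repro steps.

    Key names are an assumption about the FTS index definition format.
    """
    return {
        "type": "fulltext-index",
        "name": name,
        "sourceType": "gocbcore",
        "sourceName": bucket,
        "planParams": {"indexPartitions": 1, "numReplicas": 1},
        "params": {
            "store": {"indexType": "scorch"},
            "mapping": {"default_analyzer": "keyword"},
        },
    }


if __name__ == "__main__":
    # Bucket names taken from the rebalance log in this ticket.
    for b in range(4):
        bucket = f"standard_bucket{b}"
        creates, build = gsi_statements(bucket, 50)
        print(creates[0])
        print(build[:60], "...")
        print(fts_index_definition(bucket, f"fts_{b}_0")["planParams"])
```

The statements would be run through the query service and the definitions sent to the FTS service; how they are submitted is left out because it depends on the test harness.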
Rebalance failure -
2024-02-09 04:43:24 | INFO | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] Latest logs from UI on 172.23.123.131: |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'critical', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1707482596357, 'shortText': 'message', 'text': 'Rebalance exited with reason {service_rebalance_failed,fts,\n {agent_died,<27302.12262.0>,\n {linked_process_died,<27302.12332.0>,\n {\'ns_1@172.23.122.132\',\n {{badmatch,\n {false,\n {topology,[],\n [<<"4bf653ebc93af32ca7592c04ffa86d4d">>,\n <<"4e730b3e98d67da6187b4ee8af1229d8">>,\n <<"6520c388f6e44a26a8d59502fec7b3b3">>],\n true,[]},\n {topology,[],\n [<<"4bf653ebc93af32ca7592c04ffa86d4d">>,\n <<"4e730b3e98d67da6187b4ee8af1229d8">>,\n <<"6520c388f6e44a26a8d59502fec7b3b3">>],\n false,[]}}},\n [{service_agent,long_poll_worker_loop,5,\n [{file,"src/service_agent.erl"},\n {line,750}]},\n {proc_lib,init_p,3,\n [{file,"proc_lib.erl"},{line,225}]}]}}}}}.\nRebalance Operation Id = ddc3b2a03ae139e1923f4bbe8a23648e', 'serverTime': '2024-02-09T04:43:16.357Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_vbucket_mover', 'tstamp': 1707482561108, 'shortText': 'message', 'text': 'Bucket "standard_bucket3" rebalance does not seem to be swap rebalance', 'serverTime': '2024-02-09T04:42:41.108Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1707482560563, 'shortText': 'message', 'text': 'Started rebalancing bucket standard_bucket3', 'serverTime': '2024-02-09T04:42:40.563Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_vbucket_mover', 'tstamp': 1707482551586, 'shortText': 'message', 'text': 'Bucket "standard_bucket2" rebalance does not seem to be swap rebalance', 'serverTime': '2024-02-09T04:42:31.586Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1707482551058, 'shortText': 'message', 'text': 'Started rebalancing bucket standard_bucket2', 'serverTime': '2024-02-09T04:42:31.058Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_vbucket_mover', 'tstamp': 1707482543019, 'shortText': 'message', 'text': 'Bucket "standard_bucket1" rebalance does not seem to be swap rebalance', 'serverTime': '2024-02-09T04:42:23.019Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1707482542496, 'shortText': 'message', 'text': 'Started rebalancing bucket standard_bucket1', 'serverTime': '2024-02-09T04:42:22.496Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_vbucket_mover', 'tstamp': 1707482534662, 'shortText': 'message', 'text': 'Bucket "standard_bucket0" rebalance does not seem to be swap rebalance', 'serverTime': '2024-02-09T04:42:14.662Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_rebalancer', 'tstamp': 1707482534116, 'shortText': 'message', 'text': 'Started rebalancing bucket standard_bucket0', 'serverTime': '2024-02-09T04:42:14.116Z'} |
2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | [on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', 'type': 'info', 'code': 0, 'module': 'ns_orchestrator', 'tstamp': 1707482533785, 'shortText': 'message', 'text': "Starting rebalance, KeepNodes = ['ns_1@172.23.123.129','ns_1@172.23.123.131',\n 'ns_1@172.23.123.157','ns_1@172.23.123.160',\n 'ns_1@172.23.123.206','ns_1@172.23.123.207',\n 'ns_1@172.23.123.209'], EjectNodes = ['ns_1@172.23.122.132'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = ddc3b2a03ae139e1923f4bbe8a23648e", 'serverTime': '2024-02-09T04:42:13.785Z'} |
Node being rebalanced out - 172.23.122.132
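Most of the lines above are logged at ERROR level even though their payload type is 'info'; only the ns_orchestrator entry is 'critical'. A minimal sketch for pulling the critical entries out of such output, assuming each line carries a Python-style dict payload as shown above (the function names are hypothetical, and the sample lines are abbreviated copies of the log, not full entries):

```python
import ast


def parse_ui_log_line(line):
    """Return the dict payload of a print_UI_logs line, or None if absent."""
    start = line.find("{")
    end = line.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        entry = ast.literal_eval(line[start:end + 1])
    except (ValueError, SyntaxError):
        return None
    return entry if isinstance(entry, dict) else None


def critical_entries(lines):
    """Keep only the payloads whose 'type' field is 'critical'."""
    entries = (parse_ui_log_line(line) for line in lines)
    return [e for e in entries if e and e.get("type") == "critical"]


# Abbreviated sample lines based on the log in this ticket.
sample = [
    "2024-02-09 04:43:24 | INFO | MainProcess | Cluster_Thread | "
    "[on_prem_rest_client.print_UI_logs] Latest logs from UI on 172.23.123.131: |",
    "2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | "
    "[on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', "
    "'type': 'critical', 'code': 0, 'module': 'ns_orchestrator', "
    "'text': 'Rebalance exited with reason {service_rebalance_failed,fts, ...'} |",
    "2024-02-09 04:43:24 | ERROR | MainProcess | Cluster_Thread | "
    "[on_prem_rest_client.print_UI_logs] {'node': 'ns_1@172.23.123.157', "
    "'type': 'info', 'code': 0, 'module': 'ns_rebalancer', "
    "'text': 'Started rebalancing bucket standard_bucket3'} |",
]

if __name__ == "__main__":
    for entry in critical_entries(sample):
        print(entry["module"], "->", entry["text"])
```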
Logs -
test_1 (9).zip
Test logs -
test_log.log
Issue Links
- duplicates MB-60720 TLS: Rebalance 2 nodes during index build: service_rebalance_failed,fts agent_died, badmatch (Closed)