  1. Couchbase Server
  2. MB-60743

[Upgrade] : Rebalance exited with reason {{badmatch,failed},[{ns_rebalancer,rebalance_body,7,[{file,"src/ns_rebalancer.erl"},{line,500}]},{async,'-async_init/4-fun-1-',3,[{file,"src/async.erl"},{line,199}]}]}.


Details

    Description

      Steps to reproduce

      1. Created a 5-node cluster running Couchbase Enterprise Edition 7.1.0-2556 with the following services:
        1. 172.23.121.27 - cbas
        2. 172.23.121.208 - index, kv, n1ql
        3. 172.23.123.44 - index, kv, n1ql
        4. 172.23.107.26 - cbas 
        5. 172.23.122.107 - cbas
      2. Created a Couchstore bucket, "bucket-5", with 10000 items
      3. Created a few dataverses, datasets, links and synonyms
      4. Failed over 172.23.107.26
      5. Installed Couchbase Enterprise Edition 7.6.0-2107 on the failed-over node
      6. Added the node back and attempted a rebalance

      The rebalance fails with:

       

      2024-02-08T22:49:59.264-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.107.26) - Rebalance exited with reason {{badmatch,failed},
          [{ns_rebalancer,rebalance_body,7,
               [{file,"src/ns_rebalancer.erl"},{line,500}]},
           {async,'-async_init/4-fun-1-',3,
               [{file,"src/async.erl"},{line,199}]}]}.
      Rebalance Operation Id = f76804fa78a68eecc2693126805e3344

      ns_server.debug.log shows a bad_nodes error for the cbas service: the get_agent call timed out on ns_1@172.23.121.27.

       

       

      [ns_server:error,2024-02-08T22:49:59.253-08:00,ns_1@172.23.107.26:service_manager-cbas<0.1560.0>:service_agent:process_bad_results:990]Service call get_agent (service cbas) failed on some nodes:
      [{'ns_1@172.23.121.27',timeout}]
      [error_logger:error,2024-02-08T22:49:59.261-08:00,ns_1@172.23.107.26:service_manager-cbas<0.1560.0>:ale_error_logger_handler:do_log:101]
      =========================CRASH REPORT=========================
        crasher:
          initial call: misc:'-spawn_monitor/1-fun-0-'/0
          pid: <0.1560.0>
          registered_name: 'service_manager-cbas'
          exception error: no match of right hand side value
                           {error,
                               {bad_nodes,cbas,get_agent,
                                   [{'ns_1@172.23.121.27',timeout}]}}
            in function  service_manager:wait_for_agents/1 (src/service_manager.erl, line 165)
            in call from service_manager:run_op/1 (src/service_manager.erl, line 140)
          ancestors: [<0.1559.0>]
          message_queue_len: 0
          messages: []
          links: []
          dictionary: []
          trap_exit: false
          status: running
          heap_size: 2586
          stack_size: 28
          reductions: 5554
        neighbours:
      [ns_server:debug,2024-02-08T22:49:59.261-08:00,ns_1@172.23.107.26:<0.1559.0>:service_janitor:maybe_complete_pending_failover_body:149]Failed to complete service cbas failover: {error,
          {failover_failed,cbas,
              {{badmatch,
                   {error,
                       {bad_nodes,cbas,get_agent,
                           [{'ns_1@172.23.121.27',timeout}]}}},
               [{service_manager,wait_for_agents,1,
                    [{file,"src/service_manager.erl"},{line,165}]},
                {service_manager,run_op,1,
                    [{file,"src/service_manager.erl"},{line,140}]},
                {proc_lib,init_p,3,
                    [{file,"proc_lib.erl"},{line,225}]}]}}}
      [error_logger:error,2024-02-08T22:49:59.262-08:00,ns_1@172.23.107.26:logger_proxy<0.71.0>:ale_error_logger_handler:do_log:101]Error in process <0.1492.0> on node 'ns_1@172.23.107.26' with exit value:
      {{badmatch,failed},
       [{ns_rebalancer,rebalance_body,7,[{file,"src/ns_rebalancer.erl"},{line,500}]},
        {async,'-async_init/4-fun-1-',3,[{file,"src/async.erl"},{line,199}]}]}
      [ns_server:info,2024-02-08T22:49:59.262-08:00,ns_1@172.23.107.26:rebalance_agent<0.850.0>:rebalance_agent:handle_down:290]Rebalancer process <0.1492.0> died (reason {{badmatch,failed},
          [{ns_rebalancer,rebalance_body,7,
               [{file,"src/ns_rebalancer.erl"},{line,500}]},
           {async,'-async_init/4-fun-1-',3,
               [{file,"src/async.erl"},{line,199}]}]}).
      [error_logger:error,2024-02-08T22:49:59.262-08:00,ns_1@172.23.107.26:logger_proxy<0.71.0>:ale_error_logger_handler:do_log:101]Error in process <0.1490.0> on node 'ns_1@172.23.107.26' with exit value:
      {{badmatch,failed},
       [{ns_rebalancer,rebalance_body,7,[{file,"src/ns_rebalancer.erl"},{line,500}]},
        {async,'-async_init/4-fun-1-',3,[{file,"src/async.erl"},{line,199}]}]}
      [ns_server:debug,2024-02-08T22:49:59.262-08:00,ns_1@172.23.107.26:leader_activities<0.791.0>:leader_activities:handle_activity_down:457]Activity terminated with reason {shutdown,
          {async_died,
              {raised,
                  {error,
                      {badmatch,failed},
                      [{ns_rebalancer,rebalance_body,7,
                           [{file,"src/ns_rebalancer.erl"},{line,500}]},
                       {async,'-async_init/4-fun-1-',3,
                           [{file,"src/async.erl"},{line,199}]}]}}}}.
      Activity:
      {activity,<0.1491.0>,#Ref<0.4221202476.1589641217.115197>,default,
          <<"a449faee6282c2daf4fc1cb52bbcdf98">>,
          [rebalance],
          majority,[]}
      [error_logger:error,2024-02-08T22:49:59.263-08:00,ns_1@172.23.107.26:<0.1488.0>:ale_error_logger_handler:do_log:101]
      =========================CRASH REPORT=========================
        crasher:
          initial call: erlang:apply/2
          pid: <0.1488.0>
          registered_name: []
          exception error: no match of right hand side value failed
            in function  ns_rebalancer:rebalance_body/7 (src/ns_rebalancer.erl, line 500)
            in call from async:'-async_init/4-fun-1-'/3 (src/async.erl, line 199)
          ancestors: [<0.1400.0>,ns_orchestrator_child_sup,ns_orchestrator_sup,
                        mb_master_sup,mb_master,leader_registry_sup,
                        leader_services_sup,<0.788.0>,ns_server_sup,
                        ns_server_nodes_sup,<0.301.0>,ns_server_cluster_sup,
                        root_sup,<0.155.0>]
          message_queue_len: 0
          messages: []
          links: [<0.1400.0>]
          dictionary: []
          trap_exit: false
          status: running
          heap_size: 17731
          stack_size: 28
          reductions: 3358
        neighbours:
      [user:error,2024-02-08T22:49:59.264-08:00,ns_1@172.23.107.26:<0.1400.0>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {{badmatch,failed},
          [{ns_rebalancer,rebalance_body,7,
               [{file,"src/ns_rebalancer.erl"},{line,500}]},
           {async,'-async_init/4-fun-1-',3,
               [{file,"src/async.erl"},{line,199}]}]}.
      Rebalance Operation Id = f76804fa78a68eecc2693126805e3344
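      For triage, the nodes named in bad_nodes entries like the ones above can be pulled out of a log dump with a short script. This is only a rough sketch, not part of ns_server or any Couchbase tooling: the function name find_bad_nodes and both regular expressions are assumptions made for illustration, and they only handle the shape seen in this ticket, {bad_nodes,Service,Call,[{Node,Reason}]} with an atom as the reason.

```python
import re

# Hypothetical helper for this ticket only; not an ns_server or Couchbase API.
# Matches {bad_nodes,Service,Call,[...]} and tolerates the whitespace and
# line breaks that the Erlang pretty-printer inserts between tokens.
BAD_NODES = re.compile(
    r"\{bad_nodes,\s*(\w+),\s*(\w+),\s*\[(.*?)\]\}", re.DOTALL
)
# Matches one {'node@host',reason} pair where the reason is a plain atom.
NODE_REASON = re.compile(r"\{'([^']+)',\s*(\w+)\}")


def find_bad_nodes(log_text):
    """Return (service, call, node, reason) tuples found in log_text."""
    results = []
    for service, call, body in BAD_NODES.findall(log_text):
        for node, reason in NODE_REASON.findall(body):
            results.append((service, call, node, reason))
    return results


if __name__ == "__main__":
    # Excerpt copied from the crash report in this ticket.
    sample = """exception error: no match of right hand side value
        {error,
            {bad_nodes,cbas,get_agent,
                [{'ns_1@172.23.121.27',timeout}]}}"""
    for entry in find_bad_nodes(sample):
        print(entry)
```

      Run against the excerpt, this reports the cbas get_agent call timing out on ns_1@172.23.121.27, which is the node the service manager on 172.23.107.26 could not reach. Reasons that are tuples rather than atoms would be skipped by this pattern.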

      Retries of the same rebalance also fail.

       

       


       

       

      TAF Script to reproduce

       

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-analytics-vset00-00-analytics_upgrade_from_7.1.0_with_collections/testexec.20372.ini -p GROUP=7_1_0,kv_quota_percent=70,bucket_storage=couchstore,key=test_collections,get-cbcollect-info=True,upgrade_version=7.6.0-2107,aws_access_key=AKIAXQQ2DIGA2VADROME,aws_secret_key=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,sirius_url=http://172.23.120.103:4000 -t upgrade.cbas_upgrade.UpgradeTests.test_upgrade_with_failover,upgrade_chain=7.1.0,upgrade_type=failover_delta_recovery,update_nodes=kv;cbas,nodes_init=5,services_init=kv:index:n1ql-kv:index:n1ql-cbas-cbas-cbas,pre_update_no_of_dv=2,pre_update_ds_per_dv=4,pre_update_no_of_synonym=5,pre_update_no_of_index=3,replica_num=3,override_spec_params=num_buckets;num_scopes;num_collections;replicas;num_items,num_items=10000,num_buckets=3,num_scopes=5,num_collections=5,no_of_dv=10,ds_per_dv=3,no_of_synonym=10,no_of_index=5,GROUP=7_1_0'

       

      Job name : debian-analytics_upgrade_from_7.1.0_with_collections

       

      Job ref : http://qa.sc.couchbase.com/job/test_suite_executor-TAF/309910/console


            People

              michael.blow Michael Blow
              raghav.sk Raghav S K
