  1. Couchbase Server
  2. MB-60305

[Rebalance] : Rebalance fails with reason {mover_crashed,{unexpected_exit,{'EXIT',<0.26402.0>,{{bulk_set_vbucket_state_failed

Details

    Description

      Environment

      Couchbase build: Couchbase Enterprise Edition build 7.6.0-1980
      Operating System: Amazon Linux release 2023 (Amazon Linux)

      Steps to repro

      1. Created a 4-node cluster with the following services:
        1. ec2-54-152-208-73.compute-1.amazonaws.com - kv, n1ql
        2. ec2-3-91-12-186.compute-1.amazonaws.com - kv
        3. ec2-35-153-72-48.compute-1.amazonaws.com - cbas
        4. ec2-3-91-24-181.compute-1.amazonaws.com - cbas
      2. Created a magma bucket named 'default' and loaded a few items into it
      3. Created dataverse, links, datasets, synonyms and indexes
      4. Removed nodes ec2-3-91-24-181.compute-1.amazonaws.com and ec2-54-152-208-73.compute-1.amazonaws.com
      5. Added a kv node ec2-54-92-147-57.compute-1.amazonaws.com
      6. Added a cbas node ec2-107-22-91-165.compute-1.amazonaws.com
      7. Started a rebalance
      8. Rebalance fails
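
For reference, the topology change in steps 4 to 7 can be written out as the form bodies that ns_server's REST endpoints `/controller/addNode` and `/controller/rebalance` accept. This is only a sketch: the ticket drives the change through the TAF test and does not say these calls were issued by hand. The hostnames, service names and the `ns_1@` node prefix come from this ticket; the helper names and the placeholder credentials are assumptions made for illustration, and nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode

# Hostnames are taken from the repro steps in this ticket.
INITIAL = [
    "ec2-54-152-208-73.compute-1.amazonaws.com",  # kv, n1ql
    "ec2-3-91-12-186.compute-1.amazonaws.com",    # kv
    "ec2-35-153-72-48.compute-1.amazonaws.com",   # cbas
    "ec2-3-91-24-181.compute-1.amazonaws.com",    # cbas
]
REMOVED = [
    "ec2-3-91-24-181.compute-1.amazonaws.com",
    "ec2-54-152-208-73.compute-1.amazonaws.com",
]
ADDED = {
    "ec2-54-92-147-57.compute-1.amazonaws.com": "kv",
    "ec2-107-22-91-165.compute-1.amazonaws.com": "cbas",
}


def otp(host):
    """Node name in the form that appears in the rebalance log."""
    return f"ns_1@{host}"


def add_node_form(host, services, user, password):
    """Form body for POST /controller/addNode (hypothetical helper)."""
    return urlencode(
        {"hostname": host, "user": user, "password": password, "services": services}
    )


def rebalance_form(initial, added, removed):
    """Form body for POST /controller/rebalance (hypothetical helper).

    knownNodes lists every node currently in the cluster, including the
    ones being ejected; ejectedNodes lists only the nodes to remove.
    """
    known = [otp(h) for h in list(initial) + list(added)]
    ejected = [otp(h) for h in removed]
    return urlencode(
        {"knownNodes": ",".join(known), "ejectedNodes": ",".join(ejected)}
    )


if __name__ == "__main__":
    for host, services in ADDED.items():
        # "<user>" and "<password>" are placeholders, not real credentials.
        print(add_node_form(host, services, "<user>", "<password>"))
    print(parse_qs(rebalance_form(INITIAL, ADDED, REMOVED)))
```

Writing the request bodies out this way makes it easy to see that two nodes leave and two nodes join in a single rebalance, which is the swap the test name refers to.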

      Rebalance fails with the following error:

      2024-01-09T02:35:56.728Z, ns_orchestrator:0:critical:message(ns_1@ec2-35-153-72-48.compute-1.amazonaws.com) - Rebalance exited with reason
      {mover_crashed,
       {unexpected_exit,
        {'EXIT',<0.26402.0>,
         {{bulk_set_vbucket_state_failed,
           [{'ns_1@ec2-54-92-147-57.compute-1.amazonaws.com',
             {'EXIT',
              {{{{{badmatch,
                   [{<36803.18735.0>,
                     {done,exit,
                      {{{badmatch,{error,etimedout}},
                        [{dcp_proxy,connect_inner,3,
                          [{file,"src/dcp_proxy.erl"},{line,299}]},
                         {dcp_proxy,connect,5,
                          [{file,"src/dcp_proxy.erl"},{line,252}]},
                         {dcp_proxy,maybe_connect,2,
                          [{file,"src/dcp_proxy.erl"},{line,235}]},
                         {dcp_producer_conn,handle_call,4,
                          [{file,"src/dcp_producer_conn.erl"},{line,50}]},
                         {dcp_proxy,handle_call,3,
                          [{file,"src/dcp_proxy.erl"},{line,154}]},
                         {gen_server,try_handle_call,4,
                          [{file,"gen_server.erl"},{line,1149}]},
                         {gen_server,handle_msg,6,
                          [{file,"gen_server.erl"},{line,1178}]},
                         {proc_lib,init_p_do_apply,3,
                          [{file,"proc_lib.erl"},{line,240}]}]},
                       {gen_server,call,
                        [<36803.18734.0>,
                         {connect,
                          [collections,del_times,del_user_xattr,json,
                           set_consumer_name,snappy,ssl,xattr]},
                         infinity]}},
                      [{gen_server,call,3,
                        [{file,"gen_server.erl"},{line,385}]},
                       {dcp_replicator,connect_to_producer,3,
                        [{file,"src/dcp_replicator.erl"},{line,76}]},
                       {dcp_replicator,'-spawn_and_wait/1-fun-0-',1,
                        [{file,"src/dcp_replicator.erl"},{line,323}]}]}}]},
                  [{misc,sync_shutdown_many_i_am_trapping_exits,1,
                    [{file,"src/misc.erl"},{line,1517}]},
                   {dcp_replicator,spawn_and_wait,1,
                    [{file,"src/dcp_replicator.erl"},{line,344}]},
                   {dcp_replicator,handle_info,2,
                    [{file,"src/dcp_replicator.erl"},{line,137}]},
                   {gen_server,try_dispatch,4,
                    [{file,"gen_server.erl"},{line,1123}]},
                   {gen_server,handle_msg,6,
                    [{file,"gen_server.erl"},{line,1200}]},
                   {proc_lib,init_p_do_apply,3,
                    [{file,"proc_lib.erl"},{line,240}]}]},
                 {gen_server,call,
                  [<36803.18689.0>,{setup_replication,[1023]},infinity]}},
                {gen_server,call,
                 ['replication_manager-default',
                  {change_vbucket_replication,1023,
                   'ns_1@ec2-54-152-208-73.compute-1.amazonaws.com'},
                  infinity]}},
               {gen_server,call,
                [{'janitor_agent-default',
                  'ns_1@ec2-54-92-147-57.compute-1.amazonaws.com'},
                 {if_rebalance,<0.26357.0>,
                  {update_vbucket_state,1021,replica,passive,
                   'ns_1@ec2-54-152-208-73.compute-1.amazonaws.com'}},
                 infinity]}}}}]},
          [{janitor_agent,bulk_set_vbucket_state,4,
            [{file,"src/janitor_agent.erl"},{line,404}]},
           {proc_lib,init_p,3,
            [{file,"proc_lib.erl"},{line,225}]}]}}}}.
      Rebalance Operation Id = b60afb1eb2511460ddf73b2658e443d6
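
The exit reason is deeply nested, so when triaging it can help to pull out just the `{error, Reason}` atoms and the stack frames. The following is a rough sketch that does this with regular expressions rather than a real Erlang term parser; the function name `summarize` and the patterns are illustrative assumptions, and the sample string is a short excerpt of the term logged in this ticket.

```python
import re

# One Erlang stack frame: {Module,Function,Arity,[{file,"..."},{line,N}]}
FRAME_RE = re.compile(
    r"\{\s*(?P<mod>[a-z_]\w*)\s*,\s*"
    r"(?P<fun>'[^']+'|[a-z_]\w*)\s*,\s*"
    r"(?P<arity>\d+)\s*,\s*"
    r"\[\s*\{file,\s*\"(?P<file>[^\"]+)\"\}\s*,\s*"
    r"\{line,\s*(?P<line>\d+)\}\s*\]\s*\}"
)
# Any {error,Reason} tuple whose reason is a plain atom.
ERROR_RE = re.compile(r"\{error,\s*(\w+)\}")


def summarize(term):
    """Return (error atoms, stack frames) found in a logged exit reason."""
    frames = [
        (m["mod"], m["fun"].strip("'"), int(m["arity"]), m["file"], int(m["line"]))
        for m in FRAME_RE.finditer(term)
    ]
    return ERROR_RE.findall(term), frames


# Excerpt of the innermost part of the term from this ticket.
SAMPLE = """{{badmatch,{error,etimedout}},
 [{dcp_proxy,connect_inner,3,
   [{file,"src/dcp_proxy.erl"},{line,299}]},
  {dcp_proxy,connect,5,
   [{file,"src/dcp_proxy.erl"},{line,252}]}]}"""

if __name__ == "__main__":
    errors, frames = summarize(SAMPLE)
    print("errors:", errors)
    for mod, fun, arity, path, line in frames:
        print(f"{mod}:{fun}/{arity} at {path}:{line}")
```

Applied to the full term, the first frames point at `dcp_proxy:connect_inner/3`, where a connection attempt returned `{error,etimedout}` while replication for vBucket 1023 was being set up.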

      TAF Script to reproduce

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/al2023-p0-os_certify-vset00-00-analytics/testexec.6529.ini replicas=0,GROUP=P0;durability,kv_quota_percent=70,get-cbcollect-info=True,get-cbcollect-info=True,get-cbcollect-info=True,hostname=true,upgrade_version=7.6.0-1980,sirius_url=http://172.23.120.103:4000 -t cbas.cbas_collection_rebalance_failover.CBASRebalance.test_cbas_with_kv_cbas_swap_rebalance,no_of_dv=2,data_load_stage=during,upgrade_version=7.6.0-1980,kv_quota_percent=70,ds_per_dv=2,GROUP=P0;durability,hostname=true,doc_spec_name=initial_load,services_init=kv:n1ql-kv-cbas-cbas,bucket_spec=analytics.single_bucket,no_of_synonyms=2,no_of_threads=20,no_of_indexes=1,get-cbcollect-info=True,replicas=0,nodes_init=4,run_kv_queries=True,sirius_url=http://172.23.120.103:4000,num_queries=3,run_cbas_queries=True,cluster_kv_infra=bkt_spec,cbas_spec=local_datasets'
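
The parameter lists in the command above repeat several keys. A small sketch for listing them is shown below; the helper names are hypothetical, the sample string is the first comma-separated parameter list copied from the command, and how TAF itself resolves repeated keys is not stated in this ticket.

```python
from collections import Counter


def parse_params(spec):
    """Split a comma-separated key=value list into (key, value) pairs."""
    pairs = []
    for item in spec.split(","):
        if "=" in item:
            key, value = item.split("=", 1)  # keep any later '=' in the value
            pairs.append((key, value))
    return pairs


def duplicates(pairs):
    """Return {key: count} for keys that occur more than once."""
    counts = Counter(key for key, _ in pairs)
    return {key: n for key, n in counts.items() if n > 1}


# First parameter list from the gradlew command in this ticket.
GLOBAL_PARAMS = (
    "replicas=0,GROUP=P0;durability,kv_quota_percent=70,"
    "get-cbcollect-info=True,get-cbcollect-info=True,get-cbcollect-info=True,"
    "hostname=true,upgrade_version=7.6.0-1980,"
    "sirius_url=http://172.23.120.103:4000"
)

if __name__ == "__main__":
    print(duplicates(parse_params(GLOBAL_PARAMS)))
```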

       

      Job name : al2023-os_certify_analytics

      Job ref : http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.6.0-1980/jenkins_logs/test_suite_executor-TAF/301001/

       

      Attachments

        1. image-2024-01-24-13-19-23-106.png (59 kB)
        2. screenshot-1.png (140 kB)
        3. screenshot-2.png (31 kB)
        4. screenshot-3.png (462 kB)
        5. screenshot-4.png (1.00 MB)

            People

              raghav.sk Raghav S K
