  1. Couchbase Server
  2. MB-61354

[Rebalance] : Rebalance exited with reason {mover_crashed,{unexpected_exit,{'EXIT',<0.28072.2>,{{{{{child_interrupted,{'EXIT',<26454.25996.0>,socket_closed}},[{dcp_replicator,spawn_and_wait,1,[{file,"src/dcp_replicator.erl"

Details

    Description

      Steps to reproduce

      1. Created a 4-node KV cluster.
      2. Created a Couchstore bucket named default.
      3. Created a few collections and loaded about 5,800,000 documents into the bucket.
      4. Started a workload to update all documents.
      5. Attempted a swap rebalance, adding 2 nodes and removing 2 nodes.

      The rebalance fails with the following errors:

      2024-03-28T16:53:07.954-07:00, ns_orchestrator:0:critical:message(ns_1@172.23.107.168) - Rebalance exited with reason {mover_crashed,
       {unexpected_exit,
        {'EXIT',<0.28072.2>,
         {{{{{child_interrupted,
              {'EXIT',<26454.25996.0>,socket_closed}},
             [{dcp_replicator,spawn_and_wait,1,
               [{file,"src/dcp_replicator.erl"},{line,358}]},
              {dcp_replicator,handle_call,3,
               [{file,"src/dcp_replicator.erl"},{line,152}]},
              {gen_server,try_handle_call,4,
               [{file,"gen_server.erl"},{line,1149}]},
              {gen_server,handle_msg,6,
               [{file,"gen_server.erl"},{line,1178}]},
              {proc_lib,init_p_do_apply,3,
               [{file,"proc_lib.erl"},{line,240}]}]},
            {gen_server,call,
             [<26454.25994.0>,{takeover,689},infinity]}},
           {gen_server,call,
            ['replication_manager-default',
             {change_vbucket_replication,690,undefined},
             infinity]}},
          {gen_server,call,
           [{'janitor_agent-default','ns_1@172.23.217.168'},
            {if_rebalance,<0.16117.2>,
             {update_vbucket_state,690,active,undefined,undefined,undefined}},
            infinity]}}}}}.
      Rebalance Operation Id = 568dec1c3001f1309ee6dd04035f9b72
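When triaging entries like these, it can help to split each one into its header fields so that events from different nodes can be ordered and filtered. The sketch below assumes the header layout seen in this report (`timestamp, module:code:severity:kind(node) - text`); the field names and the `parse_entry` helper are illustrative and not part of any Couchbase tool. The sample is the ns_memcached entry quoted in this report.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the entries quoted in this report:
#   <timestamp>, <module>:<code>:<severity>:<kind>(<node>) - <text>
ENTRY_RE = re.compile(
    r"^(?P<timestamp>[^,\s]+), "
    r"(?P<module>\w+):(?P<code>\d+):(?P<severity>\w+):(?P<kind>\w+)"
    r"\((?P<node>[^)]+)\) - (?P<text>.*)$",
    re.DOTALL,  # the message text may span several lines
)


def parse_entry(line):
    """Split one log entry into a dict of header fields plus message text."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized log entry")
    fields = match.groupdict()
    # ISO 8601 timestamp with millisecond precision and a UTC offset.
    fields["timestamp"] = datetime.fromisoformat(fields["timestamp"])
    fields["code"] = int(fields["code"])
    return fields


SAMPLE = (
    "2024-03-28T16:53:07.575-07:00, "
    "ns_memcached:0:info:message(ns_1@172.23.217.164) - "
    "Control connection to memcached on 'ns_1@172.23.217.164' disconnected. "
    "Check logs for details."
)

entry = parse_entry(SAMPLE)
print(entry["timestamp"].isoformat(), entry["module"], entry["severity"], entry["node"])
```

Sorting the parsed entries by `timestamp` gives the order in which the nodes reported each event.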

      2024-03-28T16:53:07.575-07:00, ns_memcached:0:info:message(ns_1@172.23.217.164) - Control connection to memcached on 'ns_1@172.23.217.164' disconnected. Check logs for details.

      2024-03-28T16:53:07.884-07:00, ns_vbucket_mover:0:critical:message(ns_1@172.23.107.168) - Worker <0.26762.2> (for action {move,{690, ['ns_1@172.23.217.164', 'ns_1@172.23.217.166', 'ns_1@172.23.107.168'], ['ns_1@172.23.217.168', 'ns_1@172.23.217.166', 'ns_1@172.23.107.168'], []}}) exited with reason {unexpected_exit, {'EXIT', <0.28072.2>, {{{{{child_interrupted, {'EXIT', <26454.25996.0>, socket_closed}}, [{dcp_replicator, spawn_and_wait, 1, [{file, "src/dcp_replicator.erl"}, {line, 358}]}, {dcp_replicator, handle_call, 3, [{file, "src/dcp_replicator.erl"}, {line, 152}]}, {gen_server, try_handle_call, 4, [{file, "gen_server.erl"}, {line, 1149}]}, {gen_server, handle_msg, 6, [{file, "gen_server.erl"}, {line, 1178}]}, {proc_lib, init_p_do_apply, 3, [{file, "proc_lib.erl"}, {line, 240}]}]}, {gen_server, call, [<26454.25994.0>, {takeover, 689}, infinity]}}, {gen_server, call, ['replication_manager-default', {change_vbucket_replication, 690, undefined}, infinity]}}, {gen_server, call, [{'janitor_agent-default', 'ns_1@172.23.217.168'}, {if_rebalance, <0.16117.2>, {update_vbucket_state, 690, active, undefined, undefined, undefined}}, infinity]}}}}
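The worker's `move` action names the vBucket and two node lists. A minimal sketch for pulling those out and comparing them follows. It assumes the tuple is laid out as `{move,{VBucket, OldChain, NewChain, ...}}` and that the first node in each chain holds the active copy; both are assumptions about the log convention rather than something this report states, and `parse_move` and `chain_diff` are hypothetical helper names.

```python
import re

# Action text copied from the ns_vbucket_mover entry in this report.
ACTION = (
    "{move,{690, ['ns_1@172.23.217.164', 'ns_1@172.23.217.166', "
    "'ns_1@172.23.107.168'], ['ns_1@172.23.217.168', 'ns_1@172.23.217.166', "
    "'ns_1@172.23.107.168'], []}}"
)

MOVE_RE = re.compile(r"\{move,\{(\d+),\s*\[(.*?)\],\s*\[(.*?)\],")
NODE_RE = re.compile(r"'([^']+)'")


def parse_move(action):
    """Return (vbucket, old_chain, new_chain) from a move action string."""
    match = MOVE_RE.match(action)
    if match is None:
        raise ValueError("not a move action")
    vbucket = int(match.group(1))
    old_chain = NODE_RE.findall(match.group(2))
    new_chain = NODE_RE.findall(match.group(3))
    return vbucket, old_chain, new_chain


def chain_diff(old_chain, new_chain):
    """Return (nodes leaving the chain, nodes joining it), in chain order."""
    leaving = [node for node in old_chain if node not in new_chain]
    joining = [node for node in new_chain if node not in old_chain]
    return leaving, joining


vbucket, old_chain, new_chain = parse_move(ACTION)
leaving, joining = chain_diff(old_chain, new_chain)
print(f"vBucket {vbucket}: {leaving} -> {joining}")
```

Under those assumptions, the node leaving the chain for vBucket 690 is ns_1@172.23.217.164, the same node that reported the memcached control connection disconnect, and the node joining is ns_1@172.23.217.168, the target of the `update_vbucket_state` call in the trace.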
      
      TAF test to reproduce

       

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-collections-vset00-00-rebalance_with_collection_crud_durability_MAJORITY_dgm_7.0_P1/testexec.34502.ini -p GROUP=rebalance_with_collection_crud_durability_MAJORITY_dgm,rerun=False,upgrade_version=7.2.5-7571,sirius_url=http://172.23.120.103:4000 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_swap_rebalance,data_load_stage=during,upgrade_version=7.2.5-7571,dgm=60,rerun=False,GROUP=rebalance_with_collection_crud_durability_MAJORITY_dgm,nodes_swap=2,bucket_spec=dgm.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,get-cbcollect-info=True,replicas=2,durability=MAJORITY,log_level=info,skip_validations=False,nodes_init=4,sirius_url=http://172.23.120.103:4000,override_spec_params=durability;replicas'

       

      Job name : debian-collections-rebalance_with_collection_crud_durability_MAJORITY_dgm_7.0_P1

      Job ref : http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.2.5-7571/jenkins_logs/test_suite_executor-TAF/323453/

       

          People

            raghav.sk Raghav S K