  1. Couchbase Server
  2. MB-41336

[Collections] - Multi node swap rebalance + collections CRUD + Durability fails


Details

    Description

      Script to repro

      ./testrunner -i /tmp/testexec.30136.ini GROUP=rebalance_with_collection_crud_durability_MAJORITY,rerun=False,upgrade_version=7.0.0-3016 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_swap_rebalance,data_load_stage=during,quota_percent=80,upgrade_version=7.0.0-3016,rerun=False,GROUP=rebalance_with_collection_crud_durability_MAJORITY,nodes_swap=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,get-cbcollect-info=True,replicas=2,durability=MAJORITY,log_level=error,nodes_init=4,override_spec_params=durability;replicas,infra_log_level=critical
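      The `-t` argument above packs the test path and its parameters into one comma-separated string, which makes the topology hard to read at a glance. The following is a minimal sketch, not testrunner's own parsing logic: `parse_test_spec` is a hypothetical helper, and the `spec` string is a shortened subset of the parameters from the command above.

      ```python
      def parse_test_spec(spec):
          """Split 'module.Class.test,key=value,...' into (test_path, params).

          Hypothetical helper for reading the command line above; it is not
          part of testrunner.
          """
          test_path, _, raw_params = spec.partition(",")
          params = {}
          for item in raw_params.split(","):
              if not item:
                  continue
              key, _, value = item.partition("=")
              params[key] = value
          return test_path, params


      # Shortened subset of the -t argument from the repro command.
      spec = (
          "bucket_collections.collections_rebalance.CollectionsRebalance."
          "test_data_load_collections_with_swap_rebalance,"
          "data_load_stage=during,quota_percent=80,nodes_swap=2,replicas=2,"
          "durability=MAJORITY,nodes_init=4,"
          "override_spec_params=durability;replicas"
      )

      test_path, params = parse_test_spec(spec)
      print(test_path.rsplit(".", 1)[-1])
      for key in ("nodes_init", "nodes_swap", "replicas", "durability"):
          print(key, "=", params[key])
      ```

      Read this way, the run starts with 4 nodes, swaps 2 of them, and uses 2 replicas with MAJORITY durability while data load runs during the rebalance.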
      

      It's basically a multi-node swap rebalance with collections CRUD and durability level MAJORITY, which fails as shown below.

      Seen on 172.23.105.234

      2020-09-07 04:52:24,286 | test  | ERROR   | pool-1-thread-20 | [rest_client:print_UI_logs:2537] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.105.234', u'tstamp': 1599479543353L, u'shortText': u'message', u'serverTime': u'2020-09-07T04:52:23.353Z', u'text': u'Rebalance exited with reason {mover_crashed,\n                              {unexpected_exit,\n                               {\'EXIT\',<0.6425.8>,\n                                {{{{{child_interrupted,\n                                     {\'EXIT\',<25829.4023.0>,socket_closed}},\n                                    [{dcp_replicator,spawn_and_wait,1,\n                                      [{file,"src/dcp_replicator.erl"},\n                                       {line,265}]},\n                                     {dcp_replicator,handle_call,3,\n                                      [{file,"src/dcp_replicator.erl"},\n                                       {line,121}]},\n                                     {gen_server,try_handle_call,4,\n                                      [{file,"gen_server.erl"},{line,661}]},\n                                     {gen_server,handle_msg,6,\n                                      [{file,"gen_server.erl"},{line,690}]},\n                                     {proc_lib,init_p_do_apply,3,\n                                      [{file,"proc_lib.erl"},{line,249}]}]},\n                                   {gen_server,call,\n                                    [<25829.4021.0>,\n                                     {setup_replication,\n                                      [383,384,385,386,387,388,389,390,391,\n                                       392,393,394,395,396,397,398,399,400,\n                                       401,402,403,404,405,406,407,408,409,\n                                       410,411,412,413,414,415,416,417,418,\n                                       419,420,421,422,423,424,425,426,427,\n                         
              428,473,474,475,476,477,478,479]},\n                                     infinity]}},\n                                  {gen_server,call,\n                                   [\'replication_manager-default\',\n                                    {change_vbucket_replication,480,undefined},\n                                    infinity]}},\n                                 {gen_server,call,\n                                  [{\'janitor_agent-default\',\n                                    \'ns_1@172.23.105.34\'},\n                                   {if_rebalance,<0.4004.6>,\n                                    {update_vbucket_state,720,active,paused,\n                                     undefined,\n                                     [[\'ns_1@172.23.105.34\',\n                                       \'ns_1@172.23.106.47\',\n                                       \'ns_1@172.23.105.234\']]}},\n                                   infinity]}}}}}.\nRebalance Operation Id = e62254fd737e2d08641aee48c5bb8bfb'}
      2020-09-07 04:52:24,289 | test  | ERROR   | pool-1-thread-20 | [rest_client:print_UI_logs:2537] {u'code': 0, u'module': u'ns_vbucket_mover', u'type': u'critical', u'node': u'ns_1@172.23.105.234', u'tstamp': 1599479543160L, u'shortText': u'message', u'serverTime': u'2020-09-07T04:52:23.160Z', u'text': u'Worker <0.5985.8> (for action {move,{720,\n                                     [\'ns_1@172.23.105.34\',\n                                      \'ns_1@172.23.106.47\',\n                                      \'ns_1@172.23.105.234\'],\n                                     [\'ns_1@172.23.106.48\',\n                                      \'ns_1@172.23.97.219\',\n                                      \'ns_1@172.23.105.234\'],\n                                     []}}) exited with reason {unexpected_exit,\n                                                               {\'EXIT\',\n                                                                <0.6425.8>,\n                                                                {{{{{child_interrupted,\n                                                                     {\'EXIT\',\n                                                                      <25829.4023.0>,\n                                                                      socket_closed}},\n                                                                    [{dcp_replicator,\n                                                                      spawn_and_wait,\n                                                                      1,\n                                                                      [{file,\n                                                                        "src/dcp_replicator.erl"},\n                                                                       {line,\n                                                                        265}]},\n                                                                     {dcp_replicator,\n             
                                                         handle_call,\n                                                                      3,\n                                                                      [{file,\n                                                                        "src/dcp_replicator.erl"},\n                                                                       {line,\n                                                                        121}]},\n                                                                     {gen_server,\n                                                                      try_handle_call,\n                                                                      4,\n                                                                      [{file,\n                                                                        "gen_server.erl"},\n                                                                       {line,\n                                                                        661}]},\n                                                                     {gen_server,\n                                                                      handle_msg,\n                                                                      6,\n                                                                      [{file,\n                                                                        "gen_server.erl"},\n                                                                       {line,\n                                                                        690}]},\n                                                                     {proc_lib,\n                                                                      init_p_do_apply,\n                                                                      3,\n                                                                      [{file,\n                                               
                         "proc_lib.erl"},\n                                                                       {line,\n                                                                        249}]}]},\n                                                                   {gen_server,\n                                                                    call,\n                                                                    [<25829.4021.0>,\n                                                                     {setup_replication,\n                                                                      [383,\n                                                                       384,\n                                                                       385,\n                                                                       386,\n                                                                       387,\n                                                                       388,\n                                                                       389,\n                                                                       390,\n                                                                       391,\n                                                                       392,\n                                                                       393,\n                                                                       394,\n                                                                       395,\n                                                                       396,\n                                                                       397,\n                                                                       398,\n                                                                       399,\n                                                                       400,\n                                                                       
401,\n                                                                       402,\n                                                                       403,\n                                                                       404,\n                                                                       405,\n                                                                       406,\n                                                                       407,\n                                                                       408,\n                                                                       409,\n                                                                       410,\n                                                                       411,\n                                                                       412,\n                                                                       413,\n                                                                       414,\n                                                                       415,\n                                                                       416,\n                                                                       417,\n                                                                       418,\n                                                                       419,\n                                                                       420,\n                                                                       421,\n                                                                       422,\n                                                                       423,\n                                                                       424,\n                                                                       425,\n                                                                       426,\n                                                                     
  427,\n                                                                       428,\n                                                                       473,\n                                                                       474,\n                                                                       475,\n                                                                       476,\n                                                                       477,\n                                                                       478,\n                                                                       479]},\n                                                                     infinity]}},\n                                                                  {gen_server,\n                                                                   call,\n                                                                   [\'replication_manager-default\',\n                                                                    {change_vbucket_replication,\n                                                                     480,\n                                                                     undefined},\n                                                                    infinity]}},\n                                                                 {gen_server,\n                                                                  call,\n                                                                  [{\'janitor_agent-default\',\n                                                                    \'ns_1@172.23.105.34\'},\n                                                                   {if_rebalance,\n                                                                    <0.4004.6>,\n                                                                    {update_vbucket_state,\n                                                                     720,\n                         
                                            active,\n                                                                     paused,\n                                                                     undefined,\n                                                                     [[\'ns_1@172.23.105.34\',\n                                                                       \'ns_1@172.23.106.47\',\n                                                                       \'ns_1@172.23.105.234\']]}},\n                                                                   infinity]}}}}'}
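      The two log entries above are deeply nested Erlang terms, and the useful facts (top-level reason, innermost exit cause, the vBucket being moved, and the nodes in its chain) are easy to lose in the indentation. The following is a rough sketch for pulling those out with regular expressions. `summarize_rebalance_failure` is a hypothetical helper, the patterns are assumptions based only on the text pasted above, and `sample` is an abbreviated excerpt of the first log entry rather than the full message.

      ```python
      import re


      def summarize_rebalance_failure(text):
          """Extract a few key fields from a rebalance failure message.

          Hypothetical helper; the patterns only target the shapes seen in
          the log text above and are not a general Erlang term parser.
          """
          reason = re.search(r"exited with reason \{(\w+)", text)
          # Innermost cause: an 'EXIT' tuple whose last element is a bare atom.
          cause = re.search(r"'EXIT',\s*<[\d.]+>,\s*(\w+)\}", text)
          state = re.search(r"update_vbucket_state,\s*(\d+),\s*(\w+),\s*(\w+)", text)
          nodes = re.findall(r"'(ns_1@[\d.]+)'", text)
          return {
              "reason": reason.group(1) if reason else None,
              "cause": cause.group(1) if cause else None,
              "vbucket": int(state.group(1)) if state else None,
              "nodes": sorted(set(nodes)),
          }


      # Abbreviated excerpt of the ns_orchestrator message above.
      sample = (
          "Rebalance exited with reason {mover_crashed,\n"
          "  {unexpected_exit,\n"
          "   {'EXIT',<0.6425.8>,\n"
          "    {{{{{child_interrupted,\n"
          "         {'EXIT',<25829.4023.0>,socket_closed}},\n"
          "     {if_rebalance,<0.4004.6>,\n"
          "      {update_vbucket_state,720,active,paused,\n"
          "       undefined,\n"
          "       [['ns_1@172.23.105.34',\n"
          "         'ns_1@172.23.106.47',\n"
          "         'ns_1@172.23.105.234']]}}"
      )

      summary = summarize_rebalance_failure(sample)
      print(summary)
      ```

      On this excerpt the summary points at `mover_crashed` with an innermost `socket_closed` while vBucket 720 was being handled, which matches what both log entries report.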
      

      I have not attached detailed steps because the weekly run used a lower log verbosity, so it only has the failures pasted above, and subsequent repro attempts did not yield this rebalance failure. However, since I am adding a supportal link, it should hopefully give enough information for debugging.

      This test did pass on 7.0.0-2908. However, it does look like it's not consistently reproducible.

            People

              richard.demellow Richard deMellow
              Balakumaran.Gopal Balakumaran Gopal