Couchbase Server / MB-35436

Retry of failed swap rebalance fails with wait_seqno_persisted_failed

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • 6.5.0
    • couchbase-bucket
    • 6.5.0-3939
    • Triaged
    • Centos 64-bit
    • Unknown

    Description

      Script to Repro

      ./testrunner -i /tmp/testexec.3084.ini -p get-cbcollect-info=True -t swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance,replica=1,num-buckets=4,num-swap=2,swap-orchestrator=True,percentage_progress=30,GROUP=P0
      

      This bug is split from MB-35427 as suggested by Dave Rigby.

      Summary of the test

      1. Start a rebalance
      2. While the rebalance is in progress, kill memcached. The rebalance fails as expected.
      3. Restart the failed rebalance.
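
The three steps above can be sketched as a small Python driver. This is only an illustration of the flow, not the testrunner implementation: the `cluster` object and its methods `start_rebalance()`, `progress()` and `kill_memcached()` are hypothetical, `FakeCluster` is an in-memory stand-in so the sketch runs without a real cluster, and treating `percentage_progress=30` as the point at which memcached is killed is an assumption.

```python
import time


def wait_until_done(cluster, poll_interval):
    """Poll until the rebalance is no longer running; return the final status."""
    while True:
        status, _percent = cluster.progress()
        if status != "running":
            return status
        time.sleep(poll_interval)


def run_failed_swap_rebalance(cluster, kill_at_percent=30, poll_interval=0.0):
    """Drive the scenario against a hypothetical cluster object.

    Returns the final status of the retried rebalance.
    """
    # Step 1: start the rebalance.
    cluster.start_rebalance()

    # Step 2: wait for the kill point, kill memcached, expect a failure.
    while True:
        status, percent = cluster.progress()
        if status != "running":
            raise RuntimeError("rebalance ended before the kill point")
        if percent >= kill_at_percent:
            break
        time.sleep(poll_interval)
    cluster.kill_memcached()
    if wait_until_done(cluster, poll_interval) != "failed":
        raise AssertionError("rebalance was expected to fail after the kill")

    # Step 3: retry the failed rebalance and report how it ended.
    cluster.start_rebalance()
    return wait_until_done(cluster, poll_interval)


class FakeCluster:
    """In-memory stand-in (hypothetical); retry_succeeds selects the outcome."""

    def __init__(self, retry_succeeds):
        self.retry_succeeds = retry_succeeds
        self.attempts = 0
        self.percent = 0
        self.killed = False
        self.status = "none"

    def start_rebalance(self):
        self.attempts += 1
        self.percent = 0
        self.killed = False
        self.status = "running"

    def kill_memcached(self):
        self.killed = True

    def progress(self):
        if self.status == "running":
            if self.killed or (self.attempts > 1 and not self.retry_succeeds):
                self.status = "failed"
            elif self.percent >= 100:
                self.status = "completed"
            else:
                self.percent += 10
        return self.status, self.percent
```

With `FakeCluster(retry_succeeds=True)` the driver returns "completed", which corresponds to the expected result; with `retry_succeeds=False` it returns "failed", which corresponds to the behaviour reported here.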

      Expected results
      Retry of the failed rebalance should succeed.

      Actual result
      Retry of the failed rebalance fails as shown below.

      node: ns_1@172.23.106.6
      code: 0
      shortText: message
      serverTime: 2019-08-02T05:55:31.160Z
      module: ns_orchestrator
      tstamp: 1564750531160
      type: critical
      text:

      Rebalance exited with reason {mover_crashed,
                                    {unexpected_exit,
                                     {'EXIT',<0.1693.36>,
                                      {{wait_seqno_persisted_failed,"bucket-2",906,
                                        0,
                                        [{'ns_1@172.23.106.8',
                                          {'EXIT',
                                           {{{child_interrupted,
                                              {'EXIT',<25253.18204.4>,
                                               socket_closed}},
                                             [{dcp_replicator,spawn_and_wait,1,
                                               [{file,"src/dcp_replicator.erl"},
                                                {line,249}]},
                                              {dcp_replicator,handle_call,3,
                                               [{file,"src/dcp_replicator.erl"},
                                                {line,127}]},
                                              {gen_server,try_handle_call,4,
                                               [{file,"gen_server.erl"},{line,636}]},
                                              {gen_server,handle_msg,6,
                                               [{file,"gen_server.erl"},{line,665}]},
                                              {proc_lib,init_p_do_apply,3,
                                               [{file,"proc_lib.erl"},{line,247}]}]},
                                            {gen_server,call,
                                             [{'janitor_agent-bucket-2',
                                               'ns_1@172.23.106.8'},
                                              {if_rebalance,<0.19815.34>,
                                               {wait_seqno_persisted,906,0}},
                                              infinity]}}}}]},
                                       [{ns_single_vbucket_mover,
                                         '-wait_seqno_persisted_many/5-fun-2-',5,
                                         [{file,"src/ns_single_vbucket_mover.erl"},
                                          {line,488}]},
                                        {proc_lib,init_p,3,
                                         [{file,"proc_lib.erl"},{line,232}]}]}}}}.
      Rebalance Operation Id = 05371dfb9070e86ed82e49c201b45e4d
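
For triage, the entry above can be reduced to the few fields that matter. The following is a minimal Python sketch, not part of the test suite. It parses an abbreviated copy of the entry in its raw Python-dict form (elided parts are marked with ...) and extracts the bucket, the two integers that follow it, the first failing node, and the operation ID. Reading 906 as the vBucket ID and 0 as the sequence number is an assumption based on the wait_seqno_persisted name; the function name and the regular expressions are hypothetical.

```python
import ast
import re

# Abbreviated copy of the log entry from this report; "..." marks elided text.
SAMPLE = (
    r"""{u'node': u'ns_1@172.23.106.6', u'code': 0, u'text': u'Rebalance exited """
    r"""with reason {mover_crashed,\n {unexpected_exit,\n {\'EXIT\',<0.1693.36>,\n """
    r"""{{wait_seqno_persisted_failed,"bucket-2",906,\n 0,\n """
    r"""[{\'ns_1@172.23.106.8\', ...}]}, ...}}}}.\n"""
    r"""Rebalance Operation Id = 05371dfb9070e86ed82e49c201b45e4d', """
    r"""u'module': u'ns_orchestrator', u'type': u'critical'}"""
)


def summarize_failure(entry_repr):
    """Extract the key fields from the repr of one rebalance log entry."""
    # The u'...' prefixes are still valid string literals in Python 3.
    entry = ast.literal_eval(entry_repr)
    text = entry["text"]

    failed = re.search(
        r'wait_seqno_persisted_failed,"([^"]+)",(\d+),\s*(\d+)', text
    )
    # First node name in the list of failed replies.
    node = re.search(r"\[\{'([^']+)'", text)
    op_id = re.search(r"Rebalance Operation Id = (\w+)", text)

    return {
        "reported_by": entry["node"],
        "bucket": failed.group(1) if failed else None,
        # Assumed meaning of the two integers: vBucket ID, then seqno.
        "vbucket": int(failed.group(2)) if failed else None,
        "seqno": int(failed.group(3)) if failed else None,
        "failed_node": node.group(1) if node else None,
        "operation_id": op_id.group(1) if op_id else None,
    }


print(summarize_failure(SAMPLE))
```

Under that reading, the summary says that the wait for persistence on bucket-2 failed on node ns_1@172.23.106.8, where the trace shows the DCP replicator exiting with socket_closed.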
      

      These tests start a swap rebalance, force it to fail by killing memcached, and then retry the failed rebalance.

      cbcollect_info attached.

            People

              Balakumaran.Gopal Balakumaran Gopal