Couchbase Server / MB-11944

{DCP}:: Rebalance retry exited due to bad replicators after memcached was killed during the first swap rebalance attempt

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 3.0
    • 3.0
    • ns_server
    • Security Level: Public
    • None
    • 1:10.3.5.115
      2:10.3.5.116
      3:10.3.5.117
      4:10.3.5.118
      5:10.6.2.185
      6:10.6.2.186
      7:10.5.3.5

    Description

      3.0.0-1139, occurred on both CentOS 6.x and Ubuntu 12.04. We did not see this issue in build 1105 (Beta Refresh).

      Test Case:: ./testrunner -i centos.ini -t swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance,replica=1,num-buckets=4,num-swap=2,swap-orchestrator=True,percentage_progress=30,skip_cleanup=True,GROUP=P0
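
      The -t argument packs the test path and its parameters into one comma-separated string. As a reading aid, here is a minimal Python sketch that splits such a spec into a test path and a parameter dictionary. Note that parse_test_spec is a hypothetical helper written for this note, not testrunner's own parser, and it keeps every value as a string.

```python
def parse_test_spec(spec):
    """Split 'module.Class.test,key=value,...' into (test_path, params)."""
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        params[key] = value  # values are kept as strings, no type conversion
    return test_path, params


spec = ("swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance,"
        "replica=1,num-buckets=4,num-swap=2,swap-orchestrator=True,"
        "percentage_progress=30,skip_cleanup=True,GROUP=P0")
test_path, params = parse_test_spec(spec)
print(test_path)
print(params["num-swap"], params["percentage_progress"])
```

      For this test, the parameters that shape the scenario are num-buckets=4, num-swap=2, swap-orchestrator=True and percentage_progress=30.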

      1. Create a 3-node cluster
      2. Create 4 default buckets with 1000k items
      3. Rebalance in 2 nodes and rebalance out 2 nodes, with mutations running in parallel
      4. During the rebalance in step 3, kill memcached on one of the nodes to force the rebalance to exit
      5. After a while, restart the rebalance as in step 3

      After step 4, we saw a "Bad replicators" message in the logs, and the retried rebalance in step 5 then exited with reason bad_replicas.

      MESSAGE FROM LOG

      Bad replicators after rebalance:
      Missing = [{'ns_1@10.6.2.185','ns_1@10.3.5.118',6}]
      Extras = []

      Event Module Code Server Node Time
      Rebalance exited with reason bad_replicas (repeated 1 times) ns_orchestrator002 ns_1@10.3.5.115 22:00:27 - Tue Aug 12, 2014
      Bad replicators after rebalance: Missing = [{'ns_1@10.6.2.185','ns_1@10.3.5.118',6}] Extras = [] ns_rebalancer002 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
      Bucket "bucket-3" rebalance appears to be swap rebalance ns_vbucket_mover000 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
      Started rebalancing bucket bucket-3 ns_rebalancer000 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
      Starting rebalance, KeepNodes = ['ns_1@10.3.5.118','ns_1@10.3.5.116','ns_1@10.6.2.185'], EjectNodes = ['ns_1@10.3.5.115','ns_1@10.3.5.117'], Failed over and being ejected nodes = []; no delta recovery nodes ns_orchestrator004 ns_1@10.3.5.115 22:00:06 - Tue Aug 12, 2014
      Control connection to memcached on 'ns_1@10.3.5.118' disconnected: {badmatch,{error,closed}} (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
      Bucket "bucket-3" loaded on node 'ns_1@10.3.5.118' in 0 seconds. (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
      Bucket "bucket-2" loaded on node 'ns_1@10.3.5.118' in 0 seconds. (repeated 1 times) ns_memcached000 ns_1@10.3.5.118 21:59:51 - Tue Aug 12, 2014
      Shutting down bucket "bucket-2" on 'ns_1@10.3.5.115' for deletion ns_memcached000 ns_1@10.3.5.115 21:59:34 - Tue Aug 12, 2014
      Shutting down bucket "bucket-2" on 'ns_1@10.3.5.117' for deletion ns_memcached000 ns_1@10.3.5.117 21:59:34 - Tue Aug 12, 2014
      Rebalance exited with reason bad_replicas ns_orchestrator002 ns_1@10.3.5.115 21:59:34 - Tue Aug 12, 2014
      Bad replicators after rebalance:
      Missing = [
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',23}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',24}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',25},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',30}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',35}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',36},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',37}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',44}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',45},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',46}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',47}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',48},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',50}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',51}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',52},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',53}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',54}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',55},
      {'ns_1@10.3.5.116','ns_1@10.3.5.118',113}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',114}, {'ns_1@10.3.5.116','ns_1@10.3.5.118',115},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',41}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',42}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',43},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',49}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',67}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',68},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',69}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',70}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',71},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',101}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',102}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',103},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',106}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',107}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',108},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',116}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',117}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',118},
      {'ns_1@10.3.5.118','ns_1@10.3.5.116',122}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',123}, {'ns_1@10.3.5.118','ns_1@10.3.5.116',124},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',38}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',39}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',40},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',62}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',63}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',64},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',65}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',66}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',72},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',73}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',80}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',81},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',82}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',89}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',90},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',91}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',98}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',99},
      {'ns_1@10.3.5.118','ns_1@10.6.2.185',100}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',104}, {'ns_1@10.3.5.118','ns_1@10.6.2.185',105},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',4}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',5}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',6},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',9}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',10}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',11},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',12}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',13}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',14},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',15}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',16}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',17},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',18}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',26}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',27},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',28}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',29}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',31},
      {'ns_1@10.6.2.185','ns_1@10.3.5.118',110}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',111}, {'ns_1@10.6.2.185','ns_1@10.3.5.118',112}
      ]
      Extras = []
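
      To make the long Missing list easier to scan, the tuples can be grouped by node pair. What follows is a minimal Python sketch, assuming each entry is printed as {Node, Node, VBucket} in the form shown in the log. The report does not say which role each node field plays, so the sketch only calls them first and second. group_missing is a hypothetical helper, and the sample holds a few tuples copied from the log rather than the full list.

```python
import re
from collections import defaultdict

# Matches one entry of the form {'node_a','node_b',123}
TUPLE_RE = re.compile(r"\{'([^']+)','([^']+)',(\d+)\}")


def group_missing(text):
    """Group vBucket ids by the (first, second) node pair found in the text."""
    groups = defaultdict(list)
    for first, second, vbucket in TUPLE_RE.findall(text):
        groups[(first, second)].append(int(vbucket))
    return dict(groups)


sample = """
Missing = [{'ns_1@10.3.5.116','ns_1@10.3.5.118',23},
 {'ns_1@10.3.5.118','ns_1@10.3.5.116',41},
 {'ns_1@10.3.5.118','ns_1@10.6.2.185',38},
 {'ns_1@10.6.2.185','ns_1@10.3.5.118',4},
 {'ns_1@10.6.2.185','ns_1@10.3.5.118',5}]
"""
groups = group_missing(sample)
for pair, vbuckets in sorted(groups.items()):
    print(pair, vbuckets)

# Nodes that take part in every missing replicator pair.
common = set.intersection(*(set(pair) for pair in groups))
print(common)
```

      Applied to the full list, every tuple includes ns_1@10.3.5.118, the node whose memcached control connection is reported as disconnected at 21:59:51.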

          People

            mikew Mike Wiederhold [X] (Inactive)
            parag Parag Agarwal (Inactive)