Couchbase Server / MB-11819

XDCR: Rebalance at destination hangs, missing replica items

Details

    • Untriaged
    • Unknown

    Description

      Build
      -------
      3.0.0-1014

      Scenario
      ------------
      1. Set up unidirectional XDCR between two 2-node clusters, default bucket
      2. Load 30K items on source
      3. Pause XDCR
      4. Start a rebalance-out of one node from each cluster simultaneously
      5. Resume XDCR

      Rebalance at the source proceeds to completion, but rebalance at the destination hangs at about 10% (10.0911458333 % in the log below):

      [2014-07-24 13:27:05,728] - [xdcrbasetests:642] INFO - Starting rebalance-out nodes:['172.23.106.46'] at cluster 172.23.106.45
      [2014-07-24 13:27:05,760] - [xdcrbasetests:642] INFO - Starting rebalance-out nodes:['172.23.106.48'] at cluster 172.23.106.47
      [2014-07-24 13:27:06,806] - [rest_client:1216] INFO - rebalance percentage : 0 %
      [2014-07-24 13:27:06,816] - [rest_client:1216] INFO - rebalance percentage : 0 %
      [2014-07-24 13:27:13,183] - [pauseResumeXDCR:331] INFO - Waiting for rebalance to complete...
      [2014-07-24 13:27:17,174] - [rest_client:1216] INFO - rebalance percentage : 24.21875 %
      [2014-07-24 13:27:17,181] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:27:27,201] - [rest_client:1216] INFO - rebalance percentage : 33.59375 %
      [2014-07-24 13:27:27,207] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:27:37,233] - [rest_client:1216] INFO - rebalance percentage : 41.9921875 %
      [2014-07-24 13:27:37,242] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:27:47,263] - [rest_client:1216] INFO - rebalance percentage : 53.90625 %
      [2014-07-24 13:27:47,272] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:27:57,294] - [rest_client:1216] INFO - rebalance percentage : 60.8723958333 %
      [2014-07-24 13:27:57,304] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:28:07,325] - [rest_client:1216] INFO - rebalance percentage : 100 %
      [2014-07-24 13:28:30,222] - [task:411] INFO - rebalancing was completed with progress: 100% in 83.475001812 sec
      [2014-07-24 13:28:30,223] - [pauseResumeXDCR:331] INFO - Waiting for rebalance to complete...
      [2014-07-24 13:28:30,229] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:28:40,252] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:28:50,280] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:00,301] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:10,342] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:20,363] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:30,389] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:40,410] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:29:50,437] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:00,458] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:10,480] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:20,504] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:30,523] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:40,546] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
      [2014-07-24 13:30:50,569] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
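The stall in the log above can be spotted mechanically. The following is a minimal sketch, not part of testrunner: the function name `find_stalled_progress` and the `min_repeats` threshold are hypothetical, and it assumes the log line format shown above. Because the polls from both clusters are interleaved without a cluster label, it counts how often each progress value is reported rather than tracking each cluster separately.

```python
import re
from collections import Counter

# Matches testrunner lines such as:
# [...] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %
PROGRESS_RE = re.compile(r"rebalance percentage : (\d+(?:\.\d+)?) %")


def find_stalled_progress(lines, min_repeats=5):
    """Return progress values below 100 that were reported at least min_repeats times."""
    counts = Counter()
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            # Key on the literal text so identical reports compare equal exactly.
            counts[match.group(1)] += 1
    return sorted(
        (value for value, seen in counts.items()
         if seen >= min_repeats and float(value) < 100.0),
        key=float,
    )


# A few lines copied from the log above.
SAMPLE = [
    "[2014-07-24 13:27:17,174] - [rest_client:1216] INFO - rebalance percentage : 24.21875 %",
    "[2014-07-24 13:27:17,181] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %",
    "[2014-07-24 13:27:27,201] - [rest_client:1216] INFO - rebalance percentage : 33.59375 %",
    "[2014-07-24 13:27:27,207] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %",
    "[2014-07-24 13:27:37,242] - [rest_client:1216] INFO - rebalance percentage : 10.0911458333 %",
    "[2014-07-24 13:28:07,325] - [rest_client:1216] INFO - rebalance percentage : 100 %",
]

print(find_stalled_progress(SAMPLE, min_repeats=3))
```

A progress value that repeats across many polls while staying below 100 is only a hint of a hang; a slow but healthy rebalance with coarse progress reporting could trigger it as well.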

      Testcase
      --------------
      ./testrunner -i uni-xdcr.ini -t xdcr.pauseResumeXDCR.PauseResumeTest.replication_with_pause_and_resume,items=30000,rdirection=unidirection,ctopology=chain,replication_type=xmem,rebalance_out=source-destination,pause=source,GROUP=P1
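The `-t` argument packs the test path and its parameters into one comma-separated string. The sketch below splits it apart so the parameters are easier to read. `parse_test_spec` is a hypothetical helper written for this illustration, not testrunner's own parser, and it assumes every parameter has the `key=value` form seen in the command above.

```python
def parse_test_spec(spec):
    """Split 'module.Class.method,key=value,...' into (test_path, params)."""
    test_path, _, raw_params = spec.partition(",")
    params = {}
    for pair in raw_params.split(","):
        if not pair:
            continue  # tolerate a trailing comma or an empty parameter list
        key, _, value = pair.partition("=")
        params[key] = value  # values stay as strings; no type conversion
    return test_path, params


SPEC = (
    "xdcr.pauseResumeXDCR.PauseResumeTest.replication_with_pause_and_resume,"
    "items=30000,rdirection=unidirection,ctopology=chain,replication_type=xmem,"
    "rebalance_out=source-destination,pause=source,GROUP=P1"
)

test_path, params = parse_test_spec(SPEC)
print(test_path)
for key, value in params.items():
    print(f"  {key} = {value}")
```

Read this way, `rebalance_out=source-destination` and `pause=source` correspond to steps 3 and 4 of the scenario.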

      Could the rebalance hang explain the missing replica items?

      [2014-07-24 13:31:49,079] - [task:463] INFO - Saw curr_items 30000 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket
      [2014-07-24 13:31:49,103] - [data_helper:289] INFO - creating direct client 172.23.106.47:11210 default
      [2014-07-24 13:31:49,343] - [data_helper:289] INFO - creating direct client 172.23.106.48:11210 default
      [2014-07-24 13:31:49,536] - [task:463] INFO - Saw vb_active_curr_items 30000 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket
      [2014-07-24 13:31:49,559] - [data_helper:289] INFO - creating direct client 172.23.106.47:11210 default
      [2014-07-24 13:31:49,811] - [data_helper:289] INFO - creating direct client 172.23.106.48:11210 default
      [2014-07-24 13:31:50,001] - [task:459] WARNING - Not Ready: vb_replica_curr_items 27700 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
      [2014-07-24 13:31:55,045] - [task:459] WARNING - Not Ready: vb_replica_curr_items 27700 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
      [2014-07-24 13:32:00,080] - [task:459] WARNING - Not Ready: vb_replica_curr_items 27700 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
      [2014-07-24 13:32:05,113] - [task:459] WARNING - Not Ready: vb_replica_curr_items 27700 == 30000 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
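To quantify the gap reported above, the stat lines can be parsed and the shortfall computed. This is a minimal sketch that assumes the `Saw` / `Not Ready:` line format shown in the log; `stat_shortfall` is a hypothetical helper, not part of testrunner.

```python
import re

# Matches both "Saw <stat> <actual> == <expected> expected" and
# "Not Ready: <stat> <actual> == <expected> expected".
STAT_RE = re.compile(r"(?:Saw|Not Ready:) (\w+) (\d+) == (\d+) expected")


def stat_shortfall(line):
    """Return (stat_name, expected - actual), or None if the line has no stat check."""
    match = STAT_RE.search(line)
    if match is None:
        return None
    name = match.group(1)
    actual = int(match.group(2))
    expected = int(match.group(3))
    return name, expected - actual


LINES = [
    "[2014-07-24 13:31:49,536] - [task:463] INFO - Saw vb_active_curr_items 30000 == 30000 "
    "expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket",
    "[2014-07-24 13:31:50,001] - [task:459] WARNING - Not Ready: vb_replica_curr_items 27700 == 30000 "
    "expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket",
]

for line in LINES:
    print(stat_shortfall(line))
```

For the log above this gives a shortfall of 0 active items and 2300 replica items (30000 - 27700), so all active items are present and only replicas are missing.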

      Logs
      -------------
      Will attach cbcollect logs with XDCR trace logging enabled.

            People

              apiravi Aruna Piravi (Inactive)