Couchbase Server / MB-49095

[magma, 10TB, 1%] XDCR replication is not progressing while the src cluster is idle. Items in dstn are more than in src.

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 7.1.0
    • 7.1.0
    • XDCR
    • 7.1.0-1460

    Description

      Steps:
      1. Create a 3 node cluster
      2. Create buckets and 50 collections.
      3. Create 40000000 items in each collection:

      Read Start: 0
      Read End: 0
      Update Start: 0
      Update End: 0
      Expiry Start: 0
      Expiry End: 0
      Delete Start: 0
      Delete End: 0
      Create Start: 0
      Create End: 40000000
      Final Start: 0
      Final End: 40000000
      

      4. Update 40000000 keys to create 50 percent fragmentation:

      Read Start: 0
      Read End: 0
      Update Start: 0
      Update End: 40000000
      Expiry Start: 0
      Expiry End: 0
      Delete Start: 0
      Delete End: 0
      Create Start: 0
      Create End: 0
      Final Start: 0
      Final End: 40000000
      

      5. Create another 40000000 items:

      Read Start: 0
      Read End: 0
      Update Start: 0
      Update End: 0
      Expiry Start: 0
      Expiry End: 0
      Delete Start: 0
      Delete End: 0
      Create Start: 40000000
      Create End: 80000000
      Final Start: 40000000
      Final End: 80000000
      

      6. Update 40000000 keys (created in step 5) to maintain 50 percent fragmentation:

      Read Start: 0
      Read End: 0
      Update Start: 40000000
      Update End: 80000000
      Expiry Start: 0
      Expiry End: 0
      Delete Start: 0
      Delete End: 0
      Create Start: 0
      Create End: 0
      Final Start: 40000000
      Final End: 80000000
      

      7. Start ASYNC load:

      Read Start: 0
      Read End: 40000000
      Update Start: 0
      Update End: 40000000
      Expiry Start: 0
      Expiry End: 0
      Delete Start: 40000000
      Delete End: 80000000
      Create Start: 80000000
      Create End: 120000000
      Final Start: 80000000
      Final End: 120000000
      

      8. Rebalance IN with Loading of docs in step 7
      9. Rebalance OUT with Loading of docs in step 7
      10. Rebalance SWAP with Loading of docs in step 7
      11. Rebalance IN/OUT with Loading of docs in step 7
      12. Rebalance OUT/IN with Loading of docs in step 7
      13. Validate all docs mutated in step 7. All is well up to this point.
      14. Repeat steps 7-13. After 3 iterations, add an XDCR dstn cluster.
      15. Repeat steps 7-13 with XDCR connected.
      16. All rebalances passed. 2 more iterations passed.
      17. Stopped the test and checked the xdcr destination cluster.
      18. The XDCR destination cluster holds more items than the src cluster, possibly because 27B mutations are still pending. However, no item replication is running any more and the pending mutation count stays at 27B.
      src = 5 collections * 80M in each collection = 400M items
      dstn = 595,084,885
      19. Another observation: while there are no mutations on the src cluster, the remaining XDCR mutations keep increasing on their own.
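
      The item counts quoted in step 18 can be cross-checked against the step 7 load spec. The sketch below is only an illustration: it assumes the Start/End values are half-open key ranges per collection, that a single pass of step 7 runs against the 80M keys left by steps 3 and 5, and it uses the 5 collections quoted in step 18. The helper names are hypothetical and not part of any test framework.

```python
# Rough cross-check of the counts in step 18.
# Assumptions (not confirmed by the ticket): ranges are half-open
# [start, end) per collection, and one pass of step 7 is applied.

def overlap(a, b):
    """Length of the intersection of two half-open ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def expected_live_items(existing, create, delete):
    """Live keys after creating new keys and deleting existing ones."""
    created = create[1] - create[0]
    deleted = overlap(existing, delete)  # only existing keys can be deleted
    return (existing[1] - existing[0]) + created - deleted

existing = (0, 80_000_000)          # keys created in steps 3 and 5
create = (80_000_000, 120_000_000)  # step 7 "Create Start/End"
delete = (40_000_000, 80_000_000)   # step 7 "Delete Start/End"

per_collection = expected_live_items(existing, create, delete)
src_total = 5 * per_collection      # 5 collections, as quoted in step 18
dstn_total = 595_084_885            # dstn count from step 18
surplus = dstn_total - src_total

print(per_collection, src_total, surplus)
```

      Under these assumptions the src side works out to 80M items per collection and 400M in total, matching step 18, and the dstn cluster holds 195,084,885 more items than src.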

      Error seen in XDCR:

      2021-10-22 02:51:57 172.23.110.67:genericPipeline.RunP2PProtocol:Execution timed out
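
      When scanning for repeats of this error, it can help to split the line into fields. The sketch below assumes a `timestamp node:source:message` layout, inferred from this single line only; the real log format may differ, and `parse_error_line` is a hypothetical helper.

```python
import re
from datetime import datetime

# Error line as quoted in this ticket.
LINE = ("2021-10-22 02:51:57 172.23.110.67:"
        "genericPipeline.RunP2PProtocol:Execution timed out")

# Assumed layout: "<date> <time> <ipv4>:<source>:<message>".
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<node>\d{1,3}(?:\.\d{1,3}){3}):"
    r"(?P<source>[^:]+):"
    r"(?P<message>.*)$"
)

def parse_error_line(line):
    """Return the line's fields as a dict, or None if the layout differs."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts"] = datetime.strptime(fields["ts"], "%Y-%m-%d %H:%M:%S")
    return fields

parsed = parse_error_line(LINE)
print(parsed["node"], parsed["source"], parsed["message"])
```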
      

            People

              Assignee: Ritesh Agarwal (ritesh.agarwal)
              Reporter: Ritesh Agarwal (ritesh.agarwal)