  1. Couchbase Server
  2. MB-31567

5.1.3 CLONE MB-31352 - xdcr replication hang

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 5.1.3
    • 5.5.2
    • XDCR
    • Triaged
    • No

    Description

      I was trying to set up XDCR replication from an in-house cluster to EC2. The following happened:

      1. Set up XDCR from a 4-node (2 data nodes) in-house cluster to a 16-node EC2 cluster (8 data nodes) for 1 bucket (msm).

      2. I didn't provide hostnames for the EC2 nodes initially. That caused the UI to take a long time to respond. After it came back, I tried to delete the replication. There was no response at first, then the UI reported an error, and multiple attempts to delete the replication were unsuccessful.

      3. I restarted the goxdcr process on one data node in the source cluster, 172.23.97.37. This removed the XDCR replication from the UI.

      Restarted at:
      ns_1@172.23.97.37 6:43:52 PM Tue Sep 18, 2018

      4. I fixed the hostnames on the EC2 cluster and set up the XDCR replication on the source cluster again. This time it started replicating.

      5. After replicating 50% of the data, progress stopped. I then killed the goxdcr process on the 2nd data node in the source cluster, and that kicked off replication of the remaining 50% of the data.

      Message in logs before restart:

      2018-09-19T04:09:59.411-07:00 INFO GOXDCR.PipelineMgr: Replication Status = map[a8da6785a5cce7dc20c1f861ba93a500/msm/msm:name={a8da6785a5cce7dc20c1f861ba93a500/msm/msm}, status={Pending}, errors={[]}, progress={Pipeline has been stopped}
      

      Restarted at:
      ns_1@172.23.97.38 12:19:26 PM Wed Sep 19, 2018
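
      When scanning goxdcr logs for this symptom, the "Replication Status" line quoted above can be picked apart mechanically. The following is a minimal sketch, not part of goxdcr: the function names and the "stopped" heuristic are hypothetical, and the pattern assumes field values contain no "}" characters, as in the quoted line.

```python
import re

# One replication entry inside a "Replication Status = map[...]" log line.
# Assumption: values inside {...} never contain "}", as in the quoted line.
ENTRY = re.compile(
    r"name=\{(?P<name>[^}]*)\}, "
    r"status=\{(?P<status>[^}]*)\}, "
    r"errors=\{(?P<errors>[^}]*)\}, "
    r"progress=\{(?P<progress>[^}]*)\}"
)


def parse_replication_status(line):
    """Return one dict per replication entry found in the log line."""
    if "Replication Status" not in line:
        return []
    return [m.groupdict() for m in ENTRY.finditer(line)]


def stalled(entries):
    """Hypothetical heuristic: names of entries whose progress mentions 'stopped'."""
    return [e["name"] for e in entries if "stopped" in e["progress"].lower()]


# The log line from this report, copied verbatim.
SAMPLE = (
    "2018-09-19T04:09:59.411-07:00 INFO GOXDCR.PipelineMgr: Replication Status = "
    "map[a8da6785a5cce7dc20c1f861ba93a500/msm/msm:"
    "name={a8da6785a5cce7dc20c1f861ba93a500/msm/msm}, status={Pending}, "
    "errors={[]}, progress={Pipeline has been stopped}"
)

entries = parse_replication_status(SAMPLE)
```

      For the line in this report, `entries` holds a single entry with status `Pending` and an empty error list, which matches the observed hang: the pipeline is stopped but no error is recorded.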

      Source Cluster Logs:
      https://s3.amazonaws.com/cb-customers/deepkaran/collectinfo-2018-09-19T192037-ns_1%40172.23.97.37.zip
      https://s3.amazonaws.com/cb-customers/deepkaran/collectinfo-2018-09-19T192037-ns_1%40172.23.97.38.zip
      https://s3.amazonaws.com/cb-customers/deepkaran/collectinfo-2018-09-19T192037-ns_1%40172.23.97.39.zip
      https://s3.amazonaws.com/cb-customers/deepkaran/collectinfo-2018-09-19T192037-ns_1%40172.23.97.40.zip

      Let me know if you need destination cluster logs as well.
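
      For the failed delete attempts in step 2, one thing to try besides the UI is the REST endpoint `DELETE /controller/cancelXDCR/<replication_id>` on the source cluster. The sketch below only shows how the request could be formed; the helper names and credentials are hypothetical, port 8091 is assumed to be the admin port, and whether the call would have succeeded while goxdcr was unresponsive is not known from this report. The replication ID contains slashes, so it has to be percent-encoded.

```python
import base64
import urllib.request
from urllib.parse import quote


def cancel_xdcr_url(host, replication_id, port=8091):
    """Build the cancelXDCR URL; slashes in the ID are encoded as %2F."""
    encoded_id = quote(replication_id, safe="")
    return "http://{}:{}/controller/cancelXDCR/{}".format(host, port, encoded_id)


def cancel_xdcr(host, replication_id, user, password, timeout=30):
    """Send the DELETE request with basic auth and return the HTTP status code.

    The timeout keeps the caller from blocking indefinitely if the node
    does not answer.
    """
    req = urllib.request.Request(cancel_xdcr_url(host, replication_id), method="DELETE")
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Replication ID taken from the log line in this report.
url = cancel_xdcr_url("172.23.97.37", "a8da6785a5cce7dc20c1f861ba93a500/msm/msm")
# To actually send it (placeholder credentials):
# cancel_xdcr("172.23.97.37", "a8da6785a5cce7dc20c1f861ba93a500/msm/msm", "Administrator", "password")
```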

            People

              yu Yu Sui (Inactive)
              jliang John Liang