  1. Couchbase Server
  2. MB-19832

Prevent XDCR from going into an infinite loop when a node joins the cluster

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 3.1.6, 4.1.2, 4.5.1, 5.0.0
    • Affects Version/s: 3.1.3
    • Component/s: ns_server
    • Labels: None
    • Triage: Untriaged
    • Unknown

    Description

      As seen by a customer.

      Reproduced by modifying the code so that it stores checkpoints at the start of each replication.
      The following sequence sends the xdcr replicator into an infinite loop:
      1. Some data for a vbucket is replicated.
      2. A checkpoint is created with the latest seqno and UUID.
      3. The vbucket becomes a replica (another node is added to the cluster).
      4. The stream closes with status=2; dcp_notifier still remembers the last seqno and UUID.
      5. The xdcr replicator pings the notifier several times without creating a stream, because the seqno and UUID match.
      6. It does not loop forever at this point, because the dcp notifier is eventually terminated due to the vbucket becoming a replica.
      7. The vbucket becomes active (the added node is removed from the cluster).
      8. An xdcr replicator is created; it reads the seqno and UUID from the checkpoint.
      9. The xdcr replicator calls the dcp notifier and gets an immediate reply, since the stream is closed and the seqno and UUID match.
      10. It wakes up, gets no data, and calls the dcp notifier again with the same seqno and UUID.
      11. Infinite loop: steps 9 and 10 repeat indefinitely.
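
      The loop in steps 9 to 11 can be illustrated with a small toy model. This is only a sketch and not ns_server code: `ToyNotifier`, `Checkpoint`, `run_replicator`, and the `max_wakeups` cap are hypothetical names introduced here, and the notifier is reduced to a single assumption taken from the steps above, namely that it replies immediately when the stream is closed and the requested seqno and UUID match what it remembers.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """Position stored by the replicator (step 2). Hypothetical type."""
    seqno: int
    uuid: int


class ToyNotifier:
    """Hypothetical stand-in for the dcp notifier; not the real implementation."""

    def __init__(self, last_seqno: int, last_uuid: int, stream_closed: bool):
        self.last_seqno = last_seqno
        self.last_uuid = last_uuid
        self.stream_closed = stream_closed

    def replies_immediately(self, seqno: int, uuid: int) -> bool:
        # Assumed behaviour from step 9: a closed stream plus a matching
        # seqno and UUID yields an immediate reply. Any other case is
        # modelled as "the caller would wait for real data".
        return (
            self.stream_closed
            and seqno == self.last_seqno
            and uuid == self.last_uuid
        )


def run_replicator(notifier: ToyNotifier, checkpoint: Checkpoint,
                   max_wakeups: int) -> int:
    """Return how many times the replicator woke up without getting data.

    max_wakeups exists only so that this sketch terminates; the scenario
    in the report has no such bound.
    """
    wakeups = 0
    while wakeups < max_wakeups:
        if not notifier.replies_immediately(checkpoint.seqno, checkpoint.uuid):
            break  # the replicator would block here instead of spinning
        # Step 10: woken up, but no data arrives, so the checkpoint position
        # is unchanged and the next call uses the same seqno and UUID.
        wakeups += 1
    return wakeups


if __name__ == "__main__":
    stuck = ToyNotifier(last_seqno=100, last_uuid=0xABC, stream_closed=True)
    print(run_replicator(stuck, Checkpoint(100, 0xABC), max_wakeups=1000))
```

      In this model the replicator uses up the whole wakeup budget whenever the checkpoint matches the notifier's remembered position and the stream is closed, which corresponds to the busy loop described above. If either the position differs or the stream is open, the model reports zero empty wakeups.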

            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              artem Artem Stemkovski