  1. Couchbase Server
  2. MB-10179

XDCR checkpointing: data loss at destination in cases of destination bucket delete-recreate/flush/failover may be undetected for long periods of time


Details

    • Untriaged
    • Sprint 2 - March 11 - April 3

    Description

      Design limitation
      -------------------------
      XDCR currently has a limitation in the following cases:
      1. destination failover
      2. destination bucket flush
      3. destination bucket delete & recreate
      In these cases, items at the destination may not match those at the source. XDCR checkpointing is the mechanism that keeps track of destination vbucket state changes. However, checkpointing (even _pre_replicate) for a vbucket does not happen unless the replicator for that vbucket is active, even when the checkpoint is long overdue. A vbucket replicator becomes active only when the vbucket receives new mutations. As a result, unless all vbuckets receive new mutations, not all items will be replicated, and the destination remains missing data. Hence the data loss.
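
The behaviour described above can be illustrated with a toy model. This is only a sketch and not Couchbase code: the class names (DestVBucket, VBReplicator), the integer uuid, and the "restart from the beginning on uuid mismatch" rule are assumptions made for illustration. The one property taken from the description is that the destination state is checked only when a replicator wakes up, and a replicator wakes up only on a new mutation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DestVBucket:
    """Hypothetical destination vbucket: an identity plus its stored items."""
    uuid: int = 1
    items: dict = field(default_factory=dict)

    def flush(self):
        # Models flush / delete-recreate / failover: data is gone, identity changes.
        self.items.clear()
        self.uuid += 1


@dataclass
class VBReplicator:
    """Hypothetical per-vbucket replicator that only runs when woken by a mutation."""
    source: dict
    dest: DestVBucket
    checkpoint_uuid: Optional[int] = None
    replicated: set = field(default_factory=set)

    def on_mutation(self, key, value):
        # A new mutation is the only event that activates the replicator here.
        self.source[key] = value
        self.replicated.discard(key)
        self._wake()

    def _wake(self):
        # The checkpoint is validated only while the replicator is active.
        if self.checkpoint_uuid != self.dest.uuid:
            self.replicated.clear()  # checkpoint invalid: replicate everything again
            self.checkpoint_uuid = self.dest.uuid
        for key, value in self.source.items():
            if key not in self.replicated:
                self.dest.items[key] = value
                self.replicated.add(key)


def demo(num_vbuckets=4):
    """Return the destination item count per vbucket after a flush and one mutation."""
    reps = [VBReplicator(source={}, dest=DestVBucket()) for _ in range(num_vbuckets)]
    for i, rep in enumerate(reps):
        rep.on_mutation(f"doc{i}", "v1")   # initial load, one item per vbucket
    for rep in reps:
        rep.dest.flush()                   # destination loses all data
    reps[0].on_mutation("new", "v1")       # only vbucket 0 sees a new mutation
    return [len(rep.dest.items) for rep in reps]


if __name__ == "__main__":
    print(demo())
```

In this model only vbucket 0 notices the changed destination and replicates its items again. The other vbuckets stay empty at the destination until they receive a mutation of their own, which mirrors the undetected data loss described above.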

      Destination bucket recreate/flush
      --------------------------------------------------
      Refer to MB-10457

      Destination Failover
      -------------------------------

      I ran into this problem while working on cbrecovery 'negative' scenarios.

      But the test can be simplified:
      1) set up two clusters:
      cluster1: 10.3.4.144, 10.3.4.145, 10.3.4.146
      cluster2: 10.3.4.147, 10.3.4.148, 10.3.4.149
      2) load beer-sample on cluster1
      3) create beer-sample on cluster2
      4) create bidirectional XDCR replication cluster1<->cluster2 for bucket "beer-sample"
      5) fail over 2 nodes on cluster1 and rebalance

      Result:
      data is lost on cluster1 (about a third of the data)
      but bidirectional replication is still in place, so I expect the data to be restored on cluster1 from cluster2
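
The "about a third" figure is consistent with a simple count. This is a rough sketch under assumptions that the report does not state: the bucket keeps one replica per vbucket, and the (active, replica) placements are spread evenly over the three nodes. The node labels n1..n3 are hypothetical. A vbucket loses every copy only if both its active and its replica sit on the failed-over nodes.

```python
from fractions import Fraction
from itertools import permutations


def lost_fraction(nodes, failed):
    """Fraction of vbuckets whose active and single replica are both on failed nodes.

    Assumes one replica and that every ordered (active, replica) node pair
    holds the same number of vbuckets.
    """
    placements = list(permutations(nodes, 2))  # (active node, replica node)
    failed = set(failed)
    lost = [pair for pair in placements if set(pair) <= failed]
    return Fraction(len(lost), len(placements))


if __name__ == "__main__":
    print(lost_fraction(["n1", "n2", "n3"], ["n1", "n2"]))
```

With three nodes there are six (active, replica) placements, and two of them lie entirely on the two failed nodes, which gives one third under these assumptions.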

      I believe that we should be able to automatically recover items through replication in such scenarios.

            People

              apiravi Aruna Piravi (Inactive)
              andreibaranouski Andrei Baranouski