Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-6939

XDC queue grows and checkpoint commit failures in bi-directional XDCR with front-end workload

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 2.0
    • 2.0
    • XDCR
    • Security Level: Public
    • None
    • 2.0-1856
      Bidirectional replication
      1024 vbuckets
      EC2 centos

    Description

      • Setup a bidirectional replication between two 8:8 clusters on bucket b1.
      • Setup a small front end load on cluster1 and cluster2 , 4K op/sec and 6K ops/sec.
        [Load contains creates, updates, deletes]
      • For the first 40M items, the replication is working as expected, the replication lag is small.
      • Delete the replication from cluster2 to cluster1, recreate the replication.
        [ Expected behaviour - Stop/Start replication.]

      We expect that XDC will stop/start replication with the above step.
      The last committed checkpoint will be checked and replication will continue from the last commited checkpoint.

      Noticing a huge number of gets ~ 30K ops/sec and fewer sets - 2-3k ops/sec on the other cluster.

      -The XDC queue is continuously growing, from < 500k to nearly 7M over a period of 2-3 hours.

      • Seeing continous checkpoint_failures on both the XDC queues.

      The Disk write queue on cluster1, is high ~ 2-3M. The drain rate however is fairly small ~ 30K.

      The items are not drained fast enough and the disk-write-queue is getting filled up faster.

      Adding screenshots from both the clusters.

      The default values currently are -
      XDCR_CHECKPOINT_INTERVAL:300
      XDCR_CAPI_CHECKPOINT_TIMEOUT:10

      @Junyi: I ve stopped the front end load on both the clusters now and I have passed on the cluster access.
      Let me know if you need additional information.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            junyi Junyi Xie (Inactive)
            ketaki Ketaki Gangal (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty