Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-6662

XDC: worse performance with expired items

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: XDCR
    • Security Level: Public
    • Labels:
    • Environment:
      VMs, CentOS 6.2, 4-to-4 nodes, build 1723

      Description

      Unidir symptoms (expiration ratio: 3%, expiration time: 5 minutes):
      – extremely high XDC ops/sec and XDC gets/sec rate on destination side. At the same time actual replication rate (new items/sec) is very low.
      – XDC queue grows all the time on source side (unlike case w/o expired items)

      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        pavelpaulau Pavel Paulau created issue -
        steve Steve Yen made changes -
        Field Original Value New Value
        Priority Major [ 3 ] Blocker [ 1 ]
        junyi Junyi Xie (Inactive) made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Duplicate [ 3 ]
        ketaki Ketaki Gangal made changes -
        Resolution Duplicate [ 3 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Assignee Ketaki Gangal [ ketaki ] Junyi Xie [ junyi ]
        pavelpaulau Pavel Paulau made changes -
        Description Unidir symptoms (3% of items expire every 5 minutes):
        -- extremely high XDC ops/sec and XDC gets/sec rate on destination side. At the same time actual replication rate (new items/sec) is very low.
        -- XDC queue grows all the time on source side (unlike case w/o expired items)
        Unidir symptoms (expiration ratio: 3%, expiration time: 5 minutes):
        -- extremely high XDC ops/sec and XDC gets/sec rate on destination side. At the same time actual replication rate (new items/sec) is very low.
        -- XDC queue grows all the time on source side (unlike case w/o expired items)
        junyi Junyi Xie (Inactive) made changes -
        Priority Blocker [ 1 ] Critical [ 2 ]
        junyi Junyi Xie (Inactive) made changes -
        Assignee Junyi Xie [ junyi ] Ketaki Gangal [ ketaki ]
        pavelpaulau Pavel Paulau made changes -
        Attachment debug.tar.gz [ 15055 ]
        junyi Junyi Xie (Inactive) made changes -
        junyi Junyi Xie (Inactive) made changes -
        Priority Critical [ 2 ] Blocker [ 1 ]
        junyi Junyi Xie (Inactive) made changes -
        Assignee Ketaki Gangal [ ketaki ] Chiyoung Seo [ chiyoung ]
        junyi Junyi Xie (Inactive) made changes -
        Comment [ on VMs, set up a 4->4 unidirectional XDCR with about 8M items, on source side, loading it with 2-3K ops/sec using the same workload ( 3% of items expiration time: 5 minutes) in Pavel's test.

        Concurrent mcsoda data loading:

        lib/perf_engines/mcsoda.py 10.3.3.61:11211 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 ratio-expirations=0.03 expiration=300 ratio-deletes=0.04 min-value-size=2,3 max-items=20000000 exit-after-creates=0 prefix=XDCR_10
        lib/perf_engines/mcsoda.py 10.3.3.61:11211 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 ratio-expirations=0.03 expiration=300 ratio-deletes=0.04 min-value-size=2,3 max-items=20000000 exit-after-creates=0 prefix=XDCR_9


        Observation:
        1. on destination side, the replication rate is stable, around 1-3k which is similar to the testcase w/o expired items. See the screenshot attached.
        Need to figure out how to reproduce it.

        2. on the source side, the XDC queue slowly increase to > 1M items, apparently the replication speed is hard to always catch up with front-end ops.

        3. when compactor is running at dest, stats become choppy, not surprised, we know that.


        ]
        junyi Junyi Xie (Inactive) made changes -
        Comment [ Downgrade the priority until we can reproduce it. ]
        junyi Junyi Xie (Inactive) made changes -
        Comment [ Ketaki, please reproduce it on VM if possible, and point me to the clusters. Thanks. ]
        chiyoung Chiyoung Seo made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        pavelpaulau Pavel Paulau made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            chiyoung Chiyoung Seo
            Reporter:
            pavelpaulau Pavel Paulau
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes