MB-59850: [BP 7.2.4] - XDCR - make backfill pipeline idle detection more intelligent



    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version: 7.2.4
    • Affects Versions: 7.6.0, 7.1.4, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.2
    • Component: XDCR
    • Triage: Untriaged
    • 0
    • Yes


      XDCR's backfill pipeline was designed such that the following occurs:

      1. A backfill pipeline is created and asks for a set of VBs.
      2. Streams are created for that set of VBs and requested from KV DCP.
      3. Once a single VB out of the original set has finished, XDCR starts a timer.
      4. If all the originally requested VBs finish before the timer expires, all is well.
      5. If some of the originally requested VBs have not finished when the timer fires, the pipeline restarts with the unfinished set of VBs.

      Timer code: https://src.couchbase.org/source/xref/7.2.2/goproj/src/github.com/couchbase/goxdcr/service_impl/through_seqno_tracker_service.go#1121

      MB-57304 in 7.2.2 introduced a DCP backfill limit of 64 streams.

      As a result, only 64 VBs proceed at once while the rest wait. As soon as one VB of the first batch finishes, the timer starts, even though the remaining VBs may not have begun. The assumption that all VBs proceed at the same time no longer holds.

      The end result is that the timer is too aggressive.

      We should revisit the timeout and make it more intelligent than a blanket timer. For example, the timer could be reset whenever the number of finished VBs is still increasing.






              ayush.nayyar Ayush Nayyar
              sumukh.bhat Sumukh Bhat