Couchbase Server / MB-14228

GoXDCR: Should failure_restart_interval still be 30s by default?

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 4.0.0
    • Affects Version/s: 4.0.0
    • Component/s: XDCR
    • Security Level: Public

    Description

      failure_restart_interval defaults to 30s for all replications. This means that any time a pipeline breaks, XDCR waits 30s before it starts fixing it.

      XDCR then needs another 30s or more to fix the broken pipeline.

      So for every broken pipeline, we get more than 1 minute of no replication.

      Questions:
      1. Why should failure_restart_interval be tunable for every replication? Can it be 0 and made an internal setting?
      2. Why should XDCR not compute an exponential backoff when there is a second failure, instead of waiting for 30s?
      3. 30s was the default for Erlang XDCR (based on the concept of Erlang process crashes and restarts). Does the same still hold for goxdcr, where the process itself does not crash?

      In any case, despite XDCR crashes being very common in Erlang XDCR, I have seldom seen zero replication for a minute or more.

          People

            Assignee: apiravi Aruna Piravi (Inactive)
            Reporter: apiravi Aruna Piravi (Inactive)