Couchbase Server / MB-18133

GOXDCR migration service should not fail when it encounters errors while migrating remote cluster refs and replications


Details

    • Bug
    • Resolution: Fixed
    • Major
    • 4.1.1
    • 4.0.0
    • XDCR
    • Security Level: Public
    • None
    • Untriaged
    • Unknown

    Description

      When clusters are upgraded from 3.x to 4.x, the GOXDCR migration service is run to migrate remote cluster references and replication specifications to 4.x. If the remote cluster for a remote cluster reference is unreachable, GOXDCR gets errors when trying to migrate that reference and its related replications. In the current implementation, when such errors are encountered, the migration service fails and is re-run by ns_server, in the hope that the errors are temporary and that migration will succeed on the next attempt. This approach does not work well when the errors are permanent. For example, if some remote clusters have become invalid, the migration service fails repeatedly. As a result, GOXDCR functionality becomes inaccessible.

      The only temporary issue that could occur during migration is that a remote cluster may become temporarily inaccessible. To guard against this, the migration service can retry remote cluster creation multiple times. Other errors during migration should not be retried.
      The migration service should not fail (i.e., return an error status to ns_server to get itself restarted) when it gets such errors. It should simply log the errors, return success status, and let the system proceed.

            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              yu Yu Sui (Inactive)