Couchbase Kubernetes / K8S-2938

Managed XDCR should remove unknown replications

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 2.5.0
    • None
    • operator
    • None
    • 3 - Uk Sprinting
    • 0

    Description

      Currently, XDCR reconciliation fails if a user has manually set up replications with xdcr.managed=false and then enables managed XDCR.

      The Operator fetches the list of known XDCR replications and tries to look each one up in its local cache. The lookup fails because these replications were never created by the Operator and were therefore never cached. The error we see is:

      {"level":"info","ts":1670226478.9425159,"logger":"cluster","msg":"Reconciliation failed","cluster":"tmobile/ttncbk8s","error":"timeout: key error: key xdcr-connection-ttncb-bm-hostname doesn't exist","stack":"github.com/couchbase/couchbase-operator/pkg/cluster/persistence.(*persistentStorageImpl).Get.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/persistence/persistence.go:327\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.Retry\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:14\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.RetryFor\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:30\ngithub.com/couchbase/couchbase-operator/pkg/cluster/persistence.(*persistentStorageImpl).Get\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/persistence/persistence.go:335\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).getPersistentXDCRData\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:432\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).listRemoteClusters\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:482\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileXDCR\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:729  

      • The stack trace shows the Operator listing remote clusters (listRemoteClusters).
      • That function returns the list of replications that Couchbase Server knows about.
      • The Operator then tries to fetch the replication from the persistence secret (getPersistentXDCRData) and fails.

       

      Instead, the Operator should remove these foreign replications and proceed with reconciliation without failing.

          People

            abhi.bose Abhi Bose
            tommie Tommie McAfee (Inactive)