Description
Currently, XDCR reconciliation fails if a user has manually set up replications with xdcr.managed=false and then enables managed XDCR.
The result is that the Operator fetches the list of known XDCR replications and tries to look them up in its local cache, but the lookup fails because these replications were never created by the Operator and thus were never cached. The error we see is:
{"level":"info","ts":1670226478.9425159,"logger":"cluster","msg":"Reconciliation failed","cluster":"tmobile/ttncbk8s","error":"timeout: key error: key xdcr-connection-ttncb-bm-hostname doesn't exist","stack":"github.com/couchbase/couchbase-operator/pkg/cluster/persistence.(*persistentStorageImpl).Get.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/persistence/persistence.go:327\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.Retry\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:14\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.RetryFor\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:30\ngithub.com/couchbase/couchbase-operator/pkg/cluster/persistence.(*persistentStorageImpl).Get\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/persistence/persistence.go:335\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).getPersistentXDCRData\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:432\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).listRemoteClusters\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:482\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileXDCR\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/xdcr_replication.go:729
- The stack trace shows that the Operator is listing remote clusters (listRemoteClusters).
- That function returns the list of replications that Couchbase Server knows about.
- The Operator then tries to fetch each replication from the persistence secret (getPersistentXDCRData) and fails because the key does not exist.
Expected behavior: the Operator should remove these foreign replications and proceed with reconciliation without failing.