Couchbase .NET client library / NCBC-3092

Data node DNS changes are not respected by the SDK

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: 3.2.7
    • Affects Version: 3.2.5
    • Component: library
    • Environment: Couchbase Server 6.5.1 Enterprise
      Couchbase Autonomous Operator 2.0.2 using PersistentVolumeClaims
      Couchbase .NET SDK 3.2.5
      .NET 6.0 on Linux (also within Kubernetes)
    • 1

    Description

      When performing a Kubernetes rolling upgrade, Kubernetes nodes are drained using `kubectl drain`. This deletes the pods running on the drained node, and the corresponding Couchbase nodes are then automatically failed over.

      After the failover is complete, the Autonomous Operator creates a new Kubernetes Pod with the same name, mounts the previous volumes, and adds it back to the cluster. The node is identified within Couchbase Server using a DNS name derived from the Pod name. For example, a pod named "couchbase-primary-0062" uses the DNS name "couchbase-primary-0062.couchbase-primary.default.svc".

      However, because a new Pod was created, it receives a different IP address than the original Pod had. Within the SDK, the node's name is resolved to an IP address only once, when the node is first added to the cluster. After that, the IP address is cached without regard to the DNS TTL. As a result, the SDK keeps using the stale address and consistently fails to reconnect to the node, and the application must be recycled.
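The difference between resolving once and resolving on every reconnect can be sketched as follows. This is a language-neutral illustration in Python, not the SDK's actual C# code and not the fix shipped for this issue. The class names, the dictionary standing in for cluster DNS, and both IP addresses are hypothetical.

```python
from typing import Callable, Dict

Resolver = Callable[[str], str]  # maps a hostname to an IP address


class CachedEndpoint:
    """Resolves the hostname once and reuses that IP forever (the reported behavior)."""

    def __init__(self, hostname: str, resolve: Resolver) -> None:
        self.hostname = hostname
        self._ip = resolve(hostname)  # resolved only when the node is first added

    def address_for_reconnect(self) -> str:
        return self._ip


class ReResolvingEndpoint:
    """Keeps only the hostname and resolves it again on every reconnect."""

    def __init__(self, hostname: str, resolve: Resolver) -> None:
        self.hostname = hostname
        self._resolve = resolve

    def address_for_reconnect(self) -> str:
        return self._resolve(self.hostname)


# Stand-in for cluster DNS. The IP addresses are made up for illustration.
host = "couchbase-primary-0062.couchbase-primary.default.svc"
dns: Dict[str, str] = {host: "10.0.1.15"}

cached = CachedEndpoint(host, dns.__getitem__)
fresh = ReResolvingEndpoint(host, dns.__getitem__)

# The Operator recreates the Pod: same DNS name, new IP address.
dns[host] = "10.0.2.27"

print(cached.address_for_reconnect())  # the address from before the Pod was recreated
print(fresh.address_for_reconnect())   # the address currently published in DNS
```

In this sketch only the second endpoint can reach the recreated Pod. A real client would more likely cache the result for a bounded time, such as the DNS TTL, rather than resolve on every attempt.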

      Logs from Couchbase Autonomous Operator during a Kubernetes upgrade:

      {"level":"info","ts":1642506529.9468436,"logger":"cluster","msg":"Pod down, waiting for auto-failover","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","recovery_in":29.835662744}
      {"level":"error","ts":1642506529.9468813,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/couchbase-primary","error":"waiting for pod failover","stacktrace":"github.com/couchbase/couchbase-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:370\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:387\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/controller/controller.go:86\ngithub.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
      {"level":"info","ts":1642506530.5100014,"logger":"cluster","msg":"External address collection failed","cluster":"default/couchbase-primary","name":"couchbase-primary-0062"}
      {"level":"info","ts":1642506530.932535,"logger":"couchbaseutil","msg":"Cluster status","cluster":"default/couchbase-primary","balance":"unbalanced","rebalancing":false}
      {"level":"info","ts":1642506530.932574,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0059","version":"enterprise-6.5.1","class":"query-1a-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.9325805,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0060","version":"enterprise-6.5.1","class":"query-1b-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.932585,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0061","version":"enterprise-6.5.1","class":"index-1a-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.932589,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","version":"enterprise-6.5.1","class":"data-1c-isolated","managed":true,"status":"failed"}
      {"level":"info","ts":1642506530.932593,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0063","version":"enterprise-6.5.1","class":"data-1b-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.9325972,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0064","version":"enterprise-6.5.1","class":"data-1a-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.932601,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0065","version":"enterprise-6.5.1","class":"index-1b-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.932605,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0066","version":"enterprise-6.5.1","class":"index-1c-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506530.9326172,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0064","class":"data-1a-isolated","group":"us-east-1a"}
      {"level":"info","ts":1642506530.932626,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0063","class":"data-1b-isolated","group":"us-east-1b"}
      {"level":"info","ts":1642506530.9326315,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0061","class":"index-1a-issolated","group":"us-east-1a"}
      {"level":"info","ts":1642506530.9326372,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0065","class":"index-1b-isolated","group":"us-east-1b"}
      {"level":"info","ts":1642506530.932642,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0066","class":"index-1c-isolated","group":"us-east-1c"}
      {"level":"info","ts":1642506530.9326475,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0059","class":"query-1a-issolated","group":"us-east-1a"}
      {"level":"info","ts":1642506530.9326563,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0060","class":"query-1b-issolated","group":"us-east-1b"}
      {"level":"info","ts":1642506532.6439211,"logger":"cluster","msg":"Pods failed over","cluster":"default/couchbase-primary"}
      {"level":"info","ts":1642506532.6530786,"logger":"cluster","msg":"Creating pod","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","image":"couchbase/server:enterprise-6.5.1"}
      {"level":"error","ts":1642506569.7394078,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/couchbase-primary","error":"recovering node http://couchbase-primary-0062.couchbase-primary.default.svc:8091","stacktrace":"github.com/couchbase/couchbase-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:370\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:387\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/pkg/controller/controller.go:86\ngithub.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/couchbase/jenkins/workspace/couchbase-operator-build/goproj/src/github.com/couchbase/couchbase-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
      {"level":"info","ts":1642506570.0897486,"logger":"couchbaseutil","msg":"Cluster status","cluster":"default/couchbase-primary","balance":"unbalanced","rebalancing":false}
      {"level":"info","ts":1642506570.0897832,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0059","version":"enterprise-6.5.1","class":"query-1a-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.08979,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0060","version":"enterprise-6.5.1","class":"query-1b-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.0897958,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0061","version":"enterprise-6.5.1","class":"index-1a-issolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.089801,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","version":"enterprise-6.5.1","class":"data-1c-isolated","managed":true,"status":"add_back"}
      {"level":"info","ts":1642506570.0898066,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0063","version":"enterprise-6.5.1","class":"data-1b-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.089811,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0064","version":"enterprise-6.5.1","class":"data-1a-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.0898151,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0065","version":"enterprise-6.5.1","class":"index-1b-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.0898192,"logger":"couchbaseutil","msg":"Node status","cluster":"default/couchbase-primary","name":"couchbase-primary-0066","version":"enterprise-6.5.1","class":"index-1c-isolated","managed":true,"status":"active"}
      {"level":"info","ts":1642506570.0898309,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0064","class":"data-1a-isolated","group":"us-east-1a"}
      {"level":"info","ts":1642506570.0898392,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0063","class":"data-1b-isolated","group":"us-east-1b"}
      {"level":"info","ts":1642506570.0898435,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","class":"data-1c-isolated","group":"us-east-1c"}
      {"level":"info","ts":1642506570.0898473,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0061","class":"index-1a-issolated","group":"us-east-1a"}
      {"level":"info","ts":1642506570.0898511,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0065","class":"index-1b-isolated","group":"us-east-1b"}
      {"level":"info","ts":1642506570.0898552,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0066","class":"index-1c-isolated","group":"us-east-1c"}
      {"level":"info","ts":1642506570.089859,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0059","class":"query-1a-issolated","group":"us-east-1a"}
      {"level":"info","ts":1642506570.089863,"logger":"scheduler","msg":"Scheduler status","cluster":"default/couchbase-primary","name":"couchbase-primary-0060","class":"query-1b-issolated","group":"us-east-1b"}
      {"level":"info","ts":1642506570.3508325,"logger":"cluster","msg":"Marking pod for delta recovery","cluster":"default/couchbase-primary","name":"couchbase-primary-0062"}
      {"level":"info","ts":1642506573.9273438,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506577.9309807,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506581.934769,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506585.9384718,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506589.9420898,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506593.9458678,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506597.9495485,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506601.9534369,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506605.9576228,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506609.9611921,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506613.9646652,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506617.9748719,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506621.978477,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0}
      {"level":"info","ts":1642506625.9887943,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":0.3045808966861605}
      {"level":"info","ts":1642506630.0065265,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":2.710769980506822}
      {"level":"info","ts":1642506634.0232415,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":5.025584795321638}
      {"level":"info","ts":1642506638.02851,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":7.27948343079922}
      {"level":"info","ts":1642506642.0334635,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":9.746588693957115}
      {"level":"info","ts":1642506646.0494561,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":11.66544834307992}
      {"level":"info","ts":1642506650.071835,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":14.31530214424951}
      {"level":"info","ts":1642506654.1107063,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":16.32553606237817}
      {"level":"info","ts":1642506658.1285944,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":18.82309941520468}
      {"level":"info","ts":1642506662.136291,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":20.83333333333333}
      {"level":"info","ts":1642506666.1712596,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":21.2602552169363}
      {"level":"info","ts":1642506670.1908245,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":22.23682144224081}
      {"level":"info","ts":1642506674.2057269,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":23.0440263838185}
      {"level":"info","ts":1642506678.21471,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":23.67010006624883}
      {"level":"info","ts":1642506682.2339375,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":24.86409294418187}
      {"level":"info","ts":1642506686.2450745,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":25.90760338822703}
      {"level":"info","ts":1642506690.2578008,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":26.77845441228642}
      {"level":"info","ts":1642506694.2666256,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":28.47133212279812}
      {"level":"info","ts":1642506698.28025,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":30.43778427550358}
      {"level":"info","ts":1642506702.284427,"logger":"couchbaseutil","msg":"Rebalancing","cluster":"default/couchbase-primary","progress":31.25}
      {"level":"info","ts":1642506706.3642836,"logger":"cluster","msg":"Rebalance completed successfully","cluster":"default/couchbase-primary"}
      {"level":"info","ts":1642506706.5230074,"logger":"cluster","msg":"Reconcile completed","cluster":"default/couchbase-primary"}
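
To read the timeline out of the Operator log above, the `ts` field (Unix seconds) of each entry can be compared with the first entry shown. The sketch below does this for six lines copied from that log. The `timeline` helper is hypothetical and exists only for this illustration. Measured from the first line shown, the delta-recovery marking comes roughly 40 seconds in and reconciliation completes a little under three minutes in.

```python
import json
from typing import List, Tuple

# Selected lines copied verbatim from the Operator log above.
LOG = """\
{"level":"info","ts":1642506529.9468436,"logger":"cluster","msg":"Pod down, waiting for auto-failover","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","recovery_in":29.835662744}
{"level":"info","ts":1642506532.6439211,"logger":"cluster","msg":"Pods failed over","cluster":"default/couchbase-primary"}
{"level":"info","ts":1642506532.6530786,"logger":"cluster","msg":"Creating pod","cluster":"default/couchbase-primary","name":"couchbase-primary-0062","image":"couchbase/server:enterprise-6.5.1"}
{"level":"info","ts":1642506570.3508325,"logger":"cluster","msg":"Marking pod for delta recovery","cluster":"default/couchbase-primary","name":"couchbase-primary-0062"}
{"level":"info","ts":1642506706.3642836,"logger":"cluster","msg":"Rebalance completed successfully","cluster":"default/couchbase-primary"}
{"level":"info","ts":1642506706.5230074,"logger":"cluster","msg":"Reconcile completed","cluster":"default/couchbase-primary"}
"""


def timeline(log_text: str) -> List[Tuple[float, str]]:
    """Return (seconds since the first entry, message) for each JSON log line."""
    entries = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    start = entries[0]["ts"]
    return [(entry["ts"] - start, entry["msg"]) for entry in entries]


for offset, msg in timeline(LOG):
    print(f"+{offset:6.1f}s  {msg}")
```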
      

      Example logs from an application are also attached, filtered to the Couchbase SourceContext. Individual operations are failing with AmbiguousTimeoutException and UnambiguousTimeoutException.

          People

            btburnett3 Brant Burnett
            btburnett3 Brant Burnett