  1. Couchbase Kubernetes
  2. K8S-2683

Operator cannot Reconcile successfully when CB node is added via UI.



    Description

      Steps to Reproduce:

      1. Create the Operator and DAC pods with 2.3.0 build 279.
      2. Create two Couchbase clusters, cb-example and cb-example-new, with all services.
      3. cb-example is a 1-node cluster running 6.6.2.
      4. cb-example-new is a 1-node cluster running 7.0.3.
      5. Status of the cluster:

        Prateeks-MacBook-Pro:Downloads prateekkumar$ kubectl get pods
        NAME                                            READY   STATUS    RESTARTS   AGE
        cb-example-0000                                 1/1     Running   0          5m6s
        cb-example-new-0000                             1/1     Running   0          5m6s
        couchbase-operator-5b4cb9f599-dzfbg             1/1     Running   0          6m54s
        couchbase-operator-admission-65469748f6-jd99w   1/1     Running   0          133m 

      6. Port-forward the cb-example-0000 pod to access the Couchbase UI.
      7. Click the Add Servers option and provide the host name cb-example-new-srv.default (see image1.png, image3.png).
      8. Click the Rebalance option to rebalance in the new server added above. The rebalance starts (see image2.png).
      9. The rebalance completes successfully. Report attached.
      10. Check the Operator logs. The following error messages repeat for the cluster:

      {"level":"info","ts":1647614098.2377417,"logger":"cluster","msg":"Pod not in the specification, deleting","cluster":"default/cb-example","name":"cb-example-new-srv","class":"unknown"}

      {"level":"info","ts":1647614098.2377846,"logger":"cluster","msg":"reconciler","clustered":["cb-example-0000"],"running":["cb-example-0000"],"eject":["cb-example-new-srv"],"unclustered":[],"rebalance":true}

      {"level":"info","ts":1647614098.8129456,"logger":"cluster","msg":"External address collection failed","cluster":"default/cb-example","name":"cb-example-new-srv"}

      {"level":"info","ts":1647614100.0069945,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example","error":"failed to rebalance: unexpected status code: request failed POST http://cb-example-0000.cb-example.default.svc:8091/controller/rebalance 400 Bad Request:

      {\"mismatch\":1}

      ","stack":"github.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.Client.doRequest\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/core.go:209\ngithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Client).Post\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/core.go:258\ngithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:220\ngithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:247\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).rebalance\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/rebalance.go:210\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleRebalance\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1036\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).exec\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:307\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileMembers\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:256\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:170\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:481\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:524\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}

      {"level":"info","ts":1647614102.217759,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example-new","error":"member error","stack":"github.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:217\ngithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:247\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).getStatus\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/status.go:112\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).GetStatus\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/status.go:99\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).newReconcileMachine\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:206\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:165\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:481\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:524\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}
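      The "reconciler" log entry lists cb-example-new-srv under "eject" while only cb-example-0000 is "clustered". The sketch below illustrates one way such a classification could arise from comparing the pods in the cluster specification with the nodes the Couchbase cluster reports. It is an assumption for illustration only, not the Operator's actual code; `plan_members` and `ReconcilePlan` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReconcilePlan:
    clustered: list    # names present in both the spec and the cluster
    eject: list        # cluster nodes with no matching pod in the spec
    unclustered: list  # spec pods that have not joined the cluster


def plan_members(spec_pods, cluster_nodes):
    """Classify node names by set difference (hypothetical helper)."""
    spec = set(spec_pods)
    nodes = set(cluster_nodes)
    return ReconcilePlan(
        clustered=sorted(spec & nodes),
        eject=sorted(nodes - spec),
        unclustered=sorted(spec - nodes),
    )


# Inputs taken from the log entry: one managed pod, plus the node added via the UI.
plan = plan_members(
    spec_pods=["cb-example-0000"],
    cluster_nodes=["cb-example-0000", "cb-example-new-srv"],
)
print(plan)
```

      Under this reading, a node added through the UI has no pod in the cb-example specification, so it always lands in the eject list and a rebalance is requested on every reconcile pass.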

      I checked with the Server QE folks, who mentioned that the Add Server option does not kick off a rebalance automatically; rebalancing is a manual operation, hence Step 8. However, we see the error messages even if we do not click the Rebalance option in the UI after Step 7. Logs attached: cbopinfo-20220318T200705+0530.tar.gz

      We see the same error in the operator logs even when the host name is given as cb-example-new-0000.cb-example-new.default.svc in the Add Server option. Logs attached: cbopinfo-20220318T202359+0530.tar.gz

      We also see the error in the operator logs when the 7.0.3 node (cb-example-new) is added to a 7.0.0 node (cb-example) instead of the 6.6.2 node mentioned above. In this case, however, we get an additional "member error" before the "failed to rebalance" error:

      {"level":"info","ts":1647615872.9013839,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example-new","error":"member error","stack":"github.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:217\ngithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil.(*Request).On\n\tgithub.com/couchbase/couchbase-operator/pkg/util/couchbaseutil/api.go:247\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).getStatus\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/status.go:112\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).GetStatus\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/status.go:99\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).newReconcileMachine\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:206\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:165\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:481\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:524\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}

                   Logs attached: cbopinfo-20220318T203738+0530.tar.gz
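      To compare how often each failure occurs across the three scenarios, the operator log can be tallied with a short script. This is a minimal sketch that assumes the log is available as one JSON object per line (as the Operator emits it); `summarize_failures` is a hypothetical helper, and the second sample entry is abbreviated (its "stack" field is omitted).

```python
import json
from collections import Counter


def summarize_failures(lines):
    """Count "Reconciliation failed" entries per (cluster, error) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not a single JSON object
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") != "Reconciliation failed":
            continue
        # Keep only the first line of the error; a server response body may follow it.
        error = (entry.get("error") or "").split("\n", 1)[0]
        counts[(entry.get("cluster", ""), error)] += 1
    return counts


sample = [
    '{"level":"info","ts":1647614098.2377417,"logger":"cluster","msg":"Pod not in the specification, deleting","cluster":"default/cb-example","name":"cb-example-new-srv","class":"unknown"}',
    '{"level":"info","ts":1647614102.217759,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example-new","error":"member error"}',
]
summary = summarize_failures(sample)
for (cluster, error), count in summary.items():
    print(cluster, error, count)
```

      The same function can be fed the lines of `kubectl logs couchbase-operator-5b4cb9f599-dzfbg` saved to a file, iterating over the open file object.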

      Attachments

        1. cbopinfo-20220318T200705+0530.tar.gz (224 kB)
        2. cbopinfo-20220318T202359+0530.tar.gz (230 kB)
        3. cbopinfo-20220318T203738+0530.tar.gz (221 kB)
        4. image1.png (151 kB)
        5. image2.png (394 kB)
        6. image3.png (385 kB)
        7. rebalanceReport.json (1.45 MB)

          People

            simon.murray Simon Murray
            prateek.kumar Prateek Kumar (Inactive)