  1. Couchbase Kubernetes
  2. K8S-92

Changing cluster-name in example/couchbase-cluster.yaml file causes issues

Details

    • Bug
    • Resolution: Done
    • Major
    • 0.8.0
    • None
    • operator
    • None

    Description

      Changing cluster-name causes the cluster to get into an inconsistent state.

      Steps to reproduce:

      • Start with an existing Kubernetes cluster.
      • kubectl get po shows three running Couchbase pods and the operator pod:

      admins-MBP-109:couchbase-operator sindhura.palakodety$ kubectl get po
      NAME                                  READY     STATUS    RESTARTS   AGE
      cb-example-0000                       1/1       Running   0          2m
      cb-example-0001                       1/1       Running   0          1m
      cb-example-0002                       1/1       Running   0          1m
      couchbase-operator-789c895556-sgwrl   1/1       Running   0          2m

      • vi example/couchbase-cluster.yaml
      • Change the cluster name from name: cb-example to name: cb-example-1.
      • The following errors appear in the operator logs:

       

      time="2018-01-03T00:06:00Z" level=warning msg="node init: failed with error Code: 0, Error: http://cb-example-1-0000.cb-example-1.default.svc:8091/node/controller/rename - Post http://cb-example-1-0000.cb-example-1.default.svc:8091/node/controller/rename: dial tcp 172.17.0.10:8091: getsockopt: connection refused ...retrying" cluster-name=cb-example-1 module=retryutil
      time="2018-01-03T00:06:01Z" level=info msg="Start reconciling" cluster-name=cb-example module=cluster
      time="2018-01-03T00:06:01Z" level=info msg="Finish reconciling" cluster-name=cb-example module=cluster
      time="2018-01-03T00:06:05Z" level=warning msg="node init: failed with error Code: 0, Error: http://cb-example-1-0000.cb-example-1.default.svc:8091/node/controller/rename - Post http://cb-example-1-0000.cb-example-1.default.svc:8091/node/controller/rename: dial tcp 172.17.0.10:8091: getsockopt: connection refused ...retrying" cluster-name=cb-example-1 module=retryutil
      time="2018-01-03T00:06:21Z" level=warning msg="add node: failed with error Code: 400, Error: error - Failed to reach erlang port mapper. Failed to resolve address for \"cb-example-1-0001.cb-example-1.default.svc\".  The hostname may be incorrect or not resolvable. ...retrying" cluster-name=cb-example-1 module=retryutil
      time="2018-01-03T00:06:25Z" level=info msg="Start reconciling" cluster-name=cb-example module=cluster
      time="2018-01-03T00:06:25Z" level=info msg="Finish reconciling" cluster-name=cb-example module=cluster
      time="2018-01-03T00:06:26Z" level=warning msg="add node: failed with error Code: 400, Error: error - Prepare join failed. Could not connect to \"cb-example-1-0001.cb-example-1.default.svc\" on port 8091.  This could be due to an incorrect host/port combination or a firewall in place between the servers. ...retrying" cluster-name=cb-example-1 module=retryutil
      time="2018-01-03T00:06:31Z" level=warning msg="add node: failed with error Code: 400, Error: error - Prepare join failed. Could not connect to \"cb-example-1-0001.cb-example-1.default.svc\" on port 8091.  This could be due to an incorrect host/port combination or a firewall in place between the servers. ...retrying" cluster-name=cb-example-1 module=retryutil
       
      E0103 00:08:54.168813       5 leaderelection.go:258] Failed to update lock: Put https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: getsockopt: connection refused
      E0103 00:08:54.171115       5 reflector.go:315] github.com/couchbase/couchbase-operator/pkg/controller/controller.go:100: Failed to watch *v1beta1.CouchbaseCluster: Get https://10.96.0.1:443/apis/couchbase.database.couchbase.com/v1beta1/namespaces/default/couchbaseclusters?resourceVersion=49560&timeoutSeconds=544&watch=true: dial tcp 10.96.0.1:443: getsockopt: connection refused
      E0103 00:08:54.183256       5 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: getsockopt: connection refused
      E0103 00:08:54.185375       5 event.go:260] Could not construct reference to: '&v1.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]v1.EndpointSubset(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'couchbase-operator-789c895556-sgwrl stopped leading'
      

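      • The Couchbase UI does not allow logging in or performing any activity in this state.
      • kubectl get po now shows six Couchbase pods, three under each cluster name, plus the operator pod, which has restarted once: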

       

      admins-MBP-109:couchbase-operator sindhura.palakodety$ kubectl get po
      NAME                                  READY     STATUS    RESTARTS   AGE
      cb-example-0000                       1/1       Running   0          12m
      cb-example-0001                       1/1       Running   0          12m
      cb-example-0002                       1/1       Running   0          12m
      cb-example-1-0000                     1/1       Running   0          9m
      cb-example-1-0001                     1/1       Running   0          8m
      cb-example-1-0002                     1/1       Running   0          8m
      couchbase-operator-789c895556-sgwrl   1/1       Running   1          13m
      admins-MBP-109:couchbase-operator sindhura.palakodety$
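
      The duplicated pods in the listing above can be spotted mechanically. The following is a minimal sketch, not part of the operator: it assumes that operator-managed pods are named <cluster-name>-<4-digit index>, which matches the names shown in this report but is not confirmed as a general rule. The function name group_pods_by_cluster and the LISTING constant are hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict

# Assumption: Couchbase pods are named <cluster-name>-<4-digit index>,
# as in the listings in this report (cb-example-0000, cb-example-1-0000).
POD_NAME = re.compile(r"^(?P<cluster>.+)-(?P<index>\d{4})$")

# Pod listing copied from the second `kubectl get po` output in this report.
LISTING = """\
NAME                                  READY     STATUS    RESTARTS   AGE
cb-example-0000                       1/1       Running   0          12m
cb-example-0001                       1/1       Running   0          12m
cb-example-0002                       1/1       Running   0          12m
cb-example-1-0000                     1/1       Running   0          9m
cb-example-1-0001                     1/1       Running   0          8m
cb-example-1-0002                     1/1       Running   0          8m
couchbase-operator-789c895556-sgwrl   1/1       Running   1          13m
"""


def group_pods_by_cluster(kubectl_output):
    """Map each inferred cluster name to the sorted list of its pod names."""
    clusters = defaultdict(list)
    for line in kubectl_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        match = POD_NAME.match(fields[0])
        if match:  # names without a 4-digit suffix (the operator pod) are ignored
            clusters[match.group("cluster")].append(fields[0])
    return {name: sorted(pods) for name, pods in clusters.items()}


if __name__ == "__main__":
    for cluster, pods in sorted(group_pods_by_cluster(LISTING).items()):
        print(cluster, len(pods))
```

      Run against the listing in this report, the sketch groups the pods under two cluster names, cb-example and cb-example-1, with three pods each. This is consistent with the operator logs, which show reconciliation for cluster-name=cb-example alongside node initialization for cluster-name=cb-example-1.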

          People

            mikew Mike Wiederhold [X] (Inactive)
            sindhura.palakodety Sindhura Palakodety (Inactive)